Changelog History

  • v0.3.3 Changes

    August 30, 2018

    common:

    • thread memory allocation has been switched back to the method
      used before version 0.3.1 due to unexpected problems caused by
      the new code under some circumstances. A new compile-time option
      USE_TLS has been added to allow enabling the new code instead,
      and it is hoped that this can become the default again in the next version.
    • LAPACK PR272 has been integrated, which fixes spurious errors
      in DSYEVR and related functions caused by missing conversion
      from ILAENV to ILAENV_2STAGE in several _2stage routines.
    • the cmake-generated OpenBLASConfig.cmake now uses correct case
      for the name of the library
    • added support for Haiku OS

    x86_64:

    • added AVX512 implementations of SDOT, DDOT, SAXPY, DAXPY,
      DSCAL, DGEMVN and DSYMVL
    • added a workaround for a cygwin issue that prevented compilation
      of AVX512 code

    IBM Z:

    • added autodetection of Z14
    • fixed TRMM errors in the generic target

  • v0.3.2 Changes

    July 30, 2018

    common:

    • fixes for regressions caused by the rewrite of the thread initialization code in 0.3.1

    x86_64:

    • added autodetection of AMD Ryzen 2
    • fixed build with older versions of MSVC

    Power:

    • fixed cpu autodetection for the BSDs

    mips64:

    • fixed utest errors in AXPY, DSDOT, ROT and SWAP

  • v0.3.1 Changes

    July 01, 2018

    common:

    • rewritten thread initialization code with significantly reduced overhead
    • added CBLAS interfaces to the IxAMIN BLAS extension functions
    • fixed the lapack-test target
    • CMAKE builds now create an OpenBLASConfig.cmake file
    • ZAXPY now uses a single thread for small input sizes
    • the LAPACK code was updated from Reference-LAPACK/lapack#253
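
For readers unfamiliar with the IxAMIN extensions mentioned above: they are the counterpart of the standard IxAMAX routines and locate the element with the smallest absolute value. The following is a minimal Python sketch of that semantics, not OpenBLAS code. The name `ref_iamin` is hypothetical, and the 0-based return value and the |re| + |im| measure for complex input are assumptions carried over from how CBLAS treats the IxAMAX family.

```python
def ref_iamin(n, x, incx=1):
    """Return the 0-based position of the first element with the smallest
    absolute value among n elements of x taken with stride incx.

    Hypothetical reference helper; assumes incx > 0, as for IxAMAX.
    """
    if n < 1 or incx <= 0:
        return 0

    def magnitude(v):
        # Assumption: complex entries are measured as |re| + |im|,
        # the convention BLAS uses for ICAMAX/IZAMAX.
        if isinstance(v, complex):
            return abs(v.real) + abs(v.imag)
        return abs(v)

    best_pos = 0
    best_val = magnitude(x[0])
    for i in range(1, n):
        val = magnitude(x[i * incx])
        if val < best_val:  # strict comparison keeps the first minimum
            best_pos, best_val = i, val
    return best_pos


print(ref_iamin(4, [3.0, -1.0, 1.0, 2.0]))
```

The strict `<` comparison means that ties resolve to the earliest position, which mirrors the usual IxAMAX tie-breaking rule.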

    POWER:

    • corrected CROT and ZROT behaviour with zero INC_X

    ARMV7:

    • corrected xDOT behaviour with zero INC_X or INC_Y
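
The changelog does not state what the corrected result is. Assuming it follows the reference BLAS indexing rules, a zero increment means the same element is reused in every term of the sum. The sketch below illustrates that rule in Python; `ref_dot` is a hypothetical name and this is not the ARMV7 kernel code.

```python
def ref_dot(n, x, incx, y, incy):
    """Strided dot product using reference-BLAS style index stepping
    (hypothetical helper for illustration)."""
    if n <= 0:
        return 0.0
    # With a negative increment, reference BLAS starts from the far end.
    ix = (1 - n) * incx if incx < 0 else 0
    iy = (1 - n) * incy if incy < 0 else 0
    total = 0.0
    for _ in range(n):
        total += x[ix] * y[iy]
        ix += incx  # a zero increment leaves the index where it is
        iy += incy
    return total


# INC_X = 0: x[0] is paired with every element of y.
print(ref_dot(3, [2.0], 0, [1.0, 2.0, 3.0], 1))
```

Under this rule the call with INC_X = 0 is equivalent to multiplying `x[0]` by the sum of the `y` elements.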

    x86_64:

    • retired some older targets of DYNAMIC_ARCH builds to a new option DYNAMIC_OLDER;
      this affects PENRYN, DUNNINGTON, OPTERON, OPTERON_SSE3, BOBCAT, ATOM and NANO
      (which will still be supported via the slower PRESCOTT kernels when this option is not set)
    • added an option DYNAMIC_LIST that (used in conjunction with DYNAMIC_ARCH) allows
      specifying the list of x86_64 targets to include. Any target not on the list will be supported by
      the Sandybridge or Nehalem kernels if available, or by Prescott.
    • improved SWITCH_RATIO on Haswell for increased GEMM throughput
    • added initial support for Intel Skylake X, including an AVX512 SGEMM kernel
    • added autodetection of Intel Cannon Lake series as Skylake X
    • added a default L2 cache size for hypervisors that return zero here (Chromebook)
    • fixed a name clash with recent Windows10 headers that broke the build with (at least)
      recent mingw from MSYS2
    • fixed a link error in mixed clang/gfortran builds with OpenMP
    • updated the OSX deployment target to 10.8
    • switched on parallel make for builds on MS Windows by default

    x86:

    • fixed SSWAP and DSWAP behaviour with zero INC_X and INC_Y

  • v0.3.0 Changes

    May 23, 2018

    common:

    * fixed some more thread race and locking bugs
    * added preliminary support for calling an OpenMP build of the library from multiple threads
    * removed performance impact of thread locks added in 0.2.20 on OpenMP code
    * general code cleanup 
    * optimized DSDOT implementation
    * improved thread distribution for GEMM
    * corrected IMATCOPY/OMATCOPY implementation
    * fixed out-of-bounds accesses in the multithreaded xBMV/xPMV and SYMV implementations
    * cmake build improvements
    * pkgconfig file now contains build options
    * openblas_get_config() now reports USE_OPENMP and NUM_THREADS settings used for the build
    * corrections and improvements for systems with more than 64 cpus
    * LAPACK code updated to 3.8.0 including later fixes
    * added ReLAPACK, a recursive implementation of several LAPACK functions
    * Rewrote ROTMG to handle cases that the netlib code failed to address
    * Disabled (broken) multithreading code for xTRMV
    * corrected prototypes of complex CBLAS functions to make our cblas.h match the generally accepted standard
    * shared memory access failures on startup are now handled more gracefully
    * restored utests from earlier releases (and made them pass on all affected systems)
    

    SPARC:

    * several fixes for cpu autodetection
    

    POWER:

    * corrected vector register overwriting in several Power8 kernels
    * optimized additional BLAS functions
    

    ARM:

    * added support for CortexA53 and A72 
    * added autodetection for ThunderX2T99
    * made most optimized kernels the default for generic ARMv8 targets 
    

    x86_64:

    * parallelized DDOT kernel for Haswell
    * changed alignment directives in assembly kernels to boost performance on OSX
    * fixed register handling in the GEMV microkernels (bug exposed by gcc7)
    * added support for building on OpenBSD and Dragonfly 
    * updated compiler options to work with Intel release 2018
    * support fully optimized build with clang/flang on Microsoft Windows
    * fixed building on AIX
    

    IBM Z:

    * added optimized BLAS 1/2 functions
    

    MIPS:

    * fixed cpu autodetection helper code
    * added mips32 1004K cpu (Mediatek MT7621 and similar SoC)
    * added mips64 I6500 cpu
    

  • v0.2.20 Changes

    July 24, 2017

    Version 0.2.20
    24-Jul-2017

    common:

        * Improved CMake support
        * Fixed several thread race and locking bugs
        * Fixed default LAPACK optimization level
        * Updated LAPACK to 3.7.0
        * Added ReLAPACK (https://github.com/HPAC/ReLAPACK), make BUILD_RELAPACK=1
    

    POWER:

        * Optimizations for Power9
        * Fixed several Power8 assembly bugs
    

    ARM:

        * New optimized Vulcan and ThunderX2T99 targets
        * Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
        * Detect all cpu cores including offline ones
        * Fix compilation with CLANG
        * Support building a shared library for Android
    

    MIPS:

        * Fixed several threading issues
        * Fix compilation with CLANG
    

    x86_64:

        * Detect Intel Bay Trail and Apollo Lake
        * Detect Intel Sky Lake and Kaby Lake
        * Detect Intel Knights Landing
        * Detect AMD A8, A10, A12 and Ryzen
        * Support 64bit builds with Visual Studio
        * Fix building with Intel and PGI compilers
        * Fix building with MINGW and TDM-GCC
        * Fix cmake builds for Haswell and related cpus
        * Fix building for Sandybridge with CLANG 3.9
        * Add support for the FLANG compiler
    

    IBM Z:

        * New target z13 with BLAS3 optimizations
    

    [Download OpenBLAS](https://sourceforge.net/projects/openblas/files/v0.2.20/OpenBLAS%200.2.20%20version.zip/download)

  • v0.2.19 Changes

    September 01, 2016

    Version 0.2.19
    1-Sep-2016

    common:

        * Improved cross compiling.
        * Fixed a bug on musl libc.
    

    POWER:

        * Optimize BLAS on Power8
        * Fixed Julia+OpenBLAS bugs on Power8
    

    MIPS:

        * Optimize BLAS on MIPS P5600 and I6400 (Thanks, Shivraj Patil, Kaustubh Raste)
    

    ARM:

        * Improvements for ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)
    

    [Download OpenBLAS](https://sourceforge.net/projects/openblas/files/v0.2.19/OpenBLAS%200.2.19%20version.zip/download)

  • v0.2.18 Changes

    April 12, 2016

    Version 0.2.18
    12-Apr-2016

    common:

    • If the MAKE_NB_JOBS flag is set to a value less than or equal to zero, make runs without -j.
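
The MAKE_NB_JOBS rule above can be sketched as follows. This is a Python illustration of the described behaviour, not the actual Makefile logic; `make_command` is a hypothetical helper, and what the build does when the flag is not set at all is outside this sketch.

```python
def make_command(make_nb_jobs):
    """Build the make invocation for a given MAKE_NB_JOBS value
    (hypothetical helper for illustration)."""
    cmd = ["make"]
    if make_nb_jobs > 0:
        # Positive values request that many parallel jobs.
        cmd += ["-j", str(make_nb_jobs)]
    # Zero or negative values: no -j option is passed at all.
    return cmd


print(make_command(4))
print(make_command(0))
```

Leaving out -j entirely lets an outer build system or job server decide the parallelism instead of the OpenBLAS build.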

    x86/x86_64:

    • Support building a Visual Studio static library. (#813, Thanks, theoractice)
    • Fix bugs to pass buildbot CI tests (http://build.openblas.net)

    ARM:

    • Provide DGEMM 8x4 kernel for Cortex-A57 (Thanks, Ashwin Sekhar T K)

    POWER:

    • Optimize S and C BLAS3 on Power8
    • Optimize BLAS2/1 on Power8

    [Download OpenBLAS](https://sourceforge.net/projects/openblas/files/v0.2.18/OpenBLAS%200.2.18%20version.zip/download)