Changelog History
Page 2
-
v0.3.3 Changes
August 30, 2018
common:
- thread memory allocation has been switched back to the method
  used before version 0.3.1 due to unexpected problems caused by
  the new code under some circumstances. A new compile-time option
  USE_TLS has been added to allow enabling the new code instead,
  and it is hoped that this can become the default again in the next version.
- LAPACK PR272 has been integrated, which fixes spurious errors
  in DSYEVR and related functions caused by missing conversion
  from ILAENV to ILAENV_2STAGE in several _2stage routines.
- the cmake-generated OpenBLASConfig.cmake now uses correct case
  for the name of the library
- added support for Haiku OS
x86_64:
- added AVX512 implementations of SDOT, DDOT, SAXPY, DAXPY,
  DSCAL, DGEMVN and DSYMVL
- added a workaround for a cygwin issue that prevented compilation
  of AVX512 code
IBM Z:
- added autodetection of Z14
- fixed TRMM errors in the generic target
-
v0.3.2 Changes
July 30, 2018
-
v0.3.1 Changes
July 01, 2018
common:
- rewritten thread initialization code with significantly reduced overhead
- added CBLAS interfaces to the IxAMIN BLAS extension functions
- fixed the lapack-test target
- CMAKE builds now create an OpenBLASConfig.cmake file
- ZAXPY now uses a single thread for small input sizes
- the LAPACK code was updated from Reference-LAPACK/lapack#253
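As a rough illustration of what the IxAMIN extension functions compute, the sketch below returns the position of the first element with the smallest absolute value in a strided real vector. The name `ref_iamin` is hypothetical and not part of OpenBLAS, the 0-based result is an assumption modelled on the usual CBLAS index convention, and returning `None` for invalid arguments is a choice made only for this sketch.

```python
def ref_iamin(n, x, incx=1):
    """Reference-style sketch: 0-based index of the first element with the
    smallest absolute value among x[0], x[incx], ..., x[(n-1)*incx].

    Hypothetical helper for illustration; real vectors only.
    """
    if n < 1 or incx < 1:
        # Sketch-only convention, not the library's return value.
        return None
    best_index = 0
    best_value = abs(x[0])
    for i in range(1, n):
        value = abs(x[i * incx])
        # Strict comparison keeps the first occurrence on ties.
        if value < best_value:
            best_index, best_value = i, value
    return best_index


if __name__ == "__main__":
    print(ref_iamin(4, [3.0, -1.0, 2.0, 1.0]))
    print(ref_iamin(2, [5.0, 9.0, -4.0, 9.0], incx=2))
```

The returned index counts logical elements, not storage offsets, so with `incx=2` a result of 1 refers to `x[2]`.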
POWER:
- corrected CROT and ZROT behaviour with zero INC_X
ARMV7:
- corrected xDOT behaviour with zero INC_X or INC_Y
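For context on the zero-increment fixes, the following sketch shows how a reference-style dot product treats strides, including an increment of zero, where the same element is reused on every iteration. The function name `ref_dot` is hypothetical; it mirrors the indexing of the reference BLAS loop rather than any OpenBLAS kernel.

```python
def ref_dot(n, x, incx, y, incy):
    """Reference-style sketch of a strided dot product.

    Hypothetical helper for illustration. Negative increments start from
    the far end of the vector, as in the reference BLAS; a zero increment
    keeps reading the same element.
    """
    if n <= 0:
        return 0.0
    ix = (1 - n) * incx if incx < 0 else 0
    iy = (1 - n) * incy if incy < 0 else 0
    total = 0.0
    for _ in range(n):
        total += x[ix] * y[iy]
        ix += incx
        iy += incy
    return total


if __name__ == "__main__":
    # With incx == 0 the result is x[0] * (y[0] + y[1] + y[2]).
    print(ref_dot(3, [2.0], 0, [1.0, 2.0, 3.0], 1))
```

An optimized kernel that assumes unit or nonzero strides can read past the single valid element in the zero-increment case, which is the kind of edge case a fix like this has to cover.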
x86_64:
- retired some older targets of DYNAMIC_ARCH builds to a new option DYNAMIC_OLDER,
  this affects PENRYN, DUNNINGTON, OPTERON, OPTERON_SSE3, BOBCAT, ATOM and NANO
  (which will still be supported via the slower PRESCOTT kernels when this option is not set)
- added an option DYNAMIC_LIST that (used in conjunction with DYNAMIC_ARCH) allows
  to specify the list of x86_64 targets to include. Any target not on the list will be supported by
  the Sandybridge or Nehalem kernels if available, or by Prescott.
- improved SWITCH_RATIO on Haswell for increased GEMM throughput
- added initial support for Intel Skylake X, including an AVX512 SGEMM kernel
- added autodetection of Intel Cannon Lake series as Skylake X
- added a default L2 cache size for hypervisors that return zero here (Chromebook)
- fixed a name clash with recent Windows10 headers that broke the build with (at least)
  recent mingw from MSYS2
- fixed a link error in mixed clang/gfortran builds with OpenMP
- updated the OSX deployment target to 10.8
- switched on parallel make for builds on MS Windows by default
x86:
- fixed SSWAP and DSWAP behaviour with zero INC_X and INC_Y
-
v0.3.0 Changes
May 23, 2018
common:
* fixed some more thread race and locking bugs
* added preliminary support for calling an OpenMP build of the library from multiple threads
* removed performance impact of thread locks added in 0.2.20 on OpenMP code
* general code cleanup
* optimized DSDOT implementation
* improved thread distribution for GEMM
* corrected IMATCOPY/OMATCOPY implementation
* fixed out-of-bounds accesses in the multithreaded xBMV/xPMV and SYMV implementations
* cmake build improvements
* pkgconfig file now contains build options
* openblas_get_config() now reports USE_OPENMP and NUM_THREADS settings used for the build
* corrections and improvements for systems with more than 64 cpus
* LAPACK code updated to 3.8.0 including later fixes
* added ReLAPACK, a recursive implementation of several LAPACK functions
* Rewrote ROTMG to handle cases that the netlib code failed to address
* Disabled (broken) multithreading code for xTRMV
* corrected prototypes of complex CBLAS functions to make our cblas.h match the generally accepted standard
* shared memory access failures on startup are now handled more gracefully
* restored utests from earlier releases (and made them pass on all affected systems)
SPARC:
* several fixes for cpu autodetection
POWER:
* corrected vector register overwriting in several Power8 kernels
* optimized additional BLAS functions
ARM:
* added support for CortexA53 and A72
* added autodetection for ThunderX2T99
* made most optimized kernels the default for generic ARMv8 targets
x86_64:
* parallelized DDOT kernel for Haswell
* changed alignment directives in assembly kernels to boost performance on OSX
* fixed register handling in the GEMV microkernels (bug exposed by gcc7)
* added support for building on OpenBSD and Dragonfly
* updated compiler options to work with Intel release 2018
* support fully optimized build with clang/flang on Microsoft Windows
* fixed building on AIX
IBM Z:
* added optimized BLAS 1/2 functions
MIPS:
* fixed cpu autodetection helper code
* added mips32 1004K cpu (Mediatek MT7621 and similar SoC)
* added mips64 I6500 cpu
-
v0.2.20 Changes
July 24, 2017
Version 0.2.20
24-Jul-2017
common:
* Improved CMake support
* Fixed several thread race and locking bugs
* Fixed default LAPACK optimization level
* Updated LAPACK to 3.7.0
* Added ReLAPACK (https://github.com/HPAC/ReLAPACK), make BUILD_RELAPACK=1
POWER:
* Optimizations for Power9
* Fixed several Power8 assembly bugs
ARM:
* New optimized Vulcan and ThunderX2T99 targets
* Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
* Detect all cpu cores including offline ones
* Fix compilation with CLANG
* Support building a shared library for Android
MIPS:
* Fixed several threading issues
* Fix compilation with CLANG
x86_64:
* Detect Intel Bay Trail and Apollo Lake
* Detect Intel Sky Lake and Kaby Lake
* Detect Intel Knights Landing
* Detect AMD A8, A10, A12 and Ryzen
* Support 64bit builds with Visual Studio
* Fix building with Intel and PGI compilers
* Fix building with MINGW and TDM-GCC
* Fix cmake builds for Haswell and related cpus
* Fix building for Sandybridge with CLANG 3.9
* Add support for the FLANG compiler
IBM Z:
* New target z13 with BLAS3 optimizations
-
v0.2.19 Changes
September 01, 2016
Version 0.2.19
1-Sep-2016
common:
* Improved cross compiling.
* Fix the bug on musl libc.
POWER:
* Optimize BLAS on Power8
* Fixed Julia+OpenBLAS bugs on Power8
MIPS:
* Optimize BLAS on MIPS P5600 and I6400 (Thanks, Shivraj Patil, Kaustubh Raste)
ARM:
* Improved on ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)
-
v0.2.18 Changes
April 12, 2016
Version 0.2.18
12-Apr-2016
common:
- If you set the MAKE_NB_JOBS flag to a value less than or equal to zero, make will run without -j.
x86/x86_64:
- Support building Visual Studio static library. (#813, Thanks, theoractice)
- Fix bugs to pass buildbot CI tests (http://build.openblas.net)
ARM:
- Provide DGEMM 8x4 kernel for Cortex-A57 (Thanks, Ashwin Sekhar T K)
POWER:
- Optimize S and C BLAS3 on Power8
- Optimize BLAS2/1 on Power8