OpenBLAS v0.3.7 Release Notes

Release Date: 2019-08-11
    common:

    • having the gmake special variables TARGET_ARCH or TARGET_MACH defined no longer causes build failures in ctest or utest
    • defining NO_AFFINITY or USE_TLS to zero in gmake builds no longer has the same effect as setting them to one
    • a new test program was added to allow checking the library for thread safety
    • a new option USE_LOCKING was added to ensure thread safety when OpenBLAS itself is built without multithreading but
      will be called from multiple threads.
    • a build failure on Linux with glibc versions earlier than 2.5 was fixed
    • a runtime error with CPU enumeration (and NO_AFFINITY not set) on glibc 2.6 was fixed
    • NO_AFFINITY was added to the CMAKE options (and defaults to being active on Linux, as in the gmake builds)

    x86_64:

    • the build-time logic for detection of AVX512 availability in the processor and compiler was fixed
    • gmake builds on OSX now set the internal name of the library to libopenblas.0.dylib (consistent with CMAKE)
    • the Haswell DGEMM kernel received a significant speedup through improved prefetch and load instructions
    • performance of DGEMM, DTRMM, DTRSM and ZDOT on Zen/Zen2 was markedly increased by avoiding vpermpd instructions
    • the SKYLAKEX (AVX512) DGEMM helper functions have now been disabled to fix remaining errors in DGEMM, DSYMM and DTRMM

    POWER:

    • added support for building on FreeBSD/powerpc64 and FreeBSD/ppc970
    • added optimized kernels for POWER9 single and double precision complex BLAS3
    • added optimized kernels for POWER9 SGEMM and STRMM

    ARMV7:

    • fixed the softfp implementations of xAMAX and IxAMAX
    • removed the predefined -march= flags on both ARMV5 and ARMV6 as they were appropriate for only a subset of platforms
