ArrayFire v3.4.1 release notes (2016-10-15)

Release Date: 2016-10-15

The source code with submodules can be downloaded directly from the following link:
http://arrayfire.com/arrayfire_source/arrayfire-full-3.4.1.tar.bz2

Installer CUDA Version: 8.0 (Required)
Installer OpenCL Version: 1.2 (Minimum)

Installers
- Installers for Linux, OS X and Windows
  - CUDA backend now uses CUDA 8.0.
  - Uses Intel MKL 2017.
  - CUDA Compute 2.x (Fermi) is no longer compiled into the library.
- Installer for OS X
  - The libraries shipping in the OS X Installer are now compiled with Apple
    Clang v7.3.1 (previously v6.1.0).
  - The OS X version used is 10.11.6 (previously 10.10.5).
- Installer for Jetson TX1 / Tegra X1
  - Requires JetPack for L4T 2.3
    (containing Linux for Tegra r24.2 for TX1).
  - CUDA backend now uses CUDA 8.0 64-bit.
  - Uses CUDA's cuSOLVER instead of the CPU fallback.
  - Uses OpenBLAS for CPU BLAS.
  - All ArrayFire libraries are now 64-bit.

Improvements
- Add sparse array support to af::eval().
- Add OpenCL-CPU fallback support for sparse af::matmul() when running on
  a unified memory device. Uses MKL Sparse BLAS.
- When using CUDA libdevice, pick the correct compute version based on device.
- OpenCL FFT now also supports prime factors 7, 11 and 13.
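
The FFT item above means an OpenCL FFT length may now contain the prime factors 7, 11 and 13. A minimal sketch of a length check follows. It assumes the previously supported factors were 2, 3 and 5, which these notes do not state, and `hasOnlySupportedFactors` is a hypothetical helper, not an ArrayFire function.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ArrayFire): returns true when n factors
// completely into the listed primes.
// Assumption: 2, 3 and 5 were already supported before v3.4.1; the release
// notes only state that 7, 11 and 13 were added.
bool hasOnlySupportedFactors(std::size_t n) {
    if (n == 0) return false;
    const std::size_t primes[] = {2, 3, 5, 7, 11, 13};
    for (std::size_t p : primes) {
        // Divide out every power of p.
        while (n % p == 0) n /= p;
    }
    // Anything left over is a product of other primes.
    return n == 1;
}
```

For instance, a length of 1001 (7 x 11 x 13) passes this check, while 34 (2 x 17) does not.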

Bug Fixes
- Allow CUDA libdevice to be detected from a custom directory.
- Fix aarch64 detection on Jetson TX1 64-bit OS.
- Add missing definition of af_set_fft_plan_cache_size in unified backend.
- Fix initial values for af::min() and af::max() operations.
- Fix distance calculation in af::nearestNeighbour for CUDA and OpenCL backends.
- Fix OpenCL bug where scalars were passed incorrectly to compile options.
- Fix bug in af::Window::surface() with respect to dimensions and ranges.
- Fix possible double free corruption in af_assign_seq().
- Add missing eval for key in af::scanByKey in CPU backend.
- Fix creation of sparse values array using AF_STORAGE_COO.

Examples
- Add a Conjugate Gradient solver example
  to demonstrate sparse and dense matrix operations.
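
To make the mix of sparse and dense operations concrete, here is a minimal plain C++ sketch of conjugate gradient for a symmetric positive-definite system, with the matrix stored in CSR form. It is not the example shipped with ArrayFire and does not use the ArrayFire API. The `Csr` struct and all function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not an ArrayFire type.
struct Csr {
    std::size_t n;                    // square matrix dimension
    std::vector<std::size_t> rowPtr;  // size n + 1, start of each row
    std::vector<std::size_t> colIdx;  // column index of each stored value
    std::vector<double> values;       // stored nonzero values
};

// Sparse operation: y = A * x.
std::vector<double> spmv(const Csr& A, const std::vector<double>& x) {
    std::vector<double> y(A.n, 0.0);
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k)
            y[i] += A.values[k] * x[A.colIdx[k]];
    return y;
}

// Dense operation: dot product of two vectors of equal length.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Conjugate gradient for symmetric positive-definite A, starting from x = 0.
std::vector<double> conjugateGradient(const Csr& A, const std::vector<double>& b,
                                      double tol = 1e-10,
                                      std::size_t maxIter = 1000) {
    std::vector<double> x(A.n, 0.0);
    std::vector<double> r = b;  // residual b - A*x, with x = 0
    std::vector<double> p = r;  // search direction
    double rr = dot(r, r);
    for (std::size_t it = 0; it < maxIter && std::sqrt(rr) > tol; ++it) {
        const std::vector<double> Ap = spmv(A, p);
        const double alpha = rr / dot(p, Ap);  // step length along p
        for (std::size_t i = 0; i < A.n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rrNew = dot(r, r);
        const double beta = rrNew / rr;  // weight of the previous direction
        for (std::size_t i = 0; i < A.n; ++i) p[i] = r[i] + beta * p[i];
        rr = rrNew;
    }
    return x;
}
```

Each iteration performs one sparse matrix-vector product and a handful of dense vector updates, which is the same split of work the example description refers to.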

CUDA Backend
- When using CUDA 8.0,
  compute 2.x is no longer in the default compute list.
  - This follows CUDA 8.0
    deprecating compute 2.x.
  - Default computes for CUDA 8.0 will be 30, 50, 60.
- When using CUDA pre-8.0, the default selection remains 20, 30, 50.
- CUDA backend now uses -arch=sm_30 for PTX compilation by default.
  - Unless compute 2.0 is enabled.
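
The default selection described above can be summarized as a small lookup. This is only a sketch of the rule as stated in these notes: `defaultComputes` is a hypothetical helper, not part of ArrayFire or its build system, and treating versions newer than 8.0 the same as 8.0 is an assumption.

```cpp
#include <vector>

// Hypothetical helper (not ArrayFire code): default compute capabilities
// following the rule in these notes.
// Assumption: versions newer than 8.0 are treated like 8.0; the notes only
// cover CUDA 8.0 and pre-8.0.
std::vector<int> defaultComputes(int cudaMajorVersion) {
    if (cudaMajorVersion >= 8) return {30, 50, 60};  // compute 2.x dropped
    return {20, 30, 50};                             // pre-8.0 defaults
}
```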

Known Issues
- af::lu() on CPU is known to give incorrect results when built and run on
  OS X 10.11 or 10.12 and compiled with Accelerate Framework.
  - Since the OS X Installer libraries use MKL rather than Accelerate
    Framework, this issue does not affect those libraries.