ArrayFire v3.4.1 Release Notes
Release Date: 2016-10-15
v3.4.1
The source code with submodules can be downloaded directly from the following link:
http://arrayfire.com/arrayfire_source/arrayfire-full-3.4.1.tar.bz2
Installer CUDA Version: 8.0 (Required)
Installer OpenCL Version: 1.2 (Minimum)
Installers
- Installers for Linux, OS X and Windows
  - CUDA backend now uses CUDA 8.0.
  - Uses Intel MKL 2017.
  - CUDA Compute 2.x (Fermi) is no longer compiled into the library.
- Installer for OS X
  - The libraries shipping in the OS X Installer are now compiled with Apple Clang v7.3.1 (previously v6.1.0).
  - The OS X version used is 10.11.6 (previously 10.10.5).
- Installer for Jetson TX1 / Tegra X1
  - Requires JetPack for L4T 2.3 (containing Linux for Tegra r24.2 for TX1).
  - CUDA backend now uses CUDA 8.0 64-bit.
  - Using CUDA's cusolver instead of CPU fallback.
  - Uses OpenBLAS for CPU BLAS.
  - All ArrayFire libraries are now 64-bit.
Improvements
- Add sparse array support to af::eval().
- Add OpenCL-CPU fallback support for sparse af::matmul() when running on a unified memory device. Uses MKL Sparse BLAS.
- When using CUDA libdevice, pick the correct compute version based on device.
- OpenCL FFT now also supports prime factors 7, 11 and 13.
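
To make the FFT item above concrete, the following is a minimal sketch of how one might check whether a transform length is built only from small prime factors. The helper name `hasOnlySmallPrimeFactors` is hypothetical and is not part of ArrayFire, and the sketch assumes that factors 2, 3 and 5 were already supported before this release, which the notes imply ("now also") but do not state.

```cpp
#include <cstddef>

// Hypothetical helper, not an ArrayFire API.
// Returns true when n is a product of the primes 2, 3, 5, 7, 11 and 13 only.
// Assumption: 2, 3 and 5 were supported before 7, 11 and 13 were added.
bool hasOnlySmallPrimeFactors(std::size_t n) {
    if (n == 0) return false;  // zero has no prime factorization
    const std::size_t primes[] = {2, 3, 5, 7, 11, 13};
    for (std::size_t p : primes) {
        // Divide out every occurrence of this prime.
        while (n % p == 0) n /= p;
    }
    // Anything left over is a prime factor outside the list.
    return n == 1;
}
```

Under these assumptions, a length such as 1001 (7 x 11 x 13) passes the check, while a length of 17 does not. How the OpenCL backend handles lengths that fail such a check is not described in these notes.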
Bug Fixes
- Allow CUDA libdevice to be detected from custom directory.
- Fix aarch64 detection on Jetson TX1 64-bit OS.
- Add missing definition of af_set_fft_plan_cache_size in unified backend.
- Fix initial values for af::min() and af::max() operations.
- Fix distance calculation in af::nearestNeighbour for CUDA and OpenCL backend.
- Fix OpenCL bug where scalars were passed incorrectly to compile options.
- Fix bug in af::Window::surface() with respect to dimensions and ranges.
- Fix possible double free corruption in af_assign_seq().
- Add missing eval for key in af::scanByKey in CPU backend.
- Fixed creation of sparse values array using AF_STORAGE_COO.
Examples
- Add a Conjugate Gradient solver example to demonstrate sparse and dense matrix operations.
CUDA Backend
- When using CUDA 8.0, compute 2.x are no longer in default compute list.
  - This follows CUDA 8.0 deprecating computes 2.x.
  - Default computes for CUDA 8.0 will be 30, 50, 60.
- When using CUDA pre-8.0, the default selection remains 20, 30, 50.
- CUDA backend now uses -arch=sm_30 for PTX compilation as default.
  - Unless compute 2.0 is enabled.
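
The default compute selection described above can be summarized in a small sketch. The function `defaultComputes` is a hypothetical illustration, not ArrayFire's actual build logic, and treating toolkit versions newer than 8.0 the same as 8.0 is an assumption, since the notes only cover 8.0 and earlier.

```cpp
#include <vector>

// Hypothetical helper, not part of ArrayFire or its build system.
// Mirrors the default compute lists stated in the release notes.
// Assumption: CUDA major versions above 8 behave like 8.0.
std::vector<int> defaultComputes(int cudaMajorVersion) {
    if (cudaMajorVersion >= 8) {
        // CUDA 8.0 deprecates compute 2.x, so 20 is dropped from the defaults.
        return {30, 50, 60};
    }
    // Pre-8.0 toolkits keep the earlier default selection.
    return {20, 30, 50};
}
```

For example, `defaultComputes(8)` yields 30, 50, 60 and `defaultComputes(7)` yields 20, 30, 50. These are defaults only. The notes indicate compute 2.0 can still be enabled, in which case the PTX default of -arch=sm_30 does not apply.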
Known Issues
- af::lu() on CPU is known to give incorrect results when built and run on OS X 10.11 or 10.12 and compiled with Accelerate Framework.
  - Since the OS X Installer libraries use MKL rather than Accelerate Framework, this issue does not affect those libraries.