ArrayFire v3.4.0 Release Notes
Release Date: 2016-09-13
v3.4.0
The source code with submodules can be downloaded directly from the following link:
http://arrayfire.com/arrayfire_source/arrayfire-full-3.4.0.tar.bz2
Installer CUDA Version: 7.5 (Required)
Installer OpenCL Version: 1.2 (Minimum)
Major Updates
- [Sparse Matrix and BLAS](ref sparse_func).
- Faster JIT for CUDA and OpenCL.
- Support for [random number generator engines](ref af::randomEngine).
- Improvements to graphics.
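As background for the sparse matrix support listed above, which uses compressed formats such as CSR (see Features), here is a rough standalone C++ sketch of what a sparse-dense matrix-vector product in CSR form computes. It does not use ArrayFire. The `CsrMatrix` struct and the `csrMatVec` function are hypothetical names invented for this illustration, and the layout shown is the textbook CSR layout, not necessarily ArrayFire's internal one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration only (not an ArrayFire type).
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;       // non-zero values, stored row by row
    std::vector<std::size_t> rowPtr;  // rows + 1 offsets into values/colIdx
    std::vector<std::size_t> colIdx;  // column index of each non-zero value
};

// Computes y = A * x for a CSR matrix A and a dense vector x.
std::vector<double> csrMatVec(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.rowPtr.size() == a.rows + 1);
    assert(x.size() == a.cols);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        // Only the stored non-zeros of row r contribute to y[r].
        for (std::size_t k = a.rowPtr[r]; k < a.rowPtr[r + 1]; ++k) {
            y[r] += a.values[k] * x[a.colIdx[k]];
        }
    }
    return y;
}
```

For the 3x3 matrix with rows (1, 0, 2), (0, 0, 3), (4, 5, 0), the arrays would be `values = {1, 2, 3, 4, 5}`, `rowPtr = {0, 2, 3, 5}` and `colIdx = {0, 2, 2, 0, 1}`. In ArrayFire itself the equivalent product is done through af::matmul() with a sparse operand.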
Features
- [Sparse Matrix and BLAS](ref sparse_func)
  - Support for [CSR](ref AF_STORAGE_CSR) and [COO](ref AF_STORAGE_COO) [storage types](ref af_storage).
  - Sparse-Dense Matrix Multiplication and Matrix-Vector Multiplication as a part of af::matmul(), using the AF_STORAGE_CSR format for the sparse operand.
  - Conversion between [dense](ref AF_STORAGE_DENSE) matrices and the [CSR](ref AF_STORAGE_CSR) and [COO](ref AF_STORAGE_COO) [storage types](ref af_storage).
- Faster JIT
  - Performance improvements for CUDA and OpenCL JIT functions.
  - Support for evaluating multiple outputs in a single kernel. See af::array::eval() for more.
- [Random Number Generation](ref af::randomEngine)
  - af::randomEngine(): A random engine class to handle setting the type and seed for random number generator engines.
  - Supported engine types are:
    - Philox
    - Threefry
    - Mersenne Twister
- Graphics
  - Using Forge v0.9.0
  - [Vector Field](ref af::Window::vectorField) plotting functionality.
  - Removed GLEW and replaced with glbinding.
    - Removed usage of GLEW after support for MX (multithreaded) was dropped in v2.0.
  - Multiple overlays on the same window are now possible.
    - Overlays support for same type of object (2D/3D)
    - Supported by af::Window::plot, af::Window::hist, af::Window::surface, af::Window::vectorField.
  - New API to set axes limits for graphs.
    - Draw calls do not automatically compute the limits. This is now under user control.
    - af::Window::setAxesLimits can be used to set axes limits automatically or manually.
    - af::Window::setAxesTitles can be used to set axes titles.
  - New API for plot and scatter:
    - af::Window::plot() and af::Window::scatter() now can handle 2D and 3D and determine appropriate order.
    - af_draw_plot_nd()
    - af_draw_plot_2d()
    - af_draw_plot_3d()
    - af_draw_scatter_nd()
    - af_draw_scatter_2d()
    - af_draw_scatter_3d()
- New [interpolation methods](ref af_interp_type)
  - Applies to
    - af::resize()
    - af::transform()
    - af::approx1()
    - af::approx2()
- Support for [complex mathematical functions](ref mathfunc_mat)
  - Add complex support for trig_mat, af::sqrt(), af::log().
- af::medfilt1(): Median filter for 1-d signals
- Generalized scan functions: scan_func_scan and scan_func_scanbykey
  - Now supports inclusive or exclusive scans
  - Supports binary operations defined by af_binary_op.
- [Image Moments](ref moments_mat) functions
- Add af::getSizeOf() function for af_dtype
- Explicitly instantiate af::array::device() for `void *`
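To illustrate the inclusive/exclusive distinction in the generalized scan functions listed above, here is a minimal standalone C++ sketch that does not use ArrayFire. `genericScan` and `genericScanByKey` are hypothetical helpers written for this note, not ArrayFire functions. The by-key variant assumes that segments are formed by runs of equal adjacent keys, which is the usual convention but should be checked against the scan_func_scanbykey documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scans `in` with a binary operation `op`.
// Inclusive: out[i] combines in[0..i]. Exclusive: out[i] combines in[0..i-1],
// so out[0] is the identity element of `op`.
template <typename T, typename Op>
std::vector<T> genericScan(const std::vector<T>& in, Op op, T identity, bool inclusive) {
    std::vector<T> out(in.size());
    T acc = identity;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (inclusive) {
            acc = op(acc, in[i]);
            out[i] = acc;
        } else {
            out[i] = acc;
            acc = op(acc, in[i]);
        }
    }
    return out;
}

// Same scan, but the accumulator restarts whenever the key changes
// (assumption: a segment is a run of equal adjacent keys).
template <typename K, typename T, typename Op>
std::vector<T> genericScanByKey(const std::vector<K>& keys, const std::vector<T>& in,
                                Op op, T identity, bool inclusive) {
    assert(keys.size() == in.size());
    std::vector<T> out(in.size());
    T acc = identity;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (i > 0 && keys[i] != keys[i - 1]) {
            acc = identity;  // a new segment starts here
        }
        if (inclusive) {
            acc = op(acc, in[i]);
            out[i] = acc;
        } else {
            out[i] = acc;
            acc = op(acc, in[i]);
        }
    }
    return out;
}
```

With addition over the input {1, 2, 3, 4}, the inclusive scan gives {1, 3, 6, 10} and the exclusive scan gives {0, 1, 3, 6}. Passing a different `op` and `identity` (for example multiplication with 1) plays the role that af_binary_op plays in the ArrayFire API.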
Bug Fixes
- Fixes to edge-cases in morph_mat.
- Makes JIT tree size consistent between devices.
- Delegate higher-dimension in convolve_mat to correct dimensions.
- Indexing fixes with C++11.
- Handle empty arrays as inputs in various functions.
- Fix bug with single-element input to af::median.
- Fix bug in calculation of time from af::timeit().
- Fix bug in floating point numbers in af::seq.
- Fixes for OpenCL graphics interop on NVIDIA devices.
- Fix bug when compiling large kernels for AMD devices.
- Fix bug in af::bilateral when shared memory is over the limit.
- Fix bug in kernel header compilation tool bin2cpp.
- Fix initial values for morph_mat functions.
- Fix bugs in af::homography() CPU and OpenCL kernels.
- Fix bug in CPU TNJ.
Improvements
- CUDA 8 and compute 6.x (Pascal) support; the current installer ships with CUDA 7.5.
- User controlled FFT plan caching.
- CUDA performance improvements for image_func_wrap, image_func_unwrap and approx_mat.
- Fallback for CUDA-OpenGL interop when no device supports OpenGL.
- Additional forms of batching with the transform_func_transform functions. New behavior defined here.
- Update to OpenCL2 headers.
- Support for integration with external OpenCL contexts.
- Performance improvements to internal copy in CPU Backend.
- Performance improvements to af::select and af::replace CUDA kernels.
- Enable OpenCL-CPU offload by default for devices with Unified Host Memory.
  - To disable, use the environment variable `AF_OPENCL_CPU_OFFLOAD=0`.
Build
- Compilation speedups.
- Build fixes with MKL.
- Error message when CMake CUDA Compute Detection fails.
- Several CMake build issues with Xcode generator fixed.
- Fix multiple OpenCL definitions at link time.
- Fix lapacke detection in CMake.
- Update build tags of
- Fix builds with GCC 6.1.1 and GCC 5.3.0.
Installers
- All installers now ship with ArrayFire libraries built with MKL 2016.
- All installers now ship with Forge development files and examples included.
- CUDA Compute 2.0 has been removed from the installers. Please contact us directly if you have a special need.
Examples
- Added [example simulating gravity](ref graphics/field.cpp) for demonstration of vector field.
- Improvements to financial/black_scholes_options.cpp example.
- Improvements to graphics/gravity_sim.cpp example.
- Fix graphics examples to use af::Window::setAxesLimits and af::Window::setAxesTitles functions.
Documentation & Licensing
- ArrayFire copyright and trademark policy
- Fixed grammar in license.
- Add license information for glbinding.
- Remove license information for GLEW.
- Random123 now applies to all backends.
- Random number functions are now under random_mat.
Deprecations
The following functions have been deprecated and may be modified or removed permanently from future versions of ArrayFire.
- af::Window::plot3(): Use af::Window::plot instead.
- af_draw_plot(): Use af_draw_plot_nd or af_draw_plot_2d instead.
- af_draw_plot3(): Use af_draw_plot_nd or af_draw_plot_3d instead.
- af::Window::scatter3(): Use af::Window::scatter instead.
- af_draw_scatter(): Use af_draw_scatter_nd or af_draw_scatter_2d instead.
- af_draw_scatter3(): Use af_draw_scatter_nd or af_draw_scatter_3d instead.
Known Issues
Certain CUDA functions are known to be broken on Tegra K1. The following ArrayFire tests are currently failing:
- assign_cuda
- harris_cuda
- homography_cuda
- median_cuda
- orb_cuda
- sort_cuda
- sort_by_key_cuda
- sort_index_cuda