ArrayFire v3.4.0 Release Notes

Release Date: 2016-09-13
  • v3.4.0

    The source code with submodules can be downloaded directly from the following link:
    http://arrayfire.com/arrayfire_source/arrayfire-full-3.4.0.tar.bz2

    Installer CUDA Version: 7.5 (Required)
    Installer OpenCL Version: 1.2 (Minimum)

    Major Updates

    • [Sparse Matrix and BLAS](ref sparse_func).
    • Faster JIT for CUDA and OpenCL.
    • Support for [random number generator engines](ref af::randomEngine).
    • Improvements to graphics.

    Features

    • [Sparse Matrix and BLAS](ref sparse_func)
      • Support for [CSR](ref AF_STORAGE_CSR) and [COO](ref AF_STORAGE_COO)
        [storage types](ref af_storage).
      • Sparse-dense matrix multiplication and matrix-vector multiplication as
        part of af::matmul(), using the AF_STORAGE_CSR format for the sparse operand.
      • Conversion between [dense](ref AF_STORAGE_DENSE) matrices and the [CSR](ref AF_STORAGE_CSR)
        and [COO](ref AF_STORAGE_COO) [storage types](ref af_storage).
    • Faster JIT
      • Performance improvements for CUDA and OpenCL JIT functions.
      • Support for evaluating multiple outputs in a single kernel. See af::array::eval() for more.
    • [Random Number Generation](ref af::randomEngine)
      • af::randomEngine(): A random engine class to handle setting the type and seed
        for random number generator engines.
      • Supported engine types are:
        • Philox
        • Threefry
        • Mersenne Twister
    • Graphics
      • Using Forge v0.9.0
      • [Vector Field](ref af::Window::vectorField) plotting functionality.
      • Removed GLEW and replaced it with glbinding.
        • Removed usage of GLEW after support for MX (multithreaded) was dropped in v2.0.
      • Multiple overlays on the same window are now possible.
        • Overlays are supported for objects of the same type (2D or 3D).
        • Supported by af::Window::plot, af::Window::hist, af::Window::surface,
          af::Window::vectorField.
      • New API to set axes limits for graphs.
        • Draw calls no longer compute the limits automatically. This is now under user control.
        • af::Window::setAxesLimits can be used to set axes limits automatically or manually.
        • af::Window::setAxesTitles can be used to set axes titles.
      • New API for plot and scatter:
        • af::Window::plot() and af::Window::scatter() can now handle both 2D and 3D data and determine the appropriate order.
        • af_draw_plot_nd()
        • af_draw_plot_2d()
        • af_draw_plot_3d()
        • af_draw_scatter_nd()
        • af_draw_scatter_2d()
        • af_draw_scatter_3d()
    • New [interpolation methods](ref af_interp_type)
      • Applies to
        • af::resize()
        • af::transform()
        • af::approx1()
        • af::approx2()
    • Support for [complex mathematical functions](ref mathfunc_mat)
      • Add complex support for trig_mat, af::sqrt(), af::log().
    • af::medfilt1(): Median filter for 1-D signals.
    • Generalized scan functions: scan_func_scan and scan_func_scanbykey
      • Now support inclusive or exclusive scans.
      • Support binary operations defined by af_binary_op.
    • [Image Moments](ref moments_mat) functions
    • Add af::getSizeOf() function for af_dtype
    • Explicitly instantiate af::array::device() for `void *`
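
For readers unfamiliar with the CSR storage type named under Sparse Matrix and BLAS above, the following is a minimal, library-independent sketch of what CSR stores and how a sparse matrix-vector product walks it. It is not ArrayFire's implementation and does not use the ArrayFire API: the names CsrMatrix, denseToCsr and csrMatVec are hypothetical, and the dense input is assumed to be row-major purely for readability.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for a rows x cols matrix in Compressed Sparse Row form.
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<double> values;       // non-zero values, stored row by row
    std::vector<std::size_t> colIdx;  // column index of each stored value
    std::vector<std::size_t> rowPtr;  // row r occupies [rowPtr[r], rowPtr[r + 1])
};

// Convert a row-major dense matrix to CSR, dropping entries that are exactly zero.
CsrMatrix denseToCsr(const std::vector<double>& dense,
                     std::size_t rows, std::size_t cols) {
    CsrMatrix m;
    m.rows = rows;
    m.cols = cols;
    m.rowPtr.push_back(0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const double v = dense[r * cols + c];
            if (v != 0.0) {
                m.values.push_back(v);
                m.colIdx.push_back(c);
            }
        }
        // Close the row: everything stored so far belongs to rows 0..r.
        m.rowPtr.push_back(m.values.size());
    }
    return m;
}

// Sparse matrix times dense vector: y = A * x, touching only stored entries.
std::vector<double> csrMatVec(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        for (std::size_t k = a.rowPtr[r]; k < a.rowPtr[r + 1]; ++k) {
            y[r] += a.values[k] * x[a.colIdx[k]];
        }
    }
    return y;
}
```

For the 3x3 matrix with rows (1, 0, 2), (0, 0, 3) and (4, 5, 0), denseToCsr stores five values with rowPtr = {0, 2, 3, 5}, and multiplying by the vector (1, 2, 3) yields (7, 9, 14). In ArrayFire itself the equivalent operations are exposed through the sparse functions and af::matmul() listed above.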

    Bug Fixes

    • Fixes to edge cases in morph_mat.
    • Makes JIT tree size consistent between devices.
    • Delegate higher-dimension in convolve_mat to correct dimensions.
    • Indexing fixes with C++11.
    • Handle empty arrays as inputs in various functions.
    • Fix bug in af::median when the input has a single element.
    • Fix bug in calculation of time from af::timeit().
    • Fix bug with floating-point numbers in af::seq.
    • Fixes for OpenCL graphics interop on NVIDIA devices.
    • Fix bug when compiling large kernels for AMD devices.
    • Fix bug in af::bilateral when shared memory is over the limit.
    • Fix bug in kernel header compilation tool bin2cpp.
    • Fix initial values for morph_mat functions.
    • Fix bugs in af::homography() CPU and OpenCL kernels.
    • Fix bug in CPU TNJ.

    Improvements

    • CUDA 8 and compute 6.x (Pascal) support; the current installer ships with CUDA 7.5.
    • User-controlled FFT plan caching.
    • CUDA performance improvements for image_func_wrap, image_func_unwrap and approx_mat.
    • Fallback for CUDA-OpenGL interop when the device does not support OpenGL.
    • Additional forms of batching with the transform_func_transform functions.
      New behavior defined here.
    • Update to OpenCL2 headers.
    • Support for integration with external OpenCL contexts.
    • Performance improvements to internal copy in the CPU backend.
    • Performance improvements to af::select and af::replace CUDA kernels.
    • Enable OpenCL-CPU offload by default for devices with Unified Host Memory.
      • To disable, use the environment variable AF_OPENCL_CPU_OFFLOAD=0.
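
The AF_OPENCL_CPU_OFFLOAD variable above is normally set in the shell before the program starts. Where that is not convenient, a program could set it for its own process instead. The sketch below assumes a POSIX system (setenv) and assumes the variable is read when the OpenCL backend initializes, so it would have to run before the first ArrayFire call; neither assumption is confirmed by these notes. The helper names are hypothetical.

```cpp
#include <stdlib.h>  // setenv (POSIX), getenv
#include <string>

// Hypothetical helper: set AF_OPENCL_CPU_OFFLOAD=0 for the current process.
// Assumption: this runs before any ArrayFire function is called.
// Returns true if the variable was set successfully.
bool disableOpenclCpuOffload() {
    return setenv("AF_OPENCL_CPU_OFFLOAD", "0", /*overwrite=*/1) == 0;
}

// Hypothetical helper: read the current setting back.
// Returns an empty string if the variable is not set.
std::string openclCpuOffloadSetting() {
    const char* value = getenv("AF_OPENCL_CPU_OFFLOAD");
    return value ? std::string(value) : std::string();
}
```

Calling disableOpenclCpuOffload() at the top of main() and checking that openclCpuOffloadSetting() returns "0" confirms the variable is visible to the process. Setting it in the launching shell remains the simpler option when available.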

    Build

    • Compilation speedups.
    • Build fixes with MKL.
    • Error message when CMake CUDA Compute Detection fails.
    • Several CMake build issues with Xcode generator fixed.
    • Fix multiple OpenCL definitions at link time.
    • Fix lapacke detection in CMake.
    • Update build tags of
    • Fix builds with GCC 6.1.1 and GCC 5.3.0.

    Installers

    • All installers now ship with ArrayFire libraries built with MKL 2016.
    • All installers now ship with Forge development files and examples included.
    • CUDA Compute 2.0 has been removed from the installers. Please contact us
      directly if you have a special need.

    Examples

    • Added an [example simulating gravity](ref graphics/field.cpp) to
      demonstrate vector field plotting.
    • Improvements to financial/black_scholes_options.cpp example.
    • Improvements to graphics/gravity_sim.cpp example.
    • Fix graphics examples to use af::Window::setAxesLimits and
      af::Window::setAxesTitles functions.

    Documentation & Licensing

    • ArrayFire copyright and trademark policy
    • Fixed grammar in license.
    • Add license information for glbinding.
    • Remove license information for GLEW.
    • Random123 now applies to all backends.
    • Random number functions are now under random_mat.

    Deprecations

    The following functions have been deprecated and may be modified or removed
    permanently from future versions of ArrayFire.

    • af::Window::plot3(): Use af::Window::plot instead.
    • af_draw_plot(): Use af_draw_plot_nd or af_draw_plot_2d instead.
    • af_draw_plot3(): Use af_draw_plot_nd or af_draw_plot_3d instead.
    • af::Window::scatter3(): Use af::Window::scatter instead.
    • af_draw_scatter(): Use af_draw_scatter_nd or af_draw_scatter_2d instead.
    • af_draw_scatter3(): Use af_draw_scatter_nd or af_draw_scatter_3d instead.

    Known Issues

    Certain CUDA functions are known to be broken on Tegra K1. The following ArrayFire tests are currently failing:

    • assign_cuda
    • harris_cuda
    • homography_cuda
    • median_cuda
    • orb_cuda
    • sort_cuda
    • sort_by_key_cuda
    • sort_index_cuda