Changelog History

  • v3.8.rc

    October 05, 2020

    v3.8.0 Release Candidate

    New Functions

    • Ragged max reduction - #2786
    • Initializer list constructor for the array class - #2829, #2987
    • New API for the following statistics functions: cov, var, and stdev - #2986
    • Bit-wise operator support for array and C API (af_bitnot) - #2865
    • allocV2 and freeV2, which return cl_mem on the OpenCL backend - #2911
    • Move constructor and move assignment operator for Dim4 class - #2946

    Improvements

    • Add f16 support for histogram - #2984
    • Update confidence connected components example for better illustration - #2968
    • Enable disk caching of OpenCL kernel binaries - #2970
    • Refactor extension of kernel binaries stored to disk .bin - #2970
    • Add minimum driver versions for CUDA toolkit 11 in internal map - #2982
    • Improve warning messages from run-time kernel compilation functions - #2996

    Fixes

    • Fix bias factor of variance in var_all and cov functions - #2986
    • Fix a race condition in confidence connected components function for OpenCL backend - #2969
    • Safely ignore disk cache failures in CUDA backend for compiled kernel binaries - #2970
    • Fix randn by passing in correct values to Box-Muller - #2980
    • Fix rounding issues in Box-Muller function used for RNG - #2980
    • Fix problems in RNG for older compute architectures with fp16 - #2980, #2996
    • Fix performance regression of approx functions - #2977
    • Remove the assert that checks that signal and filter types are the same - #2993
    • Fix checkAndSetDevMaxCompute when the device cc is greater than max - #2996
    • Fix documentation errors and warnings - #2973, #2987
    • Add missing opencl-arrayfire interoperability functions in unified backend - #2981
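
    The two Box-Muller fixes above concern the transform that turns uniform random samples into normally distributed ones. As background, here is a minimal standalone sketch of the basic form of that transform. It is not ArrayFire's kernel; the function name box_muller and the use of double precision are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Basic Box-Muller transform: maps two uniform samples to two independent
// standard-normal samples. Illustrative only; not ArrayFire's implementation.
std::pair<double, double> box_muller(double u1, double u2) {
    // u1 must lie in (0, 1]; log(0) would give an infinite radius.
    assert(u1 > 0.0 && u1 <= 1.0);
    const double two_pi = 6.283185307179586;
    const double radius = std::sqrt(-2.0 * std::log(u1));
    const double angle  = two_pi * u2;
    return {radius * std::cos(angle), radius * std::sin(angle)};
}
```

    The sketch shows why the inputs matter: a uniform sample of exactly zero, or one rounded to zero in a low-precision type, makes the logarithm diverge.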

    Contributions

    Special thanks to our contributors: P. J. Reed

  • v3.7.3

    November 23, 2020

    v3.7.3

    Improvements

    • Add f16 support for histogram - #2984
    • Update confidence connected components example for better illustration - #2968
    • Enable disk caching of OpenCL kernel binaries - #2970
    • Refactor extension of kernel binaries stored to disk .bin - #2970
    • Add minimum driver versions for CUDA toolkit 11 in internal map - #2982
    • Improve warning messages from run-time kernel compilation functions - #2996

    Fixes

    • Fix bias factor of variance in var_all and cov functions - #2986
    • Fix a race condition in confidence connected components function for OpenCL backend - #2969
    • Safely ignore disk cache failures in CUDA backend for compiled kernel binaries - #2970
    • Fix randn by passing in correct values to Box-Muller - #2980
    • Fix rounding issues in Box-Muller function used for RNG - #2980
    • Fix problems in RNG for older compute architectures with fp16 - #2980, #2996
    • Fix performance regression of approx functions - #2977
    • Remove the assert that checks that signal and filter types are the same - #2993
    • Fix checkAndSetDevMaxCompute when the device cc is greater than max - #2996
    • Fix documentation errors and warnings - #2973, #2987
    • Add missing opencl-arrayfire interoperability functions in unified backend - #2981
    • Fix constexpr-related compilation error with VS2019 and Clang compilers - #3049
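
    The "bias factor of variance" fix refers to the divisor used when computing a variance: n for the population (biased) estimate, n - 1 for the sample (unbiased) estimate. The standalone sketch below illustrates the difference. The function name variance and the population flag are hypothetical and do not reflect ArrayFire's var or cov signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variance of a sample. With population == true the sum of squared
// deviations is divided by n; otherwise by n - 1 (Bessel's correction).
// Hypothetical helper for illustration; not the ArrayFire API.
double variance(const std::vector<double>& x, bool population) {
    const std::size_t n = x.size();
    // The unbiased estimate needs at least two samples.
    assert(n > (population ? 0u : 1u));
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(n);
    double ss = 0.0;
    for (double v : x) ss += (v - mean) * (v - mean);
    return ss / static_cast<double>(population ? n : n - 1);
}
```

    For small inputs the two divisors give noticeably different results, which is why an incorrect bias factor is a correctness bug rather than a cosmetic one.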

    Contributions

    Special thanks to our contributors: P. J. Reed

  • v3.7.2

    July 13, 2020

    v3.7.2

    Improvements

    • Cache CUDA kernels to disk to improve load times (thanks to @cschreib-ibex) #2848
    • Statically link against CUDA libraries #2785
    • Make cuDNN an optional build dependency #2836
    • Improve support for different compilers and OS #2876 #2945 #2925 #2942 #2943
    • Improve performance of join and transpose on CPU #2849
    • Improve documentation #2816 #2821 #2846 #2918 #2928 #2947
    • Reduce binary size using NVRTC and by reducing template instantiations #2849 #2861 #2890
    • Improve reduceByKey performance on OpenCL by using builtin functions #2851
    • Improve support for Intel OpenCL GPUs #2855
    • Allow statically linking against MKL #2877 (sponsored by SDL)
    • Better support for older CUDA toolkits #2923
    • Add support for CUDA 11 #2939
    • Add support for ccache for faster builds #2931
    • Add support for the conan package manager on Linux #2875
    • Propagate build errors up the stack in AFError exceptions #2948 #2957
    • Improve runtime dependency library loading #2954
    • Improved cuDNN runtime checks and warnings #2960
    • Document af_memory_manager_* native memory return values #2911
    • Add support for cuDNN 8 #2963

    Fixes

    • Fix crash when allocating large arrays #2827
    • Fix various compiler warnings #2827 #2849 #2872 #2876
    • Fix minor leaks in OpenCL functions #2913
    • Various continuous integration related fixes #2819
    • Fix zero padding with convolve2NN #2820
    • Fix af_get_memory_pressure_threshold return value #2831
    • Increased the max filter length for morph
    • Handle empty array inputs for LU, QR, and Rank functions #2838
    • Fix FindMKL.cmake script for sequential threading library #2840
    • Various internal refactoring #2839 #2861 #2864 #2873 #2890 #2891 #2913
    • Fix OpenCL 2.0 builtin function name conflict #2851
    • Fix error caused when releasing memory with multiple devices #2867
    • Fix missing set stacktrace symbol from unified API #2915
    • Fix zero padding issue in convolve2NN #2820
    • Fixed bugs in ReduceByKey #2957
    • Add clblast patch to handle custom context with multiple devices #2967

    Contributions

    Special thanks to our contributors:
    Corentin Schreiber
    Jacob Kahn
    Paul Jurczak
    Christoph Junghans

  • v3.7.1

    March 28, 2020

    v3.7.1

    Improvements

    • Improve mtx download for test data #2742
    • Improve documentation #2754 #2792 #2797
    • Remove verbose messages in older CMake versions #2773
    • Reduce binary size with the use of NVRTC #2790
    • Use texture memory to load LUT in orb and fast #2791
    • Add missing print function for f16 #2784
    • Add checks for f16 support in the CUDA backend #2784
    • Create a thrust policy to intercept temporary buffer allocations #2806

    Fixes

    • Fix segfault on exit when ArrayFire is not initialized in the main thread
    • Fix support for CMake 3.5.1 #2771 #2772 #2760
    • Fix evalMultiple if the input array sizes aren't the same #2766
    • Fix error when AF_BACKEND_DEFAULT is passed directly to backend #2769
    • Work around name collision with AMD OpenCL implementation #2802
    • Fix on-exit errors with the unified backend #2769
    • Fix check for f16 compatibility in OpenCL #2773
    • Fix matmul on Intel OpenCL when passing same array as input #2774
    • Fix CPU OpenCL BLAS batching #2774
    • Fix memory pressure in the default memory manager #2801

    Contributions

    Special thanks to our contributors:
    padentomasello
    glavaux2

  • v3.7.0

    February 13, 2020

    v3.7.0

    Major Updates

    • Added the ability to customize the memory manager (thanks to jacobkahn and flashlight) [#2461]
    • Added 16-bit floating point support for several functions [#2413] [#2587] [#2585] [#2583]
    • Added sumByKey, productByKey, minByKey, maxByKey, allTrueByKey, anyTrueByKey, countByKey [#2254]
    • Added confidence connected components [#2748]
    • Added neural network based convolution and gradient functions [#2359]
    • Added a padding function [#2682]
    • Added pinverse for pseudo inverse [#2279]
    • Added support for uniform ranges in approx1 and approx2 functions [#2297]
    • Added support to write to preallocated arrays for some functions [#2599] [#2481] [#2328] [#2327]
    • Added meanvar function [#2258]
    • Added support for sparse-sparse arithmetic [#2312]
    • Added rsqrt function for reciprocal square root [#2500]
    • Added a lower level af_gemm function for general matrix multiplication [#2481]
    • Added a function to set the cuBLAS math mode for the CUDA backend [#2584]
    • Separate debug symbols into separate files [#2535]
    • Print stacktraces on errors [#2632]
    • Support move constructor for af::array [#2595]
    • Expose events in the public API [#2461]
    • Add setAxesLabelFormat to format labels on graphs [#2495]
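
    The ByKey functions listed above reduce values grouped by an accompanying key array. As a rough illustration of the idea, the standalone sketch below sums values over runs of consecutive equal keys, in the style of Thrust's reduce_by_key. The helper name sum_by_key and the consecutive-run grouping are assumptions; the changelog does not specify the exact semantics of ArrayFire's functions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sums values over each run of consecutive equal keys and returns the
// (run keys, per-run sums) pair. Hypothetical helper, not the ArrayFire API.
std::pair<std::vector<int>, std::vector<double>>
sum_by_key(const std::vector<int>& keys, const std::vector<double>& vals) {
    assert(keys.size() == vals.size());
    std::vector<int> out_keys;
    std::vector<double> out_sums;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (out_keys.empty() || out_keys.back() != keys[i]) {
            out_keys.push_back(keys[i]);  // a new run starts here
            out_sums.push_back(0.0);
        }
        out_sums.back() += vals[i];
    }
    return {out_keys, out_sums};
}
```

    Under this grouping rule a key that reappears after a different key starts a new run, so sorting by key first would be needed to get one total per distinct key.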

    Improvements

    • Better error messages for systems with driver or device incompatibilities [#2678] [#2448] [#2761]
    • Optimized unified backend function calls [#2695]
    • Optimized anisotropic smoothing [#2713]
    • Optimized canny filter for CUDA and OpenCL [#2727]
    • Better MKL search script [#2738] [#2743] [#2745]
    • Better logging of different submodules in ArrayFire [#2670] [#2669]
    • Improve documentation [#2665] [#2620] [#2615] [#2639] [#2628] [#2633] [#2622] [#2617] [#2558] [#2326] [#2515]
    • Optimized af::array assignment [#2575]
    • Update the k-means example to display the result [#2521]

    Fixes

    • Fix multi-config generators [#2736]
    • Fix access errors in canny [#2727]
    • Fix segfault in the unified backend if no backends are available [#2720]
    • Fix access errors in scan-by-key [#2693]
    • Fix sobel operator [#2600]
    • Fix an issue with the random number generator and s16 [#2587]
    • Fix issue with boolean product reduction [#2544]
    • Fix array_proxy move constructor [#2537]
    • Fix convolve3 launch configuration [#2519]
    • Fix an issue where the fft function modified the input array [#2520]
    • Added a workaround for nvidia-opencl runtime if forge dependencies are missing [#2761]

    Contributions

    Special thanks to our contributors:
    @jacobkahn
    @WilliamTambellini
    @lehins
    @r-barnes
    @gaika
    @ShalokShalom

  • v3.6.4

    May 20, 2019

    v3.6.4

    The source code with sub-modules can be downloaded directly from the following link:

    http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.4.tar.bz2

    Fixes

    • Address a JIT performance regression due to moving kernel arguments to shared memory #2501
    • Fix the default parameter for setAxisTitle #2491

  • v3.6.3

    April 22, 2019

    v3.6.3

    The source code with sub-modules can be downloaded directly from the following link:

    http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.3.tar.bz2

    Improvements

    • Graphics are now a runtime dependency instead of a link time dependency #2365
    • Reduce the CUDA backend binary size using runtime compilation of kernels #2437
    • Improved batched matrix multiplication on the CPU backend by using Intel MKL's cblas_Xgemm_batched #2206
    • Print JIT kernels to disk or stream using the AF_JIT_KERNEL_TRACE environment variable #2404
    • void* pointers are now allowed as arguments to af::array::write() #2367
    • Slightly improve the efficiency of JITed tile operations #2472
    • Make random number generation on the CPU backend consistent with CUDA and OpenCL #2435
    • Handled very large JIT tree generations #2484 #2487

    Bug Fixes

    • Fixed af::array::array_proxy move assignment operator #2479
    • Fixed input array dimensions validation in svdInplace() #2331
    • Fixed the typedef declaration for window resource handle #2357
    • Increase compatibility with GCC 8 #2379
    • Fixed af::write tests #2380
    • Fixed a bug in broadcast step of 1D exclusive scan #2366
    • Fixed OpenGL related build errors on OSX #2382
    • Fixed multiple array evaluation. Performance improvement. #2384
    • Fixed buffer overflow and expected output of kNN SSD small test #2445
    • Fixed MKL linking order to enable threaded BLAS #2444
    • Added validations for forge module plugin availability before calling resource cleanup #2443
    • Improve compatibility on MSVC toolchain (_MSC_VER > 1914) with the CUDA backend #2443
    • Fixed BLAS gemm func generators for newest MSVC 19 on VS 2017 #2464
    • Fix errors on exit when using the CUDA backend with unified #2470

    Documentation

    • Updated svdInplace() documentation following a bugfix #2331
    • Fixed a typo in matrix multiplication documentation #2358
    • Fixed a code snippet demonstrating C-API use #2406
    • Updated hamming matcher implementation limitation #2434
    • Added illustration for the rotate function #2453

    Misc

    • Use cudaMemcpyAsync instead of cudaMemcpy throughout the codebase #2362
    • Display a more informative error message if CUDA driver is incompatible #2421 #2448
    • Changed forge resource management to use smart pointers #2452
    • Deprecated intl and uintl typedefs in API #2360
    • Enabled graphics by default for all builds starting with v3.6.3 #2365
    • Fixed several warnings #2344 #2356 #2361
    • Refactored initArray() calls to use createEmptyArray(). initArray() is for internal use only by Array class. #2361
    • Refactored void* memory allocations to use unsigned char type #2459
    • Replaced deprecated MKL API with in-house implementations for sparse to sparse/dense conversions #2312
    • Reorganized and fixed some internal backend API #2356
    • Updated compilation order of CUDA files to speed up compile time #2368
    • Removed conditional graphics support builds after enabling runtime loading of graphics dependencies #2365
    • Marked graphics dependencies as optional in CPack RPM config #2365
    • Refactored a sparse arithmetic backend API #2379
    • Fixed const correctness of af_device_array API #2396
    • Update Forge to v1.0.4 #2466
    • Manage Forge resources from the DeviceManager class #2381
    • Fixed non-mkl & non-batch blas upstream call arguments #2401
    • Link MKL with OpenMP instead of TBB by default
    • Use clang-format to format source code

    Contributions

    Special thanks to our contributors:
    Alessandro Bessi
    zhihaoy
    Jacob Kahn
    William Tambellini

  • v3.6.2

    November 29, 2018

    v3.6.2

    The source code with sub-modules can be downloaded directly from the following link:

    http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.2.tar.bz2

    Features

    • Batching support for cond argument in select() [#2243]
    • Broadcast batching for matmul [#2315]
    • Add support for multiple nearest neighbours from nearestNeighbour() [#2280]

    Improvements

    • Performance improvements in morph() [#2238]
    • Fix linking errors when compiling without Freeimage/Graphics [#2248]
    • Fixes to improve the usage of ArrayFire as a sub-project [#2290]
    • Allow custom library path for loading dynamic backend libraries [#2302]

    Bug fixes

    • Fix overflow in dim4::ndims [#2289]
    • Remove setDevice from af::array destructor [#2319]
    • Fix pow precision for integral types [#2305]
    • Fix issues with tile with a large repeat dimension [#2307]
    • Fix grid based indexing calculation in af_draw_hist [#2230]
    • Fix bug when using an af::array for indexing [#2311]
    • Fix CLBlast errors on exit on Windows [#2222]

    Documentation

    • Improve unwrap documentation [#2301]
    • Improve wrap documentation [#2320]
    • Fix and improve accum documentation [#2298]
    • Improve tile documentation [#2293]
    • Clarify approx* indexing in documentation [#2287]
    • Update examples of select in detailed documentation [#2277]
    • Update lookup examples [#2288]
    • Update set documentation [#2299]

    Misc

    • New ArrayFire ASSERT utility functions [#2249] [#2256] [#2257] [#2263]
    • Improve error messages in JIT [#2309]
    • af* library and dependencies directory changed to lib64 [#2186]

    Contributions

    Thank you to our contributors:
    Jacob Kahn
    Vardan Akopian

  • v3.6.1

    July 06, 2018

    v3.6.1

    The source code for this release can be downloaded here:
    http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.1.tar.bz2

    Improvements

    • FreeImage is now a run-time dependency [#2164]
    • Reduced binary size by setting the symbol visibility to hidden [#2168]
    • Add logging to memory manager and unified loader using the AF_TRACE environment variable [#2169] [#2216]
    • Improved CPU Anisotropic Diffusion performance [#2174]
    • Perform normalization after FFT for improved accuracy [#2185, #2192]
    • Updated CLBlast to v1.4.0 [#2178]
    • Added additional validation when using af::seq for indexing [#2153]
    • Perform checks for unsupported cards by the CUDA implementation [#2182]
    • Avoid selecting backend if no devices are found [#2218]

    Bug Fixes

    • Fixed region when all pixels were the foreground or background [#2152]
    • Fixed several memory leaks [#2202, #2201, #2180, #2179, #2177, #2175]
    • Fixed bug in setDevice which didn't allow you to select the last device [#2189]
    • Fixed bug in min/max where the first element of the array was a NaN value [#2155]
    • Fixed graphics window indexing [#2207]
    • Fixed renaming issue when installing CUDA libraries on OSX [#2221]
    • Fixed NSIS installer PATH variable [#2223]

  • v3.6.0

    May 04, 2018

    v3.6.0

    The source code with submodules can be downloaded directly from the following link:
    http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.0.tar.bz2

    Major Updates

    • Added the topk() function.
    • Added batched matrix multiply support.
    • Added anisotropic diffusion, anisotropicDiffusion().

    Features

    • Added support for batched matrix multiply.
    • New anisotropic diffusion function, anisotropicDiffusion().
    • New topk() function, which returns the top k elements along a given dimension of the input.
    • New gradient diffusion example.
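
    The topk() entry above describes selecting the k largest elements along a dimension. The standalone sketch below shows the one-dimensional idea on a std::vector using std::partial_sort. It is an illustration only: the helper name top_k is hypothetical, and the returned order and lack of index output are assumptions, not a description of ArrayFire's topk().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the k largest values of `data` in descending order.
// One-dimensional illustration only; not ArrayFire's topk().
std::vector<double> top_k(std::vector<double> data, std::size_t k) {
    assert(k <= data.size());
    // Only the first k positions are fully ordered, which is cheaper than
    // sorting the whole input when k is small.
    std::partial_sort(data.begin(),
                      data.begin() + static_cast<std::ptrdiff_t>(k),
                      data.end(), std::greater<double>());
    data.resize(k);
    return data;
}
```

    Taking the vector by value keeps the caller's data untouched while letting the helper reorder its own copy.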

    Improvements

    • JITed select() and shift() functions for CUDA and OpenCL backends.
    • Significant CMake improvements.
    • Improved the quality of the random number generator.
    • Corrected assert function calls in select() tests.
    • Modified af_colormap struct to match forge's definition.
    • Improved Black Scholes example.
    • Used CPack to generate installers. We will be using CPack to generate installers beginning with this release.
    • Refactored black_scholes_options example to use built-in af::erfc function for cumulative normal distribution.
    • Reduced the scope of mutexes in memory manager.
    • Official installers do not require the CUDA toolkit to be installed starting with v3.6.0.

    Bug fixes

    • Fixed shfl_down() warnings with CUDA 9.
    • Disabled CUDA JIT debug flags on ARM architecture.
    • Fixed CLBlast install lib dir for Linux platforms where the lib directory has an arch (64) suffix.
    • Fixed assert condition in 3d morph OpenCL kernel.
    • Fixed JIT errors with large non-linear kernels.
    • Fixed bug in CPU JIT after moddims was called.
    • Fixed a deadlock scenario caused by the method MemoryManager::nativeFree.

    Documentation

    • Fixed variable name typo in vectorization.md.
    • Fixed AF_API_VERSION value in Doxygen config file.

    Known issues

    • NVCC does not currently support platform toolset v141 (Visual Studio 2017 R15.6). Use the v140 platform toolset instead. You may pass the toolset version to CMake via the -T flag, like so: cmake -G "Visual Studio 15 2017 Win64" -T v140
    • Several OpenCL tests fail on OSX:
      • canny_opencl, fft_opencl, gen_assign_opencl, homography_opencl, reduce_opencl, scan_by_key_opencl, solve_dense_opencl, sparse_arith_opencl, sparse_convert_opencl, where_opencl

    Contributions

    Special thanks to our contributors:
    Adrien F. Vincent, Cedric Nugteren, Felix, Filip Matzner, HoneyPatouceul, Patrick Lavin, Ralf Stubner, William Tambellini