ArrayFire v3.8.rc Release Notes

Release Date: 2020-10-05
  • v3.8.0 Release Candidate

    New Functions

    • Ragged max reduction - #2786
    • Initialization list constructor for array class - #2829, #2987
    • New API for the following statistics functions: cov, var, and stdev - #2986
    • Bitwise operator support for array and C API (af_bitnot) - #2865
    • allocV2 and freeV2, which return cl_mem on the OpenCL backend - #2911
    • Move constructor and move assignment operator for Dim4 class - #2946

    Improvements

    • Add f16 support for histogram - #2984
    • Update confidence connected components example for better illustration - #2968
    • Enable disk caching of OpenCL kernel binaries - #2970
    • Change the extension of kernel binaries stored on disk to .bin - #2970
    • Add minimum driver versions for CUDA toolkit 11 in internal map - #2982
    • Improve warning messages from run-time kernel compilation functions - #2996

    Fixes

    • Fix bias factor of variance in var_all and cov functions - #2986
    • Fix a race condition in confidence connected components function for OpenCL backend - #2969
    • Safely ignore disk cache failures in CUDA backend for compiled kernel binaries - #2970
    • Fix randn by passing in correct values to Box-Muller - #2980
    • Fix rounding issues in Box-Muller function used for RNG - #2980
    • Fix problems in RNG for older compute architectures with fp16 - #2980, #2996
    • Remove the assert that required signal and filter types to be the same - #2993
    • Fix performance regression of approx functions - #2977
    • Fix checkAndSetDevMaxCompute when the device cc is greater than max - #2996
    • Fix documentation errors and warnings - #2973, #2987
    • Add missing opencl-arrayfire interoperability functions in unified backend - #2981

    Contributions

    Special thanks to our contributors: P. J. Reed


Previous changes from v3.7.2

  • v3.7.2

    Improvements

    • Cache CUDA kernels to disk to improve load times (thanks to @cschreib-ibex) #2848
    • Statically link against CUDA libraries #2785
    • Make cuDNN an optional build dependency #2836
    • Improve support for different compilers and OS #2876 #2945 #2925 #2942 #2943
    • Improve performance of join and transpose on CPU #2849
    • Improve documentation #2816 #2821 #2846 #2918 #2928 #2947
    • Reduce binary size by using NVRTC and reducing template instantiations #2849 #2861 #2890
    • Improve reduceByKey performance on OpenCL by using builtin functions #2851
    • Improve support for Intel OpenCL GPUs #2855
    • Allow statically linking against MKL #2877 (sponsored by SDL)
    • Better support for older CUDA toolkits #2923
    • Add support for CUDA 11 #2939
    • Add support for ccache for faster builds #2931
    • Add support for the Conan package manager on Linux #2875
    • Propagate build errors up the stack in AFError exceptions #2948 #2957
    • Improve runtime dependency library loading #2954
    • Improve cuDNN runtime checks and warnings #2960
    • Document af_memory_manager_* native memory return values #2911
    • Add support for cuDNN 8 #2963

    Fixes

    • Fix crash when allocating large arrays #2827
    • Fix various compiler warnings #2827 #2849 #2872 #2876
    • Fix minor leaks in OpenCL functions #2913
    • Various continuous integration related fixes #2819
    • Fix zero padding issue in convolve2NN #2820
    • Fix af_get_memory_pressure_threshold return value #2831
    • Increase the max filter length for morph
    • Handle empty array inputs for LU, QR, and Rank functions #2838
    • Fix FindMKL.cmake script for sequential threading library #2840
    • Various internal refactoring #2839 #2861 #2864 #2873 #2890 #2891 #2913
    • Fix OpenCL 2.0 builtin function name conflict #2851
    • Fix error caused when releasing memory with multiple devices #2867
    • Fix missing set stacktrace symbol from unified API #2915
    • Fix bugs in reduceByKey #2957
    • Add clblast patch to handle custom context with multiple devices #2967

    Contributions

    Special thanks to our contributors:
    Corentin Schreiber
    Jacob Kahn
    Paul Jurczak
    Christoph Junghans