alpaka v0.4.0 Release Notes

Release Date: 2020-01-14

    Compatibility Changes:

    • ➕ added support for CUDA 10.0, 10.1 and 10.2
    • ⬇️ dropped support for CUDA 7.0 and 7.5
    • ➕ added official support for Visual Studio 2017 on Windows with CUDA 10 (now built on Travis CI instead of AppVeyor)
    • ➕ added support for Xcode 10.2 through 11.3 (no official CUDA support yet)
    • ➕ added support for Ubuntu 18.04
    • ➕ added support for gcc 9
    • ➕ added support for clang 7.0, 8.0 and 9.0
    • ⬇️ dropped support for clang 3.5, 3.6, 3.7, 3.8 and 3.9
    • ➕ added support for CMake 3.13, 3.14, 3.15 and 3.16
    • ⬇️ dropped support for CMake 3.11.3 and lower; 3.11.4 is now the lowest supported version
    • ➕ added support for Boost 1.69, 1.70 and 1.71
    • ➕ added support for usage of libc++ instead of libstdc++ for clang builds
    • removed the dependency on Boost.MPL and BOOST_CURRENT_FUNCTION
    • ✅ replaced Boost.Test with Catch2; an internal version of Catch2 is used by default, but an external one can be used instead

    🐛 Bug Fixes:

    • 🛠 fixed some incorrect host/device function attributes
    • 🛠 fixed warning about comparison unsigned < 0
    • it is no longer necessary to manually disable all other backends when using ALPAKA_ACC_GPU_CUDA_ONLY_MODE
    • 🛠 fixed static block shared memory of types with alignment higher than defaultAlignment
    • 🛠 fixed race condition in HIP/NVCC queue
    • 🛠 fixed data races when a GPU updates host memory by always aligning host memory buffers to 4 KiB
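
The last fix above relies on host buffers starting on a 4 KiB boundary. The following is a minimal sketch of that idea only, not alpaka's actual allocator: the namespace `sketch`, the type `AlignedHostBuf`, and the helper `isAligned` are hypothetical names introduced for illustration, and the code assumes C++17 for the aligned form of `operator new`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

namespace sketch {  // hypothetical namespace, not part of alpaka

// 4 KiB, the alignment named in the release note.
constexpr std::size_t kHostBufAlignment = 4096;

// Owns a raw host allocation whose start address is 4 KiB aligned.
struct AlignedHostBuf {
    explicit AlignedHostBuf(std::size_t bytes)
        : ptr(::operator new(bytes, std::align_val_t{kHostBufAlignment})) {}

    ~AlignedHostBuf() {
        // The aligned delete must match the aligned new used above.
        ::operator delete(ptr, std::align_val_t{kHostBufAlignment});
    }

    // Non-copyable so the allocation is freed exactly once.
    AlignedHostBuf(AlignedHostBuf const&) = delete;
    AlignedHostBuf& operator=(AlignedHostBuf const&) = delete;

    void* ptr;
};

// Returns true if p lies on a 4 KiB boundary.
inline bool isAligned(void const* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kHostBufAlignment == 0;
}

}  // namespace sketch
```

Constructing `sketch::AlignedHostBuf buf(100);` and checking `sketch::isAligned(buf.ptr)` should hold for any requested size, since the alignment is independent of the byte count.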

    🆕 New Features:

    • ➕ Added a new alpaka Logo!
    • the whole alpaka code has been relicensed to MPL2 and the examples to ISC
    • added ALPAKA_CXX_STANDARD CMake option, which selects the C++ standard to be used
    • added ALPAKA_CUDA_NVCC_SEPARABLE_COMPILATION option to enable separable compilation for nvcc
    • added ALPAKA_CUDA_NVCC_EXPT_EXTENDED_LAMBDA and ALPAKA_CUDA_NVCC_EXPT_RELAXED_CONSTEXPR CMake options to enable/disable those nvcc options (they were always ON before)
    • ➕ added headers for standalone usage without CMake (alpaka/standalone/GpuCudaRt.h, ...) which set the backend defines
    • ➕ added an experimental HIP back-end using nvcc (HIP >= 1.5.1 required, latest rocRand). More on HIP setup: doc/markdown/user/implementation/mapping/HIP.md
    • ➕ added sincos math function implementations
    • 👍 allowed copy and move construction of ViewPlainPtr
    • ➕ added support for CUDA atomics using "unsigned long int"
    • ➕ added compile-time error for atomic CUDA ops which are not available due to sm restrictions
    • ➕ added explicit errors for unsupported types/operations for CUDA atomics
    • replaced usages of assert with ALPAKA_ASSERT
    • 👌 replaced BOOST_VERIFY by ALPAKA_CHECK and returned success from all test kernels
    • added alpaka::ignore_unused as a replacement for boost::ignore_unused
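
As a rough illustration of what an `ignore_unused`-style helper does, here is a minimal sketch, assuming C++14. It is a hypothetical stand-in placed in a `sketch` namespace, not alpaka's implementation; the function `scaled` and its parameters are likewise invented for the example.

```cpp
#include <cassert>

namespace sketch {  // hypothetical namespace, not part of alpaka

// Accepts any number of arguments by const reference and does nothing.
// Naming a variable in the call counts as a use, which silences
// "unused parameter" and "unused variable" warnings.
template <typename... Ts>
constexpr void ignore_unused(Ts const&...) noexcept {}

}  // namespace sketch

// Example function with a parameter that is only relevant in some builds.
inline int scaled(int value, int debugLevel) {
    sketch::ignore_unused(debugLevel);  // keeps warning-free builds
    return value * 2;
}
```

Because the helper has an empty body, a compiler would typically optimize the call away entirely, so it should carry no run-time cost.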

    💥 Breaking changes:

    • renamed Queue*Async to Queue*NonBlocking and Queue*Sync to Queue*Blocking
    • 📇 renamed alpaka::size::Size to alpaka::idx::Idx, alpaka::size::SizeType to alpaka::idx::IdxType (and TSize to TIdx internally)
    • replaced ALPAKA_FN_ACC_NO_CUDA by ALPAKA_FN_HOST
    • replaced ALPAKA_FN_ACC_CUDA_ONLY by direct usage of __device__
    • renamed ALPAKA_STATIC_DEV_MEM_CONSTANT to ALPAKA_STATIC_ACC_MEM_CONSTANT and ALPAKA_STATIC_DEV_MEM_GLOBAL to ALPAKA_STATIC_ACC_MEM_GLOBAL
    • 📇 renamed alpaka::kernel::createTaskExec to alpaka::kernel::createTaskKernel
    • QueueCpuSync now correctly blocks when called from multiple threads
      ** This broke some previous use-cases (e.g. usage within existing OpenMP parallel regions)
      ** This use case can now be handled with the support for external CPU queues, as can be seen in the example QueueCpuOmp2CollectiveImpl
    • previously it was possible to have kernels return values even though they were always ignored. Now kernels are checked to always return void
    • renamed all files with *Stl suffix to *StdLib
    • renamed BOOST_ARCH_CUDA_DEVICE to BOOST_ARCH_PTX
    • executors have been renamed due to the upcoming standard C++ feature with a different meaning. All files within alpaka/exec/ have been moved to alpaka/kernel/ and the files and classes have been renamed from Exec* to TaskKernel*. This should not affect users of alpaka but will affect extensions.
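
The void-return requirement for kernels listed above can be pictured as a compile-time check on the kernel's call operator. The sketch below shows one way such a check could be written in standard C++11. The trait `ReturnsVoid` and the two kernel structs are hypothetical, and this is not necessarily how alpaka performs the check internally.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of alpaka

// True if calling a const TKernel with arguments TArgs... yields void.
template <typename TKernel, typename... TArgs>
struct ReturnsVoid
    : std::is_void<decltype(std::declval<TKernel const&>()(
          std::declval<TArgs>()...))> {};

}  // namespace sketch

// A functor whose call operator returns void: accepted by the check.
struct GoodKernel {
    void operator()(int* out, int v) const { *out = v; }
};

// A functor that returns a value: the kind of kernel that is now rejected.
struct BadKernel {
    int operator()(int* out, int v) const { *out = v; return v; }
};

static_assert(sketch::ReturnsVoid<GoodKernel, int*, int>::value,
              "kernels must return void");
static_assert(!sketch::ReturnsVoid<BadKernel, int*, int>::value,
              "a value-returning kernel fails the check");
```

A launch function could place a `static_assert` on such a trait so that a value-returning kernel fails at compile time instead of having its result silently dropped.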

Previous changes from v0.3.6

  • 🐛 Bug Fixes:

    • 🛠 fix cuda stream race condition #850
    • 🛠 fix: cuda exceptions #844
    • math/abs: Added trait specialisation for double. #862
    • alpaka/math Overloaded float specialization #837
    • 🛠 Fixes name conflicts in alpaka math functions. #784
