alpaka v0.4.0 Release Notes

Release Date: 2020-01-14

Compatibility Changes:
- ➕ added support for CUDA 10.0, 10.1 and 10.2
- ⬇️ dropped support for CUDA 7.0 and 7.5
- ➕ added official support for Visual Studio 2017 on Windows with CUDA 10 (now built on Travis CI instead of AppVeyor)
- ➕ added support for Xcode 10.2 through 11.3 (no official CUDA support yet)
- ➕ added support for Ubuntu 18.04
- ➕ added support for gcc 9
- ➕ added support for clang 7.0, 8.0 and 9.0
- ⬇️ dropped support for clang 3.5, 3.6, 3.7, 3.8 and 3.9
- ➕ added support for CMake 3.13, 3.14, 3.15 and 3.16
- ⬇️ dropped support for CMake 3.11.3 and lower; 3.11.4 is now the lowest supported version
- ➕ added support for Boost 1.69, 1.70 and 1.71
- ➕ added support for usage of libc++ instead of libstdc++ for clang builds
- removed the dependency on Boost.MPL and BOOST_CURRENT_FUNCTION
- ✅ replaced Boost.Test with Catch2; an internal version of Catch2 is used by default, but an external one can be used instead

🐛 Bug Fixes:
- 🛠 fixed some incorrect host/device function attributes
- 🛠 fixed warning about comparison unsigned < 0
- it is no longer necessary to manually disable all other backends when using ALPAKA_ACC_GPU_CUDA_ONLY_MODE
- 🛠 fixed static block shared memory of types with alignment higher than defaultAlignment
- 🛠 fixed race condition in HIP/NVCC queue
- 🛠 fixed data races when a GPU updates host memory by always aligning host memory buffers to 4 KiB
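
To illustrate the 4 KiB alignment fix above, here is a minimal C++17 sketch of allocating a host buffer on a 4 KiB boundary. The helper names (roundUp, allocAlignedHost, freeAlignedHost, isAligned) are hypothetical and are not alpaka's API; rounding the size up to a whole multiple of 4 KiB is an added assumption, not something these notes state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Assumption: 4 KiB, the alignment named in the release note.
constexpr std::size_t hostBufAlignment = 4096;

// Round a byte count up to the next multiple of the alignment.
constexpr std::size_t roundUp(std::size_t bytes, std::size_t alignment)
{
    return (bytes + alignment - 1) / alignment * alignment;
}

// Hypothetical helper (not alpaka's API): allocate host memory on a 4 KiB boundary.
inline void* allocAlignedHost(std::size_t bytes)
{
    assert(bytes > 0);
    return ::operator new(roundUp(bytes, hostBufAlignment), std::align_val_t{hostBufAlignment});
}

// Release memory obtained from allocAlignedHost with the matching aligned delete.
inline void freeAlignedHost(void* ptr)
{
    ::operator delete(ptr, std::align_val_t{hostBufAlignment});
}

// Check whether a pointer sits on a multiple of the given alignment.
inline bool isAligned(void const* ptr, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

Each allocAlignedHost call should be paired with freeAlignedHost, because memory from the aligned form of operator new must be released with the aligned form of operator delete.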

🆕 New Features:
- ➕ Added a new alpaka Logo!
- the whole alpaka code has been relicensed to MPL2 and the examples to ISC
- added ALPAKA_CXX_STANDARD CMake option, which selects the C++ standard to be used
- added ALPAKA_CUDA_NVCC_SEPARABLE_COMPILATION option to enable separable compilation for nvcc
- added ALPAKA_CUDA_NVCC_EXPT_EXTENDED_LAMBDA and ALPAKA_CUDA_NVCC_EXPT_RELAXED_CONSTEXPR CMake options to enable/disable those nvcc options (they were always ON before)
- ➕ added headers for standalone usage without CMake (alpaka/standalone/GpuCudaRt.h, ...) which set the backend defines
- ➕ added experimental HIP back-end using nvcc (HIP >= 1.5.1 required, latest rocRand). More on HIP setup: doc/markdown/user/implementation/mapping/HIP.md
- ➕ added sincos math function implementations
- 👍 allowed to copy and move construct ViewPlainPtr
- ➕ added support for CUDA atomics using "unsigned long int"
- ➕ added compile-time error for atomic CUDA ops which are not available due to sm restrictions
- ➕ added explicit errors for unsupported types/operations for CUDA atomics
- replaced usages of assert with ALPAKA_ASSERT
- 👌 replaced BOOST_VERIFY by ALPAKA_CHECK and returned success from all test kernels
- added alpaka::ignore_unused as replacement for boost::ignore_unused
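
The note on alpaka::ignore_unused can be pictured with a small stand-in. The sketch below assumes a variadic function template that accepts any arguments and does nothing; sketch::ignore_unused and checkedHalf are hypothetical names, and alpaka's actual signature may differ.

```cpp
#include <cassert>

namespace sketch
{
    // Hypothetical stand-in: accepts any arguments and deliberately does nothing.
    template<typename... Ts>
    constexpr void ignore_unused(Ts const&...) noexcept
    {
    }
}

// A value used only inside assert becomes unused when NDEBUG is defined.
inline int checkedHalf(int value)
{
    bool const isEven = (value % 2 == 0);
    assert(isEven);                 // compiled out under NDEBUG
    sketch::ignore_unused(isEven);  // avoids an unused-variable warning in that case
    return value / 2;
}
```

A typical use is exactly this pattern: a variable that only feeds an assertion, which would otherwise trigger warnings in release builds.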

💥 Breaking changes:
- renamed Queue*Async to Queue*NonBlocking and Queue*Sync to Queue*Blocking
- 📇 renamed alpaka::size::Size to alpaka::idx::Idx, alpaka::size::SizeType to alpaka::idx::IdxType (and TSize to TIdx internally)
- replaced ALPAKA_FN_ACC_NO_CUDA by ALPAKA_FN_HOST
- replaced ALPAKA_FN_ACC_CUDA_ONLY by direct usage of __device__
- renamed ALPAKA_STATIC_DEV_MEM_CONSTANT to ALPAKA_STATIC_ACC_MEM_CONSTANT and ALPAKA_STATIC_DEV_MEM_GLOBAL to ALPAKA_STATIC_ACC_MEM_GLOBAL
- 📇 renamed alpaka::kernel::createTaskExec to alpaka::kernel::createTaskKernel
- QueueCpuSync now correctly blocks when called from multiple threads
  - This broke some previous use cases (e.g. usage within existing OpenMP parallel regions)
  - Such use cases can now be handled with the support for external CPU queues, as can be seen in the example QueueCpuOmp2CollectiveImpl
- previously, kernels could return values even though these were always ignored. Kernels are now checked to always return void
- renamed all files with *Stl suffix to *StdLib
- renamed BOOST_ARCH_CUDA_DEVICE to BOOST_ARCH_PTX
- executors have been renamed due to the upcoming standard C++ feature with a different meaning. All files within alpaka/exec/ have been moved to alpaka/kernel/ and the files and classes have been renamed from Exec* to TaskKernel*. This should not affect users of alpaka but will affect extensions.
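
The void-return check for kernels listed above can be sketched with standard type traits. This is an illustration under stated assumptions, not alpaka's implementation: FakeAcc, returnsVoid, launch, GoodKernel and BadKernel are all hypothetical names, and C++17 is assumed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical placeholder for an accelerator type.
struct FakeAcc
{
};

// True when calling the kernel with (FakeAcc const&, TArgs...) yields void.
template<typename TKernel, typename... TArgs>
constexpr bool returnsVoid
    = std::is_void_v<std::invoke_result_t<TKernel const&, FakeAcc const&, TArgs...>>;

// Hypothetical launcher: rejects kernels with a non-void return type at compile time.
template<typename TKernel, typename... TArgs>
void launch(TKernel const& kernel, TArgs&&... args)
{
    static_assert(returnsVoid<TKernel, TArgs...>, "kernels must return void");
    kernel(FakeAcc{}, std::forward<TArgs>(args)...);
}

// Accepted: the call operator returns void.
struct GoodKernel
{
    void operator()(FakeAcc const&, int* out) const
    {
        assert(out != nullptr);
        *out = 42;
    }
};

// Rejected by launch: the call operator returns a value.
struct BadKernel
{
    int operator()(FakeAcc const&, int* out) const
    {
        *out = 42;
        return 0;
    }
};
```

Passing BadKernel to launch would stop compilation at the static_assert, which mirrors the intent of the change: a returned value that nothing ever reads is reported instead of silently dropped.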

Previous changes from v0.3.6

🐛 Bug Fixes:
- 🛠 fix cuda stream race condition #850
- 🛠 fix: cuda exceptions #844
- math/abs: Added trait specialisation for double. #862
- alpaka/math Overloaded float specialization #837
- 🛠 Fixes name conflicts in alpaka math functions. #784