alpaka v0.4.0 Release Notes
Release Date: 2020-01-14
Compatibility Changes:
- ➕ added support for CUDA 10.0, 10.1 and 10.2
- ⬇️ dropped support for CUDA 7.0 and 7.5
- ➕ added official support for Visual Studio 2017 on Windows with CUDA 10 (now built on Travis CI instead of AppVeyor)
- ➕ added support for Xcode 10.2 to 11.3 (no official CUDA support yet)
- ➕ added support for Ubuntu 18.04
- ➕ added support for gcc 9
- ➕ added support for clang 7.0, 8.0 and 9.0
- ⬇️ dropped support for clang 3.5, 3.6, 3.7, 3.8 and 3.9
- ➕ added support for CMake 3.13, 3.14, 3.15 and 3.16
- ⬇️ dropped support for CMake 3.11.3 and lower, 3.11.4 is the lowest supported version
- ➕ added support for Boost 1.69, 1.70 and 1.71
- ➕ added support for usage of libc++ instead of libstdc++ for clang builds
- removed dependency on Boost.MPL and BOOST_CURRENT_FUNCTION
- ✅ replaced Boost.Test with Catch2, using an internal version of Catch2 by default but allowing the use of an external one
🐛 Bug Fixes:
- 🛠 fixed some incorrect host/device function attributes
- 🛠 fixed warning about comparison unsigned < 0
- There is no need to disable all other backends manually when using ALPAKA_ACC_GPU_CUDA_ONLY_MODE anymore
- 🛠 fixed static block shared memory of types with alignment higher than defaultAlignment
- 🛠 fixed race-condition in HIP/NVCC queue
- 🛠 fixed data races when a GPU updates host memory by always aligning host memory buffers to 4 KiB
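
The 4 KiB alignment fix in the last item above can be illustrated with a minimal sketch. This is not alpaka's implementation: the names `kHostBufAlignment`, `allocAlignedHost` and `isAligned` are hypothetical, and the sketch assumes a C++17 standard library that provides `std::aligned_alloc` (MSVC does not).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical names for illustration only; none of this is alpaka API.
constexpr std::size_t kHostBufAlignment = 4096; // 4 KiB

// Deleter so that std::unique_ptr releases the buffer with std::free.
struct FreeDeleter
{
    void operator()(void* p) const noexcept { std::free(p); }
};

using AlignedBytes = std::unique_ptr<std::byte[], FreeDeleter>;

// Allocates at least `bytes` bytes starting on a 4 KiB boundary.
// std::aligned_alloc requires the size to be a multiple of the alignment,
// so the request is rounded up (and a zero-size request gets one full block).
inline AlignedBytes allocAlignedHost(std::size_t bytes)
{
    std::size_t rounded
        = ((bytes + kHostBufAlignment - 1) / kHostBufAlignment) * kHostBufAlignment;
    if(rounded == 0)
        rounded = kHostBufAlignment;
    void* raw = std::aligned_alloc(kHostBufAlignment, rounded);
    return AlignedBytes(static_cast<std::byte*>(raw));
}

// True if the pointer sits on a 4 KiB boundary.
inline bool isAligned(void const* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % kHostBufAlignment == 0;
}
```

Because each buffer starts on its own 4 KiB boundary and is padded to a multiple of 4 KiB, two buffers allocated this way never share a 4 KiB block.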
🆕 New Features:
- ➕ Added a new alpaka Logo!
- the whole alpaka code has been relicensed to MPL2 and the examples to ISC
- added ALPAKA_CXX_STANDARD CMake option which allows selecting the C++ standard to be used
- added ALPAKA_CUDA_NVCC_SEPARABLE_COMPILATION option to enable separable compilation for nvcc
- added ALPAKA_CUDA_NVCC_EXPT_EXTENDED_LAMBDA and ALPAKA_CUDA_NVCC_EXPT_RELAXED_CONSTEXPR CMake options to enable/disable those nvcc options (they were always ON before)
- ➕ added headers for standalone usage without CMake (alpaka/standalone/GpuCudaRt.h, ...) which set the backend defines
- ➕ added experimental HIP back-end using nvcc (HIP >= 1.5.1 required, latest rocRand). More on HIP setup: doc/markdown/user/implementation/mapping/HIP.md
- ➕ added sincos math function implementations
- 👍 allowed copy and move construction of ViewPlainPtr
- ➕ added support for CUDA atomics using "unsigned long int"
- ➕ added compile-time error for atomic CUDA ops which are not available due to sm restrictions
- ➕ added explicit errors for unsupported types/operations for CUDA atomics
- replaced usages of assert with ALPAKA_ASSERT
- 👌 replaced BOOST_VERIFY by ALPAKA_CHECK and returned success from all test kernels
- added alpaka::ignore_unused as replacement for boost::ignore_unused
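
For readers unfamiliar with the `ignore_unused` idiom mentioned in the last item, here is a minimal sketch of how such a helper is commonly written. It lives in a hypothetical `sketch` namespace and is not copied from alpaka; the actual `alpaka::ignore_unused` may differ in signature and qualifiers. The function `scaled` is likewise invented for illustration.

```cpp
namespace sketch
{
    // Takes any number of arguments by const reference and does nothing.
    // "Using" a variable this way silences unused-variable/parameter warnings.
    template<typename... Ts>
    constexpr void ignore_unused(Ts const&...) noexcept
    {
    }
} // namespace sketch

// Hypothetical function whose second parameter is only needed in some builds.
inline int scaled(int value, int debugLevel)
{
    sketch::ignore_unused(debugLevel);
    return value * 2;
}
```

Code that previously called `boost::ignore_unused(x)` would call the alpaka replacement in the same position.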
💥 Breaking changes:
- renamed Queue*Async to Queue*NonBlocking and Queue*Sync to Queue*Blocking
- 📇 renamed alpaka::size::Size to alpaka::idx::Idx, alpaka::size::SizeType to alpaka::idx::IdxType (and TSize to TIdx internally)
- replaced ALPAKA_FN_ACC_NO_CUDA by ALPAKA_FN_HOST
- replaced ALPAKA_FN_ACC_CUDA_ONLY by direct usage of __device__
- renamed ALPAKA_STATIC_DEV_MEM_CONSTANT to ALPAKA_STATIC_ACC_MEM_CONSTANT and ALPAKA_STATIC_DEV_MEM_GLOBAL to ALPAKA_STATIC_ACC_MEM_GLOBAL
- 📇 renamed alpaka::kernel::createTaskExec to alpaka::kernel::createTaskKernel
- QueueCpuSync now correctly blocks when called from multiple threads
** This broke some previous use-cases (e.g. usage within existing OpenMP parallel regions)
** This use case can now be handled with the support for external CPU queues, as can be seen in the example QueueCpuOmp2CollectiveImpl
- previously it was possible to have kernels return values even though they were always ignored. Now kernels are checked to always return void
- renamed all files with *Stl suffix to *StdLib
- renamed BOOST_ARCH_CUDA_DEVICE to BOOST_ARCH_PTX
- executors have been renamed due to the upcoming standard C++ feature with a different meaning. All files within alpaka/exec/ have been moved to alpaka/kernel/ and the files and classes have been renamed from Exec* to TaskKernel*. This should not affect users of alpaka but will affect extensions.
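
The new rule that kernels must return void can be pictured with a compile-time check like the sketch below. It is an assumption about how such a check could look, not alpaka's actual code: `sketch::returnsVoid`, `sketch::launch`, `GoodKernel` and `BadKernel` are hypothetical, and the accelerator argument that real alpaka kernels receive is omitted for brevity. It requires C++17 for `std::invoke_result_t`.

```cpp
#include <type_traits>
#include <utility>

namespace sketch
{
    // True if calling TKernel with TArgs... yields void.
    template<typename TKernel, typename... TArgs>
    constexpr bool returnsVoid
        = std::is_void<std::invoke_result_t<TKernel, TArgs...>>::value;

    // Hypothetical launcher: rejects kernels with a non-void return type
    // at compile time, then simply calls the kernel on the host.
    template<typename TKernel, typename... TArgs>
    void launch(TKernel const& kernel, TArgs&&... args)
    {
        static_assert(
            returnsVoid<TKernel const&, TArgs...>,
            "kernels must return void");
        kernel(std::forward<TArgs>(args)...);
    }
} // namespace sketch

// Accepted: operator() returns void.
struct GoodKernel
{
    void operator()(int* out) const { *out = 1; }
};

// Rejected by sketch::launch: operator() returns int.
struct BadKernel
{
    int operator()(int*) const { return 0; }
};
```

Passing `BadKernel{}` to `sketch::launch` fails to compile with the message from the `static_assert`, which mirrors the behaviour described in the list: a return value that used to be silently ignored now becomes a build error, so affected kernels need their return type changed to void.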