Changelog History
All Versions: 10 · Avg Release Cycle: 183 days · Latest Release: 1827 days ago
-
v0.4.0 Changes
January 14, 2020
Compatibility Changes:
- ➕ added support for CUDA 10.0, 10.1 and 10.2
- ⬇️ dropped support for CUDA 7.0 and 7.5
- ➕ added official support for Visual Studio 2017 on Windows with CUDA 10 (now built on Travis CI instead of AppVeyor)
- ➕ added support for xcode10.2-11.3 (no official CUDA support yet)
- ➕ added support for Ubuntu 18.04
- ➕ added support for gcc 9
- ➕ added support for clang 7.0, 8.0 and 9.0
- ⬇️ dropped support for clang 3.5, 3.6, 3.7, 3.8 and 3.9
- ➕ added support for CMake 3.13, 3.14, 3.15 and 3.16
- ⬇️ dropped support for CMake 3.11.3 and lower; 3.11.4 is now the lowest supported version
- ➕ added support for Boost 1.69, 1.70 and 1.71
- ➕ added support for using libc++ instead of libstdc++ for clang builds
- removed the dependency on Boost.MPL and BOOST_CURRENT_FUNCTION
- ✅ replaced Boost.Test with Catch2, using an internal version of Catch2 by default but allowing an external one to be used
🐛 Bug Fixes:
- 🛠 fixed some incorrect host/device function attributes
- 🛠 fixed warning about comparison unsigned < 0
- there is no need to disable all other backends manually when using `ALPAKA_ACC_GPU_CUDA_ONLY_MODE` anymore
- 🛠 fixed static block shared memory of types with alignment higher than defaultAlignment
- 🛠 fixed race-condition in HIP/NVCC queue
- 🛠 fixed data races when a GPU updates host memory by always aligning host memory buffers to 4 KiB
🆕 New Features:
- ➕ Added a new alpaka Logo!
- the whole alpaka code has been relicensed to MPL2 and the examples to ISC
- added the `ALPAKA_CXX_STANDARD` CMake option which allows selecting the C++ standard to be used
- added the `ALPAKA_CUDA_NVCC_SEPARABLE_COMPILATION` option to enable separable compilation for nvcc
- added the `ALPAKA_CUDA_NVCC_EXPT_EXTENDED_LAMBDA` and `ALPAKA_CUDA_NVCC_EXPT_RELAXED_CONSTEXPR` CMake options to enable/disable those nvcc options (they were always ON before)
- ➕ added headers for standalone usage without CMake (`alpaka/standalone/GpuCudaRt.h`, ...) which set the backend defines (see the sketch after this list)
- ➕ added an experimental HIP back-end using nvcc (HIP >= 1.5.1 required, latest rocRand). More on HIP setup: doc/markdown/user/implementation/mapping/HIP.md
- ➕ added `sincos` math function implementations
- 👍 allowed ViewPlainPtr to be copy and move constructed
- ➕ added support for CUDA atomics using "unsigned long int"
- ➕ added a compile-time error for atomic CUDA ops which are not available due to sm restrictions
- ➕ added explicit errors for unsupported types/operations for CUDA atomics
- replaced usages of `assert` with `ALPAKA_ASSERT`
- 👌 replaced `BOOST_VERIFY` with `ALPAKA_CHECK` and returned success from all test kernels
- added `alpaka::ignore_unused` as a replacement for `boost::ignore_unused`
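Below is a minimal sketch of how the standalone header and the new `ALPAKA_ASSERT`/`alpaka::ignore_unused` helpers fit together. Only the names listed above come from this changelog; the `<alpaka/alpaka.hpp>` include path and the kernel functor signature are assumptions about the usual alpaka 0.4 layout.

```cpp
// Minimal sketch, assuming the usual alpaka 0.4 header layout and kernel
// pattern: the standalone header selects the CUDA runtime back-end without
// CMake, and ALPAKA_ASSERT / alpaka::ignore_unused replace assert /
// boost::ignore_unused inside kernels.
#include <alpaka/standalone/GpuCudaRt.h> // sets the back-end defines (no CMake required)
#include <alpaka/alpaka.hpp>             // main alpaka header (assumed path)

#include <cstddef>

struct CheckedKernel
{
    template<typename TAcc>
    ALPAKA_FN_ACC void operator()(TAcc const& acc, int const* data, std::size_t n) const
    {
        alpaka::ignore_unused(acc);     // replaces boost::ignore_unused
        ALPAKA_ASSERT(data != nullptr); // replaces plain assert
        ALPAKA_ASSERT(n > 0u);
        // ... element-wise work over data[0..n) would go here ...
    }
};
```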
💥 Breaking changes:
- renamed `Queue*Async` to `Queue*NonBlocking` and `Queue*Sync` to `Queue*Blocking`
- 📇 renamed `alpaka::size::Size` to `alpaka::idx::Idx` and `alpaka::size::SizeType` to `alpaka::idx::IdxType` (and `TSize` to `TIdx` internally)
- replaced `ALPAKA_FN_ACC_NO_CUDA` with `ALPAKA_FN_HOST` (see the sketch after this list)
- replaced `ALPAKA_FN_ACC_CUDA_ONLY` with direct usage of `__device__`
- renamed `ALPAKA_STATIC_DEV_MEM_CONSTANT` to `ALPAKA_STATIC_ACC_MEM_CONSTANT` and `ALPAKA_STATIC_DEV_MEM_GLOBAL` to `ALPAKA_STATIC_ACC_MEM_GLOBAL`
- 📇 renamed `alpaka::kernel::createTaskExec` to `alpaka::kernel::createTaskKernel`
- QueueCpuSync now correctly blocks when called from multiple threads
  - this broke some previous use cases (e.g. usage within existing OpenMP parallel regions)
  - this use case can now be handled with the support for external CPU queues, as can be seen in the example `QueueCpuOmp2CollectiveImpl`
- previously kernels could return values even though they were always ignored; kernels are now checked to always return void
- renamed all files with the `*Stl` suffix to `*StdLib`
- renamed `BOOST_ARCH_CUDA_DEVICE` to `BOOST_ARCH_PTX`
- executors have been renamed due to the upcoming standard C++ feature with a different meaning. All files within `alpaka/exec/` have been moved to `alpaka/kernel/` and the files and classes have been renamed from `Exec*` to `TaskKernel*`. This should not affect users of alpaka but will affect extensions.
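The following hedged before/after sketch illustrates two of these breaking changes: the host-only annotation and the void-return requirement. Everything beyond the names listed above (e.g. the exact kernel functor signature) is an assumption.

```cpp
// Hedged sketch, assuming the usual alpaka kernel-functor pattern.
#include <alpaka/alpaka.hpp>

// Before 0.4.0 this helper would have been marked ALPAKA_FN_ACC_NO_CUDA;
// that macro is replaced by ALPAKA_FN_HOST.
ALPAKA_FN_HOST inline int hostOnlyHelper(int x)
{
    return x + 1;
}

// Kernels are now checked to return void: a kernel that previously returned
// a (silently ignored) value must change its return type to void.
struct VoidReturnKernel
{
    template<typename TAcc>
    ALPAKA_FN_ACC void operator()(TAcc const& acc) const // must return void
    {
        alpaka::ignore_unused(acc);
    }
};
```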
-
v0.3.5 Changes
November 18, 2018
-
v0.3.4 Changes
October 16, 2018
Compatibility Changes:
- ➕ added support for boost-1.68.0
- ➕ added support for CUDA 10
- 👌 support for glibc < 2.18 (fix missing macros)
- ➕ added checks for available OpenMP versions
🐛 Bug Fixes:
- 🛠 fixed empty(StreamCpuAsync) returning true even though the last task was still in progress
- 🛠 fixed integer overflows in case of int16_t being used as accelerator index type
- 👍 made some throwing destructors non-throwing to support clang 7
- 🛠 fixed broken alpaka::math::min for non-integral types
🆕 New Features:
- ➕ added prepareForAsyncCopy which can be called to enable async copies for a specific buffer (if it is supported)
- 👍 allowed running alpaka OpenMP 2 block-accelerated kernels within an existing parallel region
- ➕ added alpaka::ignore_unused which can be used in kernels
-
v0.3.3 Changes
August 10, 2018
-
v0.3.2 Changes
October 16, 2018
🐛 Bug fix and compatibility release for version 0.3.1.
🆕 New Features:
- ✨ Enhanced the compiler compatibility checks within the CMake scripts
🐛 Bugs Fixed:
- 🛠 fixed the error for a wrong OpenMP thread count being used by the runtime not being triggered outside of debug mode
- 🛠 fixed CUDA driver API error handling
- 🛠 fixed CUDA memcpy and memset for zero sized buffers (division by zero)
- 🛠 fixed OpenMP 4 execution
- 🛠 fixed the VS2017 CUDA build (not officially supported)
- 🛠 fixed CUDA callback execution not waiting for the task to finish executing
- 🛠 fixed the cudaOnly test being part of `make test` when CUDA-only mode is not enabled
Compatibility Changes:
- ➕ added support for CUDA 9.2
-
v0.3.1 Changes
June 11, 2018
🐛 Bug fix release for version 0.3.0.
🆕 New Features:
- 🏗 CMake: added the `BUILD_TESTING` option to control tests
- CMake: unified requirement of CMake 3.7.0+
- CMake: used targets for Boost dependencies
- CMake: made alpaka a pure interface library
🐛 Bugs Fixed:
- 🛠 fixed getDevCount documentation
- 🛠 fixed undefined define warnings
- 🛠 fixed self containing header check for CUDA
Compatibility Changes:
- n/a
-
v0.3.0 Changes
March 15, 2018
🚀 This is the third release of alpaka.
🐛 Bugs Fixed:
- 🛠 fixed multiple bugs where CPU streams/events could deadlock or behaved differently than the native CUDA events
- 🛠 fixed a bug where the block synchronization of the Boost.Fiber backend crashed due to uninitialized variables
🆕 New Features / Enhancements:
- ➕ added support for stream callbacks, allowing arbitrary host code to be enqueued using `alpaka::stream::enqueue(stream, [&](){...});` (see the sketch after this list)
- added support for compiling for multiple architectures using e.g. `ALPAKA_CUDA_ARCH="20;35"`
- added support for using `__host__ constexpr` code within `__device__` code
- ✨ enhanced the CUDA error handling
- ✨ enhanced the documentation for mapping CUDA to alpaka
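A brief sketch of the stream-callback feature follows. Only the `alpaka::stream::enqueue(stream, [&](){...});` call is taken from this entry; the surrounding function template (and whatever concrete stream type is used) is an assumption.

```cpp
// Hedged sketch: enqueue arbitrary host code as a stream callback.
// TStream stands for any already-created alpaka stream; only the enqueue
// call itself is taken from the changelog.
#include <alpaka/alpaka.hpp>

#include <iostream>

template<typename TStream>
void notifyWhenDone(TStream& stream)
{
    // The lambda runs on the host once all previously enqueued work
    // in the stream has completed.
    alpaka::stream::enqueue(stream, [&]() {
        std::cout << "stream reached the callback\n";
    });
}
```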
Compatibility Changes:
- ➕ added support for CUDA 9.0 and 9.1
- ➕ added support for CMake 3.9 and 3.10
- ✂ removed support for CMake 3.6 and older
- ➕ added support for boost-1.65.0
- ✂ removed support for boost-1.61.0 and older
- ➕ added support for gcc 7
- ➕ added support for clang 4 and 5
- ✂ removed support for VS2015
-
v0.2.0 Changes
June 19, 2017
🚀 This is the second release of alpaka. It contains no major changes, only compatibility fixes and small enhancements:
- 📚 the documentation has been greatly enhanced
- ➕ adds support for CUDA 8.0
- ➕ adds support for CMake versions 3.6, 3.7 and 3.8
- ➕ adds support for Boost 1.62, 1.63 and 1.64
- ➕ adds support for clang-3.9
- ➕ adds support for Visual Studio 2017
- alpaka now compiles cleanly even with clang `-Weverything`
- 🚀 re-enabled the `boost::fiber` accelerator backend which was disabled in the last release
API changes:
- 🚚 `mapIdx` moved from namespace `alpaka::core` to `alpaka::idx`
- 🚚 `Vec` moved from namespace `alpaka` to `alpaka::vec`
- `vec::Vec` is now allowed to be zero-dimensional (was previously forbidden)
- ➕ added `vec::concat`
- ➕ added element-wise `operator<` for `vec::Vec` which returns a vector of bool
- 👍 CPU accelerators now support arbitrary dimensionality (both kernel execution as well as memory operations)
- ➕ added support for `syncBlockThreadsPredicate` with `block::sync::op::LogicalOr`, `block::sync::op::LogicalAnd` and `block::sync::op::Count` (see the sketch after this list)
- 🚀 memory allocations are now aligned optimally for the underlying architecture (16 bytes for SSE, 32 bytes for AVX, 64 bytes for AVX-512) instead of 16 bytes for all architectures as in the previous release
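The sketch below shows one way `syncBlockThreadsPredicate` could be used with the `Count` operation. Only the function name and the op tags come from this entry; the namespace layout, the acc-first kernel signature and the `getIdx` calls are assumptions about the 0.2.0 API and may need adjusting.

```cpp
// Hedged sketch: count, per block, how many threads see a positive value.
#include <alpaka/alpaka.hpp>

struct BlockVoteKernel
{
    template<typename TAcc>
    ALPAKA_FN_ACC void operator()(TAcc const& acc, float const* data, int* blockVotes) const
    {
        // Linearized indices (assumed API for obtaining them).
        auto const gridThreadIdx = alpaka::idx::getIdx<alpaka::Grid, alpaka::Threads>(acc)[0u];
        auto const blockIdx = alpaka::idx::getIdx<alpaka::Grid, alpaka::Blocks>(acc)[0u];
        auto const threadIdxInBlock = alpaka::idx::getIdx<alpaka::Block, alpaka::Threads>(acc)[0u];

        int const predicate = (data[gridThreadIdx] > 0.0f) ? 1 : 0;

        // Synchronizes all threads of the block and counts how many of them
        // passed a non-zero predicate; every thread receives the same count.
        auto const numPositive =
            alpaka::block::sync::syncBlockThreadsPredicate<alpaka::block::sync::op::Count>(acc, predicate);

        // One thread per block records the result.
        if(threadIdxInBlock == 0u)
            blockVotes[blockIdx] = numPositive;
    }
};
```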
Thanks to all contributors!
-
v0.1.0 Changes
July 12, 2016