
Changelog History

  • v0.4 Changes

    October 14, 2020

    Main changes since 0.3.3:

    • The runtime API wrappers are now a header-only library.
    • Split the NVTX wrappers and the Runtime API wrappers into two separate libraries.
    • Added several fundamental types which were implicit in previous versions: cuda::size_t, cuda::dimensionality_t.

    Minor API tweaks:

    • Renamed launch -> enqueue_launch
    • Can now schedule managed memory region attachment on streams
    • Now wrapping cudaMemAdvise() advice.
    • Array copying uses typed pointers
    • Added: A cuda::managed::device_side_pointer_for() standalone function
    • Added: A container facade for the sequence of all devices, so you can now write for (auto device : cuda::devices() ) { }.
    • De-templatized: device setter RAII class
    • Added: a freestanding cuda::synchronize() function instead of some wrapper methods
    • Moved some type definitions from inside device_t to the device:: namespace
    • Added: A subclass of memory::region_t for managed memory
    • Using memory::region_t in more API functions
    • Dropped cuda::kernel::maximum_dynamic_shared_memory_per_block().
    • Centralized the definitions of take_ownership and do_not_take_ownership
    • Made stream_t& parameters into const stream_t&, almost universally.
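
    The container facade mentioned above can be pictured with a small host-only sketch. This is not the library's implementation: the sketch namespace, the device struct, the devices_facade class and the count parameter are all hypothetical stand-ins (a real wrapper would presumably obtain the count from the CUDA runtime). It only shows how a lightweight object with begin() and end() makes a range-based for loop over devices possible without storing any device objects.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a device wrapper; the real device_t carries much more.
struct device {
    int id;
};

// Presents the IDs 0..count-1 as an iterable sequence of devices,
// creating each device object on the fly when the iterator is dereferenced.
class devices_facade {
public:
    class iterator {
    public:
        explicit iterator(int id) : id_(id) {}
        device operator*() const { return device{id_}; }
        iterator& operator++() { ++id_; return *this; }
        bool operator!=(const iterator& other) const { return id_ != other.id_; }
    private:
        int id_;
    };

    explicit devices_facade(int count) : count_(count) {}
    iterator begin() const { return iterator(0); }
    iterator end() const { return iterator(count_); }
    int size() const { return count_; }

private:
    int count_;
};

// Assumption: the device count is passed in here instead of being queried.
inline devices_facade devices(int count) { return devices_facade(count); }

// Collects the IDs visited by a range-based for loop over the facade.
inline std::vector<int> visited_ids(int count) {
    std::vector<int> ids;
    for (auto dev : devices(count)) {
        ids.push_back(dev.id);
    }
    assert(static_cast<int>(ids.size()) == count);
    return ids;
}

} // namespace sketch
```

    With this sketch, sketch::visited_ids(3) visits the devices with IDs 0, 1 and 2, in that order.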

    Bug fixes:

    • Cross-device waiting on events
    • Error message fixes
    • Not assuming the uintNN_t types are in the default namespace

    Build, compatibility, usability:

    • Fix support for CMake 3.8 (CMakeLists.txt was using some post-3.8 features)
    • Clang-related:
      • Skipping examples which clang++ doesn't support yet (need
      • Only enabling separable compilation and CUDA
      • const-cast'ing const void * kernel function pointers before reinterpretation - clang won't allow it otherwise
      • GNU extension dropped when compiling examples with CUDA (clang doesn't support this)
      • Fixed std::max() call issue
    • CMake targets depending on the wrappers should now have a C++11 language standard requirement for compilation
    • The wrappers now assert C++11 or later is used, instead of letting you just fail somewhere.
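
    A language-standard check of the kind described in the last item might look like the sketch below. It is an assumption about the general technique, not the wrappers' actual code; the function name sketch_have_cxx11 is hypothetical.

```cpp
// Fail early, with a readable message, when the compiler is not in C++11 mode
// or later. Caveat: some compilers report an older __cplusplus value unless
// configured otherwise (e.g. MSVC without /Zc:__cplusplus), so a real check
// may need compiler-specific conditions.
#if __cplusplus < 201103L
#error "This header requires C++11 or later"
#endif

// Hypothetical helper: only reached when the check above has passed,
// so using constexpr (itself a C++11 feature) is safe here.
constexpr bool sketch_have_cxx11() { return __cplusplus >= 201103L; }
```

    Placing the check at the top of a header makes the failure appear at the include site, with a clear message, rather than deep inside template code.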
  • v0.4.rc1

    October 14, 2020
  • v0.4.rc

    October 14, 2020
  • v0.3.3 Changes

    July 20, 2020

    This release includes both significant additions to the wrappers' coverage and major changes to the existing wrappers API.

    Main changes since 0.2.0:

    • Forget about numeric handles! The wrapper classes no longer take numeric handles as parameters, in methods exposed to the user. You'll be dealing with device_t's, event_t's, stream_t's etc. - not device::id_t, device::stream_t and device::event_t's.
    • Wrapper classes are no longer templated. That means, on one hand, you don't have to worry about the template argument of "do we assume the wrapper's device is the current one?"; but on the other hand, every use of the wrapper will set the current device (even if it's already the right one). A lot of code was simplified or even removed thanks to this change.
    • device_function_t is now named kernel_t, as only kernels are acceptable by the CUDA Runtime API calls mentioning "device functions". Also, kernel_t's are now a pair of (kernel, device), as the settings which can be made for a kernel are mostly/entirely device-specific.
    • The examples CMakeLists.txt has been split off from the main CMakeLists.txt and moved into a subdirectory, removing any dependencies it may have.
    • Kernel launching now uses perfect forwarding of all parameters.
    • The library is now almost completely header-only. The single exception to this rule is profiling-related code. If you don't use it - the library is header-only for you.
    • Changed my email address in the code...
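
    The perfect-forwarding item above can be illustrated with a host-only sketch. Everything here is hypothetical: sketch::enqueue_launch merely borrows the name of the wrapper function and calls an ordinary callable instead of launching a kernel, and the probe type exists only to show that lvalues and rvalues reach the callee with their value category intact.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical host-side stand-in for a launch function: every argument is
// forwarded to the callable with its value category preserved.
template <typename Callable, typename... Args>
void enqueue_launch(Callable&& callable, Args&&... args)
{
    std::forward<Callable>(callable)(std::forward<Args>(args)...);
}

// Records whether it was handed an lvalue or an rvalue.
struct probe {
    std::string last;
    void operator()(const std::string&) { last = "lvalue"; }
    void operator()(std::string&&)      { last = "rvalue"; }
};

// Passes a named variable, which should arrive as an lvalue.
inline std::string category_of_lvalue() {
    probe p;
    std::string s = "data";
    enqueue_launch(p, s);
    return p.last;
}

// Passes a temporary, which should arrive as an rvalue.
inline std::string category_of_rvalue() {
    probe p;
    enqueue_launch(p, std::string("data"));
    return p.last;
}

} // namespace sketch
```

    Without std::forward, both calls would select the const std::string& overload, because a named function parameter is always an lvalue inside the function body.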

    Main additions since 0.2.0:

    • 2D and 3D Array support.
    • 2D and 3D texture support.
    • A single set() and get() for all memory spaces.

    Plus a few bug fixes, and another example program from the CUDA samples.

    Changes from 0.3.0:

    • Fixed: Self-recursion in one of the memory allocation functions.
    • Fixed: Added missing inline specifiers to some functions
    • White space tweaks
  • v0.3.2

    June 28, 2020
  • v0.3.1

    June 11, 2020
  • v0.3.0

    June 08, 2020