Changelog History
v1.4.1 Changes
May 04, 2017

A bug fix release.

- Improvements for the CMake scripts.
- Bug fixes.
v1.4.0 Changes
April 19, 2017

- Modernized the CMake build system. Provide `VexCL::OpenCL`, `VexCL::Compute`, `VexCL::CUDA`, and `VexCL::JIT` imported targets, so that users may just write

  ```cmake
  add_executable(myprogram myprogram.cpp)
  target_link_libraries(myprogram VexCL::OpenCL)
  ```

  to build a program using the corresponding VexCL backend. Also stopped polluting the global CMake namespace with calls like `add_definitions()`, `include_directories()`, etc. See http://vexcl.readthedocs.io/en/latest/cmake.html.
- Made `vex::backend::kernel::config()` return a reference to the kernel, so that it is possible to configure and launch a kernel in a single line: `K.config(nblocks, nthreads)(queue, prm1, prm2, prm3);`.
- Implemented the `vector<T>::reinterpret<U>()` method. It returns a new vector that reinterprets the same data (no copies are made) as the new type.
- Implemented a new backend: JIT. The backend generates and compiles C++ kernels with OpenMP support at runtime. The generated code will not be more efficient than hand-written OpenMP code, but it makes it easy to debug the generated code with a host-side debugger. The backend may also be used to develop and test new code when the other backends are not available.
- Allow `VEX_CONSTANT`s to be cast to their values in host code, so that a constant defined with `VEX_CONSTANT(name, expr)` can be used in host code as `name`. Constants are still usable in vector expressions as `name()`.
- Allow passing generated kernel arguments for each GPU (#202). Kernel arguments packed into an `std::vector` are unpacked and passed to the generated kernels on the respective devices.
- Reimplemented `vex::SpMat` as `vex::sparse::ell`, `vex::sparse::crs`, `vex::sparse::matrix` (automatically chooses one of the two formats based on the current compute device), and `vex::sparse::distributed<format>` (this one may span several compute devices). The new matrix-vector products are normal vector expressions, while the old `vex::SpMat` could only be used in additive expressions. The old implementation is still available. `vex::sparse::ell` is now converted from the host-side CRS format on the compute device, which makes the conversion faster.
- Bug fixes and minor improvements.
v1.3.3 Changes
April 06, 2015

- Added the `vex::tensordot()` operation. Given two tensors (arrays of dimension greater than or equal to one), A and B, and a list of axis pairs (where each pair represents corresponding axes from the two tensors), it sums the products of A's and B's elements over the given axes. Inspired by Python's numpy.tensordot operation.
- Exposed the constant memory space in the OpenCL backend.
- Provided shortcut filters `vex::Filter::{CPU,GPU,Accelerator}` for the OpenCL backend.
- Added a Boost.Compute backend. The core functionality of the Boost.Compute library is used as a replacement for the Khronos C++ API, which seems to be getting more and more outdated. The Boost.Compute backend is still based on OpenCL, so there are two OpenCL backends now. Define `VEXCL_BACKEND_COMPUTE` to use this backend, and make sure the Boost.Compute headers are in the include path.
v1.3.2 Changes
September 04, 2014

- Improved thread safety.
- Implemented any_of and all_of primitives.
- Minor bugfixes and improvements.
v1.3.1 Changes
May 14, 2014

- Adopted the `scan_by_key` algorithm from HSA-Libraries/Bolt.
- Minor improvements and bug fixes.
v1.3.0 Changes
April 14, 2014

- API breaking change: the `vex::purge_kernel_caches()` family of functions is renamed to `vex::purge_caches()`, as the online cache may now hold objects of arbitrary type. The overloads that used to take `vex::backend::kernel_cache_key` now take `const vex::backend::command_queue&`.
- The online cache is now purged whenever a `vex::Context` is destroyed. This allows for clean release of OpenCL/CUDA contexts.
- Code for random number generators has been unified between the OpenCL and CUDA backends.
- The Fast Fourier Transform is now supported on both the OpenCL and CUDA backends.
- The `vex::backend::kernel` constructor now takes an optional parameter with command line options.
- Performance of the CLOGS algorithms has been improved.
- The VEX_BUILTIN_FUNCTION macro has been made public.
- Minor bug fixes and improvements.
v1.2.0 Changes
April 02, 2014

- API breaking change: the definition of the `VEX_FUNCTION` family of macros has changed. The previous versions are available as `VEX_FUNCTION_V1`.
- Wrapping code for the clogs library was added by @bmerry (the author of clogs).
- `vector`/`multivector` iterators are now standard-conforming iterators.
- Other minor improvements and bug fixes.
v1.1.2 Changes
December 24, 2013

- `reduce_by_key()` may take several tied keys (see e09d249).
- It is possible to reduce OpenCL vector types (`cl_float2`, `cl_double4`, etc.).
- `VEXCL_SHOW_KERNELS` may be set as an environment variable as well as a preprocessor macro. This allows controlling kernel source output without recompiling the program.
- Added a compute capability filter for the CUDA backend (`vex::Filter::CC(major, minor)`).
- Fixed compilation errors and warnings generated by Visual Studio.
v1.1.1 Changes
December 05, 2013

- Sorting algorithms may take tuples of keys/values (in fact, any Boost.Fusion sequence will do). In this case one has to explicitly specify the comparison functor. Both the host and device variants of the comparison functor should take 2n arguments, where n is the number of keys. The first n arguments correspond to the left set of keys, and the second n arguments correspond to the right set of keys. Here is an example that sorts values by a tuple of two keys:

  ```cpp
  vex::vector<int>    keys1(ctx, n);
  vex::vector<float>  keys2(ctx, n);
  vex::vector<double> vals (ctx, n);

  struct {
      VEX_FUNCTION(device, bool(int, float, int, float),
          "return (prm1 == prm3) ? (prm2 < prm4) : (prm1 < prm3);"
          );
      bool operator()(int a1, float a2, int b1, float b2) const {
          return std::make_tuple(a1, a2) < std::make_tuple(b1, b2);
      }
  } comp;

  vex::sort_by_key(std::tie(keys1, keys2), vals, comp);
  ```
v1.1.0 Changes
November 29, 2013

- The `vex::SpMat<>` class uses the CUSPARSE library on the CUDA backend when the `VEXCL_USE_CUSPARSE` macro is defined. This results in a more efficient sparse matrix-vector product, but disables inlining of the SpMV operation.
- Provided an example of CUDA backend interoperation with Thrust.
- When the `VEXCL_CHECK_SIZES` macro is defined to 1 or 2, runtime checks for vector expression correctness are enabled (see #81, #82).
- Added `sort()` and `sort_by_key()` functions.
- Added `inclusive_scan()` and `exclusive_scan()` functions.
- Added `reduce_by_key()` function. Only works with single-device contexts.
- Added `convert_<type>()` and `as_<type>()` builtin functions for the OpenCL backend.