Hello fellow CUDA’ers,
Changes since 0.6.3, announced in May:
- Support for registering read-only memory regions
- Improved sanity/validity checks in debug mode
- Using an extra opportunity to check for errors after kernel launches
- Build system tweaks
- Avoiding some warnings with newer compilers
- CMake versions 3.25-3.28 (and hopefully later) supported
- Windows: Building with the profiling header, use of cooperative groups
- Some minor bug fixes
As usual, and since I’m the sole developer, user feedback is very much appreciated: file an issue, or send a simple email with questions, suggestions, or just your experience of using the wrappers on some project.
In particular, I am almost done with my work on wrapping the CUDA Execution Graph API, which is as yet unreleased; that is the part that needs beta-testing and feedback the most.
Also, I have some design dilemmas I’m pondering on the issues page, so feel free to chime in about those.