tl;dr: For your consideration.
Your friendly neighborhood API wrapper here. You may already know about my CUDA Runtime API wrappers library, announced here on the forum a few years back, which has by now gained a bit of popularity.
Well, I’ve also been working on wrappers for the functionality only available in the CUDA Driver API and in the NVRTC library, including wrappers for:
- Link processes
- Dynamically-compiled kernels
- Dynamically-compiled programs
- Compilation options
- Linking options
- The primary context
- Virtual memory allocations
and a few others.
But - I don’t want to just replicate the bifurcation we see in the usual APIs (or trifurcation, if you count the NVRTC library for compiling CUDA code at run-time). So this expansion is a single, unified API supporting all functionality available via the driver, the runtime and NVRTC (while staying simple and relatively straightforward!), and you can still use it as just wrappers for the Runtime API without worrying about the other stuff.
Now, the coding phase is mostly done. But - there is very little coverage by tests or example programs at this point, and the documentation has not been updated yet. So these expanded wrappers are not packaged as a release yet, and just live on a branch of their own.
This is where you (might) come in: I want to invite those of you who write CUDA host-side code directly, whether in an application or in some library or infrastructural layer, to try it out; to consider the design; to give feedback and to report any issues.
PS - The wrappers are written in C++11 to maximize compatibility. At some point I’ll probably move them forward to C++14, and that’s another thing I’m interested in feedback about: is it time, in 2020, to drop C++11 in favor of C++14?