Extending the CUDA API wrappers to the driver and NVRTC - testing phase

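tl;dr: My CUDA API wrappers now also cover the Driver API and NVRTC, on a development branch; I'm looking for people to try them out and give feedback.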


Hello all,

Your friendly neighborhood API wrapper here. You may already know about my CUDA Runtime API wrappers library, announced here on the forum a few years back, which has by now gained a bit of popularity.

Well, I’ve also been working on wrappers for functionality that is only available in the CUDA Driver API and in the NVRTC library, including wrappers for:

  • Contexts
  • Modules
  • Link processes
  • Dynamically-compiled kernels
  • Dynamically-compiled programs
  • Compilation options
  • Linking options
  • The primary context
  • Virtual memory allocations

and a few others.
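For anyone who hasn't used the existing runtime wrappers, here is a minimal sketch of the general idea behind wrapping a C-style handle (a context, say) in C++11 RAII. Note the assumptions: `fake_ctx_create`, `fake_ctx_destroy` and `context_wrapper` are made-up names for illustration, standing in for a C API; they are neither real CUDA calls nor the library's actual interface.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// --- Hypothetical stand-in for a C-style handle API (NOT real CUDA calls) ---
typedef int fake_ctx_t;
static int fake_live_contexts = 0;  // number of handles currently alive
static int fake_next_handle = 1;    // next handle value to hand out

// Returns 0 on success, mimicking a C-style status code.
inline int fake_ctx_create(fake_ctx_t* out) {
    *out = fake_next_handle++;
    ++fake_live_contexts;
    return 0;
}

inline int fake_ctx_destroy(fake_ctx_t) {
    --fake_live_contexts;
    return 0;
}

// --- Illustrative C++11 RAII wrapper (hypothetical, not the library's class) ---
class context_wrapper {
public:
    // Acquire the handle; turn a failure status into an exception.
    context_wrapper() : handle_(0), owning_(false) {
        if (fake_ctx_create(&handle_) != 0) {
            throw std::runtime_error("context creation failed");
        }
        owning_ = true;
    }

    // Release the handle only if this object still owns it.
    ~context_wrapper() {
        if (owning_) { fake_ctx_destroy(handle_); }
    }

    // Copying would lead to a double destroy, so it is disabled.
    context_wrapper(const context_wrapper&) = delete;
    context_wrapper& operator=(const context_wrapper&) = delete;

    // Moving transfers ownership and leaves the source non-owning.
    context_wrapper(context_wrapper&& other) noexcept
        : handle_(other.handle_), owning_(other.owning_) {
        other.owning_ = false;
    }

    context_wrapper& operator=(context_wrapper&& other) noexcept {
        if (this != &other) {
            if (owning_) { fake_ctx_destroy(handle_); }
            handle_ = other.handle_;
            owning_ = other.owning_;
            other.owning_ = false;
        }
        return *this;
    }

    fake_ctx_t handle() const { return handle_; }

private:
    fake_ctx_t handle_;
    bool owning_;
};

// Usage: the handle is released automatically when the last owner leaves scope.
inline void usage_example() {
    {
        context_wrapper a;
        assert(fake_live_contexts == 1);
        context_wrapper b(std::move(a));  // ownership moves from a to b
        assert(fake_live_contexts == 1);  // still exactly one live handle
    }
    assert(fake_live_contexts == 0);      // destroyed exactly once
}
```

The point of the pattern is that creation failures become exceptions and destruction can't be forgotten or done twice; the same shape would apply to modules, link processes and the other items listed above, though the real classes may well differ.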

But I don’t want to just replicate the bifurcation we see in the usual APIs (or trifurcation, if you count the NVRTC library for compiling CUDA code at run-time). So this expansion is a single, unified API supporting all functionality available via the driver, the runtime and NVRTC (while staying simple and relatively straightforward!), which you can still use purely as wrappers for the Runtime API without worrying about the other stuff.

Now, the coding phase is mostly done, but there is very little coverage by tests or example programs at this point, and the documentation has not been updated yet. So these extended wrappers are not yet packaged as a release; they are just on a branch of their own.

This is where you (might) come in: I want to invite those of you who write CUDA host-side code directly, whether in an application or in some library or infrastructural layer, to try it out; to consider the design; to give feedback and to report any issues.


PS - The wrappers are written in C++11 to maximize compatibility. At some point I’ll probably move them forward to C++14, and that’s another thing I’m interested in feedback about: is it time, in 2020, to drop C++11 in favor of C++14?