I would like to announce an initial release of a new device-side C++11 library I’ve written:
The CUDA Kernel Author’s Toolkit
It’s a header-only library: a loosely-coupled collection of useful functions and classes for writing device-side CUDA code (kernels and non-kernel functions). It’s the result of having repeatedly found myself rewriting the same small bits of code, or copy-pasting files or snippets from one project to another, even though they had nothing to do with the specific project, nor even with the application domain. So, I sat down to round them out into something more respectable and robust which others could also use.
The facilities in this library let us…
- Make our device-side code less cryptic and idiosyncratic, with clearer naming and semantics.
- Not repeat ourselves as much - the DRY principle.
- Write templated device-side code without constantly coming up against not-trivially-templatable bits in CUDA.
- Use standard-library(-like) containers in device-side code (but not have to use them).
- Use fewer magic numbers.
… while not committing to any particular framework, paradigm or class hierarchy.
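To give a flavor of the kind of small, repeatedly-rewritten snippets meant above, here is a minimal sketch. It is not taken from the library and does not show its actual API: the namespace `sketch`, the macro `SKETCH_HD` and all function names are made up for illustration. It only assumes C++11, and is written so that it compiles both with nvcc and with a plain host compiler.

```cpp
// Hypothetical sketch: every name here is made up for illustration and is
// NOT the library's actual API.

// Let the same helpers compile as plain C++11 on the host and as CUDA code.
#ifdef __CUDACC__
#define SKETCH_HD __host__ __device__
#else
#define SKETCH_HD
#endif

namespace sketch {

// A named constant instead of a bare 32 scattered through kernels.
// (32 is the warp size on all NVIDIA GPUs released so far.)
constexpr unsigned warp_size = 32;

// Ceiling division for non-negative integers, e.g. for computing how many
// blocks are needed to cover n elements.
// Caveat: dividend + divisor - 1 must not overflow the type I.
template <typename I>
SKETCH_HD constexpr I div_rounding_up(I dividend, I divisor)
{
    return (dividend + divisor - 1) / divisor;
}

// The smallest multiple of `factor` which is no smaller than x.
template <typename I>
SKETCH_HD constexpr I round_up_to_multiple(I x, I factor)
{
    return div_rounding_up(x, factor) * factor;
}

// Index of a thread's warp within its block, given the thread's linear
// index within the block.
SKETCH_HD constexpr unsigned warp_index(unsigned thread_index_in_block)
{
    return thread_index_in_block / warp_size;
}

// Index of a thread's lane within its warp, given the same linear index.
SKETCH_HD constexpr unsigned lane_index(unsigned thread_index_in_block)
{
    return thread_index_in_block % warp_size;
}

} // namespace sketch
```

With helpers like these, a launch configuration can say `sketch::div_rounding_up(n, block_size)` rather than spelling out `(n + block_size - 1) / block_size` yet again, and device code can refer to `sketch::warp_size` rather than a literal 32.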
Requirements:
- CUDA 8.0 or later.
- Compilation with --std=c++11 or a later standard. (Caveat: Not tested with LLVM’s CUDA support.)
- A Linux, Mac or Windows operating system (i.e. if CUDA is usable, then so should this library be).
- You can just copy the headers as-is, but if you want a “proper installation” then you’ll need CMake 3.8.2 and your OS’ build tool.
- Optional: A recent version of the strf library for on-device streams.
Links:
- The current and earlier library releases - for the download links.
- Frequently-Asked Questions about the library.
- Explanatory/motivational wiki pages regarding:
  - Facilities improving support for templated kernel authoring.
  - More readable and less error-prone grid, block, warp and thread information utilities.
I am an individual independent developer (well, in this context), so I rely to some extent on your - the community’s - support. I got some, including quite a bit of useful feedback, after announcing my cuda-api-wrappers library here a few years back, so I encourage you to comment or ask questions here, open issues, write to me directly, and try the library out.