Development using only runtime CUDA compilation (nvrtc) vs compile time CUDA compilation (nvcc)

timothygiraffe · November 26, 2019, 10:40am

My organization produces engineering simulation software with GPU acceleration. We have recently decided to start using CUDA and we are investigating the best way to integrate it into our build system. As I see it, we have two options:

1. Compile all of our device code at runtime using the nvrtc library. I think this would involve very few modifications to our build and distribution systems other than adding the necessary new libraries. It would also allow optimisations involving dynamic code generation.
2. Introduce nvcc into our build system to allow compile-time compilation of CUDA code. This would involve some larger changes to our build system, but it might be worth it if it brings advantages for development.
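To make the "dynamic code generation" point in option 1 concrete, here is a minimal host-side sketch. It only builds the CUDA source string with a value baked in at runtime, so it needs no GPU or CUDA toolkit to run. The helper name `make_poly_kernel_source` and the kernel `poly_eval` are hypothetical and chosen purely for illustration; in a real nvrtc flow the returned string would then be passed to `nvrtcCreateProgram` and `nvrtcCompileProgram`.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any CUDA API): returns CUDA C++ source
// for a polynomial-evaluation kernel whose term count is fixed when the
// string is generated, so the device compiler sees a constant loop bound.
std::string make_poly_kernel_source(int n_terms) {
    if (n_terms <= 0) {
        throw std::invalid_argument("n_terms must be positive");
    }
    std::ostringstream src;
    src << "#define N_TERMS " << n_terms << "\n"
        << "extern \"C\" __global__ void poly_eval(const float* coeffs,\n"
        << "                                      float* x, int n) {\n"
        << "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        << "    if (i >= n) return;\n"
        << "    float acc = 0.0f;\n"
        << "    #pragma unroll\n"
        << "    for (int k = N_TERMS - 1; k >= 0; --k)\n"  // Horner's rule
        << "        acc = acc * x[i] + coeffs[k];\n"
        << "    x[i] = acc;\n"
        << "}\n";
    return src.str();
}
```

With nvcc alone, the same specialisation would typically be expressed as a template parameter, which means the set of values has to be known when the application is built.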

My question is: what would be the benefits and disadvantages of adding compile-time CUDA compilation (possibly in a mixed approach that uses nvrtc for some things)? My preliminary thoughts are the following, which might be misguided:

- Compile-time CUDA compilation needs less boilerplate code for setting up and running GPU kernels, and allows 'nice' features like being able to mix host and device code in the same file.
- Some template libraries like Thrust seem to be designed to work only with compile-time CUDA compilation.
- The nvrtc library was only introduced with CUDA Toolkit 7.0 in 2015, whereas CUDA itself has existed since 2007. Most of the examples in the CUDA samples do not use nvrtc. Does this suggest that the compile-time method is the 'classic' way to use CUDA and that nvrtc is only meant for add-on situations like dynamic code generation?
- As far as I can tell, all of the debugging/profiling tools that work with compile-time CUDA compilation should also work when using nvrtc, but I just want to be sure of this.

Any thoughts on these issues or other benefits/disadvantages of compile time CUDA compilation would be much appreciated.

Robert_Crovella · November 26, 2019, 3:25pm

You should probably try out both, and learn to use both, before making any decisions about development paths. I would say that nvrtc is noticeably harder to use for many practical examples.

This additional difficulty led to the creation of support systems like jitify:

https://github.com/NVIDIA/jitify

To wit:

"Integrating NVRTC into existing and/or templated CUDA code can be tricky."

Of course, if you need to do runtime compilation, then nvrtc is a sensible choice.

If it were me, I wouldn’t use nvrtc unless I needed to.

timothygiraffe · November 26, 2019, 3:36pm

Thanks for your reply. I should have said that we will almost certainly need to use nvrtc because we want to support compilation of user-defined plugins. The question is more about whether to also add compile-time CUDA compilation and what benefits that would give us.
