CUDA on Linux. I have a CUDA file with a very large function. Compiling that CUDA code dominates my build time when I run “make -j 8”, which lets make run 8 g++ compiles at a time for the rest of my program. I set up nvcc with the standard flags
to support multiple GPU architectures. But this results in a single nvcc invocation, which then generates the code for these four architectures one at a time. Is there any way to tell nvcc to run these four code generations in parallel, given that I have enough CPU cores available?