How to include SASS code in a fat binary for a new GPU that my old NVCC does not support

I have a deep learning library that I’m building with the old CUDA Toolkit 9.2, and I cannot change the toolkit version I’m using.

I now need to run my CUDA application on a very new GPU. It’s new enough that its architecture is not supported by NVCC from CUDA 9.2. So I end up going through JIT compilation, which takes a really long time.

Is there a workaround that I can use to include the SASS for the new GPU in the fat binary?

Different GPU architectures are not binary compatible, and an old toolchain has no knowledge of a newer architecture's instruction set. That means you cannot generate SASS (machine code) for a new architecture with an old toolchain. The designated workaround is the one you are already using: generate PTX code for the latest version the toolchain supports, and have the compiler backend that comes with the driver package JIT-compile that to SASS. This obviously requires a recent enough driver.
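
As a rough sketch of what that looks like on the command line: the snippet below assumes a source file called kernels.cu (a hypothetical name) and uses compute_70 (Volta) as the PTX target, which CUDA 9.2 accepts; check `nvcc --help` for the full list your installation supports. The first -gencode embeds SASS for sm_70, the second embeds compute_70 PTX, and that PTX is what a newer driver JIT-compiles for an architecture this nvcc has never heard of.

```shell
# Hypothetical source file name; replace with your own.
SRC="kernels.cu"

# SASS for Volta (code=sm_70) plus PTX for compute_70 (code=compute_70).
# Only the PTX entry can be JIT-compiled by the driver for a newer GPU.
GENCODE_FLAGS="-gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70"

# Only attempt the build if nvcc and the source file are actually present.
if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc $GENCODE_FLAGS -c "$SRC" -o kernels.o
    # cuobjdump ships with the toolkit; these list the embedded SASS and PTX entries.
    cuobjdump --list-elf kernels.o
    cuobjdump --list-ptx kernels.o
else
    echo "nvcc or $SRC not found; flags would be: $GENCODE_FLAGS"
fi
```

If the PTX entry is missing from the cuobjdump listing, the binary has nothing for the driver to JIT-compile and the kernels will not load on the newer GPU at all.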

I would recommend the PTX-JIT-compile approach only for a transitional period while deployment of the latest toolchain is in progress. What is the specific reason you cannot upgrade to a newer CUDA version? 9.2 is quite old, from 2018. You could always install multiple CUDA versions and switch between them as needed.
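
Switching between side-by-side installs can be as simple as adjusting two environment variables. This is a minimal sketch, assuming a Linux system where toolkits sit under prefixes such as /usr/local/cuda-11.4 (the default installer location, but an assumption about your machine); use_cuda is a hypothetical helper name, not something the toolkit provides.

```shell
# Hypothetical helper: point the current shell at one CUDA toolkit install.
use_cuda() {
    root="$1"
    if [ ! -d "$root/bin" ]; then
        echo "No CUDA toolkit found at $root" >&2
        return 1
    fi
    # Put this toolkit's nvcc first on PATH and its libraries first for the loader.
    export PATH="$root/bin:$PATH"
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example: switch this shell to a newer toolkit, if it is installed there.
if use_cuda /usr/local/cuda-11.4; then
    nvcc --version
fi
```

Calling use_cuda with the 9.2 prefix in a different shell keeps the old build environment untouched.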

What would happen if I swap only the NVCC in CUDA 9.2 with NVCC from, say, CUDA 11.4 and try to include SASS for the latest GPU?

Your guess is as good as mine. I would think there is a reason the toolchain is distributed as part of a toolkit and not as a standalone component that can be mixed and matched at will with the rest of the toolkit components. Even if it seemed to work, such a setup might break at any time, and you would have no support for such a Frankenbuild.

Why is it that you are stuck with CUDA 9.2? That might be an easier problem to solve.

Is there a workaround that I can use to include the SASS for the new GPU in the fat binary?

NVIDIA provides no methods to do what you are asking.

What would happen if I swap only the NVCC in CUDA 9.2 with NVCC from, say, CUDA 11.4 and try to include SASS for the latest GPU?

I don’t know. NVIDIA doesn’t support that.