Originally published at: https://developer.nvidia.com/blog/runtime-fatbin-creation-using-the-nvidia-cuda-toolkit-12-4-compiler/
CUDA Toolkit 12.4 introduced a new nvFatbin library for creating fatbins at runtime. Fatbins, otherwise known as NVIDIA device code fat binaries, are containers that store multiple versions of the same code, each built for a different architecture. In particular, NVIDIA uses them to bundle code for different GPU architectures, such as sm_61 and sm_90. Until now, to generate…
Creating the nvFatbin library was an interesting challenge! If you have any questions or comments, let us know!