CUDA C++ Compiler Updates Impacting ELF Visibility and Linkage

Originally published at: CUDA C++ Compiler Updates Impacting ELF Visibility and Linkage | NVIDIA Technical Blog

In the next CUDA major release, CUDA 13.0, NVIDIA is introducing two significant changes to the NVIDIA CUDA Compiler Driver (NVCC) that will impact ELF visibility and linkage for global functions and device variables. These updates aim to prevent subtle runtime errors that have long been challenging to detect and debug. However, these changes may…

Will this release of CUDA support gcc or clang on Windows, as requested here 18 years ago, so that CUDA packages can be compiled for MSYS2? Or will nvcc.exe still be hard-locked to cl.exe?

For the case below, is there any solution other than using the flags? Adding this flag to retain the previous default behavior still carries risks, doesn't it?

// first.cu
template <typename T>
__global__ void foo() { }
template
__global__ void foo<int>(); // explicit instantiation

// second.cu
template <typename T>
__global__ void foo(); // explicit instantiation in first.cu
int main() { foo<int><<<1,1>>>(); cudaDeviceSynchronize(); }