Fedora 25, CUDA 9 missing libdevice files

svennevs · November 3, 2017, 10:37am

I’ve done both network and local installations, but the only libdevice file I get is /usr/local/cuda/nvvm/libdevice/libdevice.10.bc. NVCC seems to have no problem with this when compiling for, e.g., sm_35. LLVM, however, needs the per-architecture libdevice files.

How can I obtain these files? Are they even operating-system specific? The only other thing I could find on the forums about this was somebody having the same issue with pycuda, but it went unanswered.

Given that multiple reinstallations produce the same result (only one libdevice file), how can I obtain the libdevice files for other architectures for CUDA 9 on Fedora 25? Fedora 23 with CUDA 8 installed them, and I’ve got that backed up. Is it safe to just copy these files?

Thank you for any advice

svennevs · November 4, 2017, 12:55am

So to clarify, on Fedora 23 with CUDA 8, the installation produced

/usr/local/cuda-8.0/nvvm/libdevice> ls -1
libdevice.compute_20.10.bc
libdevice.compute_30.10.bc
libdevice.compute_35.10.bc
libdevice.compute_50.10.bc

Does CUDA 9 put all of them in a single libdevice file? Does anybody with, say, Ubuntu have CUDA 9 installed, and do they have the same single /usr/local/cuda/nvvm/libdevice/libdevice.10.bc file, or multiple files? It seems likely that the rpm is just missing these files, but I don’t know what to do…
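
To compare installations, one option is to list what is actually in the libdevice directory and classify the layout. Below is a minimal Python sketch, not an official tool. The function name libdevice_layout is made up for illustration, and the per-architecture file name pattern is assumed from the CUDA 8 listing above.

```python
import os
import re

# Per-architecture names as seen in the CUDA 8 listing,
# e.g. libdevice.compute_35.10.bc (pattern assumed from that listing).
PER_ARCH = re.compile(r"^libdevice\.compute_(\d+)\.10\.bc$")
# Single-file name as seen in the CUDA 9 install.
SINGLE = "libdevice.10.bc"


def libdevice_layout(libdevice_dir):
    """Return (kind, names): kind is 'single', 'per-arch', 'mixed' or 'none'."""
    try:
        names = sorted(n for n in os.listdir(libdevice_dir) if n.endswith(".bc"))
    except FileNotFoundError:
        # Directory does not exist at all.
        return "none", []
    has_single = SINGLE in names
    has_per_arch = any(PER_ARCH.match(n) for n in names)
    if has_single and has_per_arch:
        kind = "mixed"
    elif has_single:
        kind = "single"
    elif has_per_arch:
        kind = "per-arch"
    else:
        kind = "none"
    return kind, names
```

Calling libdevice_layout("/usr/local/cuda/nvvm/libdevice") on the CUDA 9 install described here should report the "single" layout, while the CUDA 8 directory listed above should report "per-arch".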

caspar.vanleeuwen · January 22, 2018, 4:42pm

Hi svennevs,

I ran into the same issue when trying to run numba (a Python package that calls the LLVM compiler to compile CUDA kernels). It seems that CUDA 9 indeed ships only one libdevice file by design. There is this change

https://github.com/llvm-mirror/clang/commit/6d4cb407f117d080e04cf6b8ca200ee01c7b502f

in the LLVM repository, which suggests that CUDA 9 support was added to Clang in September 2017. One of those changes touches the file “lib/Driver/ToolChains/Cuda.cpp”, where they added the following code path:

if (Version == CudaVersion::CUDA_90) {
  // CUDA-9 uses single libdevice file for all GPU variants.
  std::string FilePath = LibDevicePath + "/libdevice.10.bc";
  if (FS.exists(FilePath)) {
    for (const char *GpuArch :
         {"sm_20", "sm_30", "sm_32", "sm_35", "sm_50", "sm_52", "sm_53",
          "sm_60", "sm_61", "sm_62", "sm_70"})
      LibDeviceMap[GpuArch] = FilePath;
  }
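
For anyone who wants to mirror that lookup outside of Clang (for example, in a Python wrapper), the same idea can be sketched as below. This is only an illustration of the logic in the excerpt above, not how numba or pycuda actually do it. The name build_libdevice_map is hypothetical, and the architecture list is copied from the excerpt.

```python
import os

# Architectures copied from the Clang excerpt above.
CUDA9_ARCHS = ["sm_20", "sm_30", "sm_32", "sm_35", "sm_50", "sm_52", "sm_53",
               "sm_60", "sm_61", "sm_62", "sm_70"]


def build_libdevice_map(libdevice_dir, archs=CUDA9_ARCHS):
    """Map every GPU arch to the single libdevice.10.bc, if that file exists.

    Returns an empty dict when the file is missing, like the Clang code,
    which leaves its map untouched in that case.
    """
    file_path = os.path.join(libdevice_dir, "libdevice.10.bc")
    if not os.path.exists(file_path):
        return {}
    return {arch: file_path for arch in archs}
```

With the CUDA 9 layout from this thread, build_libdevice_map("/usr/local/cuda/nvvm/libdevice")["sm_35"] would point at the same libdevice.10.bc as every other architecture.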

There’s just one remaining question I haven’t figured out: whether a version that includes this change has been released yet. If I find out, I’ll let you know.

svennevs · January 22, 2018, 9:17pm

Interesting. I gave up on this a while ago. I don’t remember which tools I was using, but I did end up reading through the files and concluded that libdevice.10.bc was not “one size fits all”. Both the naming and the LLVM IR code seem to indicate this.

That said, this is well beyond my realm of understanding. Hopefully the LLVM folks have some kind of backdoor with NVIDIA. Presumably this is why these files even existed in the first place (it doesn’t seem like NVCC needs them at all).

You can install LLVM from the head of the repo if you want to use it; just make sure that you compile in Release mode. The debugging symbols make the LLVM build take much longer, and the install size is drastically larger: something like 20GB instead of 4GB!