[checkMacros.cpp::catchCudaError::272] Error Code 1: Cuda Runtime (CUDA driver is a stub library)

Robert_Crovella · March 3, 2022, 2:40am

You have a corrupted install of some sort.

There is a file called libcuda.so that is in a place it is not supposed to be. This file should be in only two places:

1. Wherever the GPU driver install put it. This is the proper one to use. I can’t be specific here, because the actual location of this file varies depending on your OS (and I don’t happen to have the install locations memorized for Ubuntu 18.04). This might actually be two locations, one corresponding to 32-bit usage and one corresponding to 64-bit usage. For example, on my fresh load of CUDA 11.6.1 on a fresh load of CentOS 7, I find that the GPU driver installer has placed it in /usr/lib (the 32-bit location) and /usr/lib64 (the 64-bit location).
2. In /usr/local/cuda/lib64/stubs. This one should only be used for linking purposes and should never be discovered by the runtime loader.

I can think of two options:

1. Use a utility like sudo find / -name libcuda.so to locate every single instance of that file on your machine. Remove any that don’t fit the description above.
2. Remove all aspects of CUDA and the GPU driver from your machine, and do a complete reload.

If the machine is a horrible mess, option 2 might really only be achievable by doing a disk wipe and OS reload, first. If option 1 doesn’t seem to work for some reason, then the only suggestion I have left is option 2.
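For option 1, a read-only inventory can help before anything is deleted. The following is a minimal sketch in Python, not a definitive tool: the function name find_libcuda is hypothetical, and labelling a copy as "stub" purely because its parent directory is named stubs is my assumption about how to tell the two kinds apart. It only reports paths and removes nothing.

```python
import os

def find_libcuda(root="/"):
    """Walk `root` and return sorted (path, kind) pairs for every libcuda.so* file.

    kind is "stub" when the containing directory is named "stubs"
    (an assumed heuristic), otherwise "other".
    """
    hits = []
    # onerror swallows permission errors so the walk keeps going
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            if name == "libcuda.so" or name.startswith("libcuda.so."):
                kind = "stub" if os.path.basename(dirpath) == "stubs" else "other"
                hits.append((os.path.join(dirpath, name), kind))
    return sorted(hits)
```

Calling find_libcuda("/") with enough privileges and printing the pairs gives roughly the same list as the find command, with versioned names such as libcuda.so.1 included. Any "other" entry that is not in the driver install location is a candidate for the stray copy.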

Above all, make sure that at no point does your LD_LIBRARY_PATH env var include the path /usr/local/cuda/lib64/stubs, and don’t copy the stub version of libcuda.so anywhere. You shouldn’t ever copy or symlink to libcuda.so under any circumstances.
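A quick way to check the LD_LIBRARY_PATH point is to split the variable and look for a stubs entry. This is a small sketch; the helper name stub_entries is hypothetical, and matching on a final path component named stubs is an assumption that covers /usr/local/cuda/lib64/stubs as well as versioned install directories.

```python
import os

def stub_entries(ld_library_path):
    """Return the entries of a colon-separated path list whose last component is 'stubs'."""
    entries = [p for p in ld_library_path.split(":") if p]
    # normpath drops a trailing slash so ".../stubs/" is also caught
    return [p for p in entries if os.path.basename(os.path.normpath(p)) == "stubs"]

current = os.environ.get("LD_LIBRARY_PATH", "")
bad = stub_entries(current)
print("stub directories on LD_LIBRARY_PATH:", bad if bad else "none")
```

Run it with the same interpreter and environment that produces the error, since the value seen there can differ from the one in your login shell.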

Also note that it generally should not be necessary to have the GPU driver install location on your LD_LIBRARY_PATH variable. The runtime loader is usually already configured (e.g. by ldconfig or similar) to look in the location that the GPU driver installer places it.
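To see what the runtime loader is already configured to find, the ldconfig cache can be inspected. The sketch below assumes a glibc-based Linux system where ldconfig -p prints lines of the form "name (flags) => path"; the function names are hypothetical.

```python
import subprocess

def libcuda_from_ldconfig(cache_text):
    """Extract the resolved paths for libcuda.so* from `ldconfig -p` style output."""
    paths = []
    for line in cache_text.splitlines():
        name, sep, target = line.strip().partition(" => ")
        # name looks like "libcuda.so.1 (libc6,x86-64)"; keep only the library name
        if sep and name.split(" ")[0].startswith("libcuda.so"):
            paths.append(target.strip())
    return paths

def query_ldconfig():
    """Run `ldconfig -p` and return its output, or "" if the tool cannot be run."""
    try:
        result = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True, check=False)
    except OSError:
        return ""
    return result.stdout
```

libcuda_from_ldconfig(query_ldconfig()) should list the driver's copy only. If a stubs path shows up there, the loader configuration itself points at the wrong file.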

Finally, I note that you have installed pytorch via anaconda. If anaconda has done something unexpected, or something I am unfamiliar with, in your conda environment, then you might still run into trouble here. I don’t think this should be the case. When running things from a python/conda environment, a conclusive read of the LD_LIBRARY_PATH variable can only be obtained using the method I already gave, which you don’t seem to have done. You don’t seem to have given a directed response to my last posting.
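As a complementary check (not the method referred to above), a Linux process can report which libcuda file it actually mapped by reading /proc/self/maps. This is a sketch under that assumption; the helper name mapped_libraries is hypothetical.

```python
def mapped_libraries(maps_text, needle="libcuda.so"):
    """Return the sorted, de-duplicated file paths in /proc/<pid>/maps text that contain `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # fields: address, perms, offset, dev, inode, pathname (pathname may be absent)
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)
```

Inside the conda environment, after the framework has initialized CUDA (for example after calling torch.cuda.is_available()), pass the contents of /proc/self/maps to mapped_libraries. A path ending in stubs/libcuda.so would confirm that the stub is the copy being loaded.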
