Nsight Compute failed to connect to the CUDA driver (stub libcuda.so[.1] on path?)

cmd:
/usr/local/NVIDIA-Nsight-Compute/ncu --set full -k compute_attn xxxx

output:
==ERROR== Nsight Compute failed to connect to the CUDA driver (stub libcuda.so[.1] on path?).
==ERROR== The application returned an error code (1).

version:
I tried Nsight Compute 2024.1, 2024.2, and 2024.3, and they all report the same error.
NVIDIA-SMI 555.42.02 Driver Version: 555.42.02 CUDA Version: 12.5
GPU: A10

I tried setting LD_LIBRARY_PATH, PATH, and CUDA_TOOLKIT_PATH to include the directory that contains libcuda.so, but that didn’t help. What else can I try?

Hi, @lingxingyu.lxy

Sorry for the issue you ran into.
Can you confirm whether this also happens with other CUDA samples, or only with your specific application?
Also, does your application run successfully without ncu?

  1. I tried some samples from cuda-samples and ncu worked fine with them. My program is a TensorFlow unit test.
  2. Yes, it runs successfully without ncu.

I solved this problem.
FYI, here is what I did:
There are two CUDA library paths in my environment, “/usr/local/nvidia/lib64” and “/usr/local/cuda/lib64”. Let’s call them PATH_A and PATH_B.
PATH_A has the real libcuda.so that we need, while PATH_B contains the stub libcuda.so. So I put PATH_A on LD_LIBRARY_PATH and symlinked the other libraries I needed from PATH_B into PATH_A:

sudo ln -s /usr/local/cuda/lib64/libnvJitLink.so.12 /usr/local/nvidia/lib64/libnvJitLink.so.12
sudo ln -s /usr/local/cuda/lib64/libcusparse.so.12 /usr/local/nvidia/lib64/libcusparse.so.12

export LD_LIBRARY_PATH=/usr/local/nvidia/lib64

Then it works.
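
For anyone hitting the same error, here is a rough sketch (assuming bash on Linux) for checking which libcuda.so candidates sit on the search path and in what order. The function name list_libcuda is made up for illustration and is not part of Nsight Compute or the CUDA toolkit. It prints each match with its size in bytes. A stub libcuda.so is typically far smaller than the real driver library, but treat that as a heuristic, not a guarantee.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every libcuda.so* found in a colon-separated
# search path, in search order, with its size in bytes (symlinks followed).
list_libcuda() {
  local search_path="$1" dir f size
  local -a dirs
  # Split the path on ':' into an array without touching the global IFS.
  IFS=':' read -r -a dirs <<< "$search_path"
  for dir in "${dirs[@]}"; do
    [ -n "$dir" ] || continue        # ignore empty path entries
    for f in "$dir"/libcuda.so*; do
      [ -e "$f" ] || continue        # skip unmatched globs and dangling symlinks
      size="$(wc -c < "$f" | tr -d ' ')"
      printf '%s\t%s\n' "$f" "$size"
    done
  done
}

# Inspect the current environment; prints nothing if no candidate is found.
list_libcuda "${LD_LIBRARY_PATH:-}"
```

If the first entry printed is a very small file, the stub is probably being picked up before the real driver library, and the directory order in LD_LIBRARY_PATH is worth revisiting.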

Thanks for sharing!

Just curious, what environment are you using?
In our environment, we don’t have a /usr/local/nvidia path.

It’s an internally built multi-platform Docker image that includes both the compile environment and the runtime environment.
