Ncu failed with Found no NVIDIA driver on your system

zeyu-chen · November 23, 2024, 12:02am

Hi
I have a hard time running ncu on my system:

==PROF== Connected to process 43447 (/home/zeyu.chen/.cache/bazel/_bazel_zeyu.chen/6038ac9fefa83b1b010a60cd411239e6/external/python3_x86_64/bin/python3.9)
Traceback (most recent call last):
  File "/home/zeyu.chen/development/github.robot.car/cruise/cruise/develop/build/bin/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels.runfiles/cruise_ws/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels_exedir/__main__.py", line 126, in <module>
    main()
  File "/home/zeyu.chen/development/github.robot.car/cruise/cruise/develop/build/bin/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels.runfiles/cruise_ws/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels_exedir/__main__.py", line 122, in main
    exec(ast, clean_globals)
  File "/home/zeyu.chen/development/github.robot.car/cruise/cruise/develop/build/bin/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels.runfiles/cruise_ws/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels_exedir/cruise/mlp/robotorch2/experimental/benchmark_liger_kernels.py", line 15, in <module>
    input_data = torch.randn(batch_size, seq_len, hidden_dim).to("cuda")
  File "/home/zeyu.chen/.../torch/cuda/__init__.py", line 314, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
==PROF== Disconnected from process 43447
==ERROR== The application returned an error code (1).

However, I am able to run nsys with the program. How can I debug this?

veraj · November 25, 2024, 2:29am

Hi, @zeyu-chen

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Can you make sure your app works properly without ncu?

zeyu-chen · November 25, 2024, 2:57am

Yeah, I am pretty sure it can, and I can even run nsys with my program.

veraj · November 25, 2024, 3:04am

That’s strange, because this error is not actually reported by ncu.

Can you please check whether this is specific to your application? I mean, can you try a simple CUDA sample to see if ncu works?

zeyu-chen · November 25, 2024, 7:27pm

My program is not a CUDA sample. I cloned the CUDA samples with git, and ncu does work with them.

My program is a shell script generated by bazel to run Python (PyTorch). I actually tried to inject ncu directly into the execution command:

exec "env" "${env_vars[@]}" "LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH" "${gdb_command[@]}" ncu "${python_command[@]}" "$@"

but it still failed.

One thing I observed: if I remove /usr/lib/x86_64-linux-gnu from the command above, ncu itself fails to detect the driver. Maybe I am linking the library the wrong way? I guess the issue is related to some env var setup.
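To see which libcuda.so files each LD_LIBRARY_PATH entry would offer to the loader, a small sketch like the following may help. It assumes the failure comes from which libcuda.so gets picked up, which this thread has not confirmed. The helper name find_libcuda_candidates is hypothetical, and the scan covers only LD_LIBRARY_PATH, not the loader cache or any RPATH entries.

```python
import os
from pathlib import Path


def find_libcuda_candidates(ld_library_path):
    """Return libcuda.so* files found in each LD_LIBRARY_PATH entry, in search order.

    Hypothetical helper for illustration; it only inspects the given path string.
    """
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # empty entries are skipped in this sketch
        directory = Path(entry)
        if not directory.is_dir():
            continue  # entry points at something that does not exist
        # Sort within a directory so the output is stable between runs.
        hits.extend(str(p) for p in sorted(directory.glob("libcuda.so*")))
    return hits


if __name__ == "__main__":
    for path in find_libcuda_candidates(os.environ.get("LD_LIBRARY_PATH", "")):
        print(path)
```

Calling this at the top of the Python entry point, before torch is imported, would show whether an unexpected directory earlier in the path also provides a libcuda.so.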

zeyu-chen · November 26, 2024, 5:00pm

Any recommendations on how to debug this? Do you need more logging to further narrow down the issue?

veraj · November 27, 2024, 3:41am

Hi, @zeyu-chen

There is no specific env setting required before running ncu.
If ncu ./sample (a CUDA sample) works, then it means there is no issue with ncu itself.
Would you please check whether some specific ENV setting causes the failure?
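As a rough sketch of one way to do this check, the snippet below records a few environment variables from inside the Python process so that a run under ncu can be compared with a plain run. The WATCHED list is an assumption made for illustration (variables that commonly influence library lookup), not something ncu documents, and snapshot and diff_snapshots are hypothetical helper names.

```python
import json
import os
import sys

# Assumed list of variables worth comparing; extend it as needed.
WATCHED = ("LD_LIBRARY_PATH", "LD_PRELOAD", "PATH", "CUDA_VISIBLE_DEVICES", "CUDA_HOME")


def snapshot(environ):
    """Return the watched variables from a mapping; unset ones map to None."""
    return {name: environ.get(name) for name in WATCHED}


def diff_snapshots(before, after):
    """Return {name: (before_value, after_value)} for every variable that differs."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so two runs can be saved and compared.
    json.dump(snapshot(os.environ), sys.stdout, indent=2)
    print()
```

Running it once from the plain launcher and once under ncu, then passing the two saved results to diff_snapshots, would show which of the watched variables changed between the two launches.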

zeyu-chen · November 27, 2024, 7:25am

Would you please check whether some specific ENV setting causes the failure?

Just by looking at $ENV? I am not familiar with the build system, and most of my coworkers are out.

I am curious how ncu looks for the CUDA driver when initializing; it seems like my env at least disturbs the ncu startup. Should I collect some nsys logs for you to take a look at?

zeyu-chen · December 7, 2024, 1:39am

I just noticed there is an option, injection-path-64. Should I use it to launch my app?

veraj · December 9, 2024, 1:30am

Hi, @zeyu-chen

This still seems to be an ENV setup issue. Please check the details in "Nsight Compute failed to connect to the CUDA driver (stub libcuda.so[.1] on path?)", which seems to be a similar issue.
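To check whether a stub library is what the process actually ends up loading, something along these lines could be tried. It is only a sketch, assuming a Linux system with /proc available. The helpers mapped_paths and looks_like_stub are hypothetical names, and the "/stubs/" test is a heuristic based on the CUDA toolkit keeping its stub libcuda.so in a stubs directory.

```python
import ctypes
import os


def mapped_paths(maps_text, prefix="libcuda.so"):
    """Return unique file paths from /proc/<pid>/maps text whose basename starts with prefix."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        if os.path.basename(path).startswith(prefix) and path not in found:
            found.append(path)
    return found


def looks_like_stub(path):
    """Heuristic: the path goes through a directory named 'stubs'."""
    return "/stubs/" in path


if __name__ == "__main__":
    try:
        # Ask the dynamic loader for libcuda.so.1 the same way a CUDA program would.
        ctypes.CDLL("libcuda.so.1")
    except OSError as exc:
        print(f"could not load libcuda.so.1: {exc}")
    else:
        with open("/proc/self/maps") as maps:
            for path in mapped_paths(maps.read()):
                note = " (stub directory?)" if looks_like_stub(path) else ""
                print(path + note)
```

Running the script both directly and under ncu, with the same environment the bazel wrapper sets up, would show whether the two launches resolve libcuda.so.1 to different files.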

veraj · March 1, 2025, 12:00am

This topic was automatically closed after 10 hours. New replies are no longer allowed.
