nvidia-smi shows "CUDA Version: N/A"

I have an Nvidia Tesla P100. I installed the drivers and the CUDA Toolkit according to the instructions. nvcc -V shows the correct CUDA version, 11.2, but nvidia-smi does not show any CUDA version.
nvidia-bug-report.log.gz (2.9 MB)

Please check if libcuda is installed:
ls -l /usr/lib/x86_64-linux-gnu/libcuda*
It should be provided by the package libcuda1-460.
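
To make the listing easier to interpret, here is a minimal sketch that prints the driver version encoded in each libcuda file name, so it can be compared with the installed driver package (460 in this thread). The helper name `libcuda_version` is hypothetical, and the library directory is assumed to be the Ubuntu default used above.

```shell
# Print the driver version encoded in a libcuda file name,
# e.g. libcuda.so.418.181.07 -> 418.181.07.
# Returns non-zero for names such as libcuda.so.1 or libcudart.so.7.5.
libcuda_version() {
    name=${1##*/}                      # strip the directory part
    case "$name" in
        libcuda.so.*.*) printf '%s\n' "${name#libcuda.so.}" ;;
        *) return 1 ;;
    esac
}

# Resolve every libcuda entry to the real file and report its version.
for f in /usr/lib/x86_64-linux-gnu/libcuda.so*; do
    [ -e "$f" ] || continue            # skip unmatched glob and dangling links
    target=$(readlink -f "$f")
    if v=$(libcuda_version "$target"); then
        echo "$f -> driver library version $v"
    fi
done
```

If the reported version differs from the driver package version, the library probably comes from a different install.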

No, it is not installed.

lrwxrwxrwx 1 root root       19 Mar 30  2016 /usr/lib/x86_64-linux-gnu/libcudart.so.7.5 -> libcudart.so.7.5.18
-rw-r--r-- 1 root root   383336 Sep 19  2015 /usr/lib/x86_64-linux-gnu/libcudart.so.7.5.18
lrwxrwxrwx 1 root root       12 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root root       21 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> libcuda.so.418.181.07
-rw-r--r-- 1 root root 16331696 Dec 27 22:02 /usr/lib/x86_64-linux-gnu/libcuda.so.418.181.07

How can I install CUDA (given that I have already installed it according to the instructions)?

These are leftovers from a previous runfile install (driver 418.181.07), which blocked the package install:

lrwxrwxrwx 1 root root       12 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root root       21 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> libcuda.so.418.181.07
-rw-r--r-- 1 root root 16331696 Dec 27 22:02 /usr/lib/x86_64-linux-gnu/libcuda.so.418.181.07

Either run the runfile again with the --uninstall option or remove the files manually, then reinstall the libcuda package.
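
One way to spot such leftovers before removing anything is to ask dpkg which libcuda files it does not know about. The sketch below only lists candidates and deletes nothing. It assumes a Debian/Ubuntu system, and the function name `unowned_libcuda` is hypothetical. Note that symlinks such as libcuda.so.1 may still be reported as owned by the package even when a runfile install has redirected them, so check their targets as well.

```shell
# List libcuda files in a directory that no dpkg package claims.
# Assumption: files left behind by a runfile install are unknown to dpkg.
# Without dpkg (non-Debian systems) every file is reported, so the
# result is only meaningful on Debian/Ubuntu.
unowned_libcuda() {
    dir=${1:-/usr/lib/x86_64-linux-gnu}
    for f in "$dir"/libcuda.so*; do
        # -e is false for dangling symlinks, so also accept -L
        [ -e "$f" ] || [ -L "$f" ] || continue
        if ! dpkg -S "$f" >/dev/null 2>&1; then
            printf '%s\n' "$f"
        fi
    done
}

unowned_libcuda
```

Review the printed paths by hand before removing any of them with sudo rm.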

"then reinstall the libcuda package"
How do I do that? I removed cuda/cuda-drivers and installed them again, but nothing changed.

Did you remove the blocking files beforehand?

Yes, I deleted the files you listed in the message.

Then try
sudo apt install libcuda1
or
sudo apt install libcuda1-460

quantum@lorentz:/mnt/store$ sudo apt install libcuda1
[sudo] password for quantum:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'libcuda1-430' instead of 'libcuda1'
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libcuda1-430 : Depends: nvidia-430 (>= 430.64) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
quantum@lorentz:/mnt/store$ sudo apt install libcuda1-460
Reading package lists... Done
Building dependency tree
Reading state information... Done
libcuda1-460 is already the newest version (460.32.03-0ubuntu1).
libcuda1-460 set to manually installed.
The following packages were automatically installed and are no longer required:
  aufs-tools cgroupfs-mount containerd.io docker-ce docker-ce-cli libcublas7.5 libcudart7.5 libcufft7.5 libcufftw7.5 libcuinj64-7.5 libcurand7.5 libcusolver7.5 libcusparse7.5 libnppc7.5 libnppi7.5 libnpps7.5
  libnvblas7.5 libnvidia-container-tools libnvidia-container1 libnvrtc7.5 libnvtoolsext1 libnvvm3 libthrust-dev libvdpau-dev opencl-headers pigz
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 105 not upgraded.

Then use
sudo apt install --reinstall libcuda1-460

Thank you a lot!

Please set the package’s flag back to “auto” so you don’t run into problems when updating:
sudo apt-mark auto libcuda1-460
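
To confirm the flag afterwards, the list printed by apt-mark showmanual can be searched for the package name. This is a small sketch; the helper `is_in_list` is a hypothetical name, and the check is skipped when apt-mark is not available.

```shell
# Succeeds if the name given as $1 appears as a whole line on stdin.
is_in_list() {
    grep -qxF -- "$1"
}

# apt-mark showmanual prints one manually installed package per line.
if command -v apt-mark >/dev/null 2>&1; then
    if apt-mark showmanual | is_in_list libcuda1-460; then
        echo "libcuda1-460 is still marked manual"
    else
        echo "libcuda1-460 is not marked manual"
    fi
fi
```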
