I have an Nvidia Tesla P100. I installed the drivers and the CUDA Toolkit according to the instructions. nvcc -V shows the correct CUDA version, 11.2, but nvidia-smi doesn’t show any CUDA version.
nvidia-bug-report.log.gz (2.9 MB)
Please check if libcuda is installed:
ls -l /usr/lib/x86_64-linux-gnu/libcuda*
It should be provided by the package libcuda1-460.
No, it is not installed.
lrwxrwxrwx 1 root root 19 Mar 30 2016 /usr/lib/x86_64-linux-gnu/libcudart.so.7.5 -> libcudart.so.7.5.18
-rw-r--r-- 1 root root 383336 Sep 19 2015 /usr/lib/x86_64-linux-gnu/libcudart.so.7.5.18
lrwxrwxrwx 1 root root 12 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root root 21 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> libcuda.so.418.181.07
-rw-r--r-- 1 root root 16331696 Dec 27 22:02 /usr/lib/x86_64-linux-gnu/libcuda.so.418.181.07
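The listing shows libcuda.so resolving to a 418.181.07 file, while the packaged driver is 460. As a minimal sketch, the version that a libcuda symlink actually resolves to could be printed as follows. The function name libcuda_version is made up for illustration, and the sketch assumes GNU readlink and the usual libcuda.so.<version> file naming.

```shell
# Print the driver version encoded in the file a libcuda symlink resolves to.
# libcuda_version is a hypothetical helper, not part of any NVIDIA tool.
libcuda_version() {
    local target
    # Follow the whole symlink chain (libcuda.so -> libcuda.so.1 -> real file).
    target=$(readlink -f "$1") || return 1
    [ -e "$target" ] || return 1
    target=${target##*/}                    # keep only the file name
    printf '%s\n' "${target#libcuda.so.}"   # strip the "libcuda.so." prefix
}
```

On the system above, libcuda_version /usr/lib/x86_64-linux-gnu/libcuda.so would be expected to print 418.181.07, which does not match the 460 driver.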
How can I install CUDA (given that I have already installed it according to the instructions)?
There are leftovers from a previous runfile install which blocked the package install.
lrwxrwxrwx 1 root root 12 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root root 21 Mar 10 15:44 /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> libcuda.so.418.181.07
-rw-r--r-- 1 root root 16331696 Dec 27 22:02 /usr/lib/x86_64-linux-gnu/libcuda.so.418.181.07
Either run the runfile again with the --uninstall option or remove them manually, then reinstall the libcuda package.
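If it is unclear which files are leftovers, one way to check is to ask dpkg whether any installed package claims them. This is only a sketch: find_unowned and the OWNER_CHECK override are hypothetical names introduced here, and dpkg -S can miss files that are reached through a symlinked directory, so treat the result as a hint and not as a deletion list.

```shell
# Print every given path that no installed Debian package claims.
# OWNER_CHECK is a hypothetical hook for testing; the default is `dpkg -S`,
# which exits non-zero when no package owns the path.
find_unowned() {
    local f
    for f in "$@"; do
        # Skip arguments that do not exist (e.g. an unmatched glob).
        [ -e "$f" ] || [ -L "$f" ] || continue
        if ! ${OWNER_CHECK:-dpkg -S} "$f" >/dev/null 2>&1; then
            printf '%s\n' "$f"
        fi
    done
}
```

It could be called as find_unowned /usr/lib/x86_64-linux-gnu/libcuda.so* to review the candidates before removing anything.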
“then reinstall the libcuda package”
How do I do that? I removed cuda/cuda-drivers and installed cuda/cuda-drivers again, but nothing changed.
Did you remove the blocking files beforehand?
Yes, I deleted the files you listed in the message.
Then try
sudo apt install libcuda1
or
sudo apt install libcuda1-460
quantum@lorentz:/mnt/store$ sudo apt install libcuda1
[sudo] password for quantum:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'libcuda1-430' instead of 'libcuda1'
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
 libcuda1-430 : Depends: nvidia-430 (>= 430.64) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
quantum@lorentz:/mnt/store$ sudo apt install libcuda1-460
Reading package lists... Done
Building dependency tree
Reading state information... Done
libcuda1-460 is already the newest version (460.32.03-0ubuntu1).
libcuda1-460 set to manually installed.
The following packages were automatically installed and are no longer required:
  aufs-tools cgroupfs-mount containerd.io docker-ce docker-ce-cli libcublas7.5
  libcudart7.5 libcufft7.5 libcufftw7.5 libcuinj64-7.5 libcurand7.5
  libcusolver7.5 libcusparse7.5 libnppc7.5 libnppi7.5 libnpps7.5 libnvblas7.5
  libnvidia-container-tools libnvidia-container1 libnvrtc7.5 libnvtoolsext1
  libnvvm3 libthrust-dev libvdpau-dev opencl-headers pigz
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 105 not upgraded.
Then use
sudo apt install --reinstall libcuda1-460
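The reason --reinstall is needed: apt only looks at the package database, so it reports libcuda1-460 as already installed even though its files were deleted by hand. A rough sketch for seeing which files of a package are missing on disk is below. The helper name missing_files is an assumption for illustration; it just reads a file list, such as the output of dpkg -L, from standard input.

```shell
# Read absolute paths on stdin and print those that do not exist on disk.
# missing_files is a hypothetical helper; feed it the output of `dpkg -L <pkg>`.
missing_files() {
    local path
    while IFS= read -r path; do
        # Ignore blank lines and dpkg notes that are not absolute paths.
        case $path in
            /*) ;;
            *) continue ;;
        esac
        [ -e "$path" ] || [ -L "$path" ] || printf '%s\n' "$path"
    done
}
```

For example, dpkg -L libcuda1-460 | missing_files should print nothing once the reinstall has restored the files.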
Thank you a lot!
Please set the package’s flag back to “auto”
sudo apt-mark auto libcuda1-460
so you don’t run into problems when updating.
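To confirm the flag afterwards, the output of apt-mark showmanual can be searched for the package. A small sketch, where is_listed is a made-up helper name that matches whole lines only:

```shell
# Succeed if the exact package name appears as a full line on stdin.
# is_listed is a hypothetical helper; pipe `apt-mark showmanual` into it.
is_listed() {
    grep -Fxq -- "$1"
}
```

For example, apt-mark showmanual | is_listed libcuda1-460 && echo "still marked manual" should print nothing after the apt-mark auto command above.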