Hey team.
This seems to be a very common problem on Ubuntu 18.04 with recent drivers (mostly on notebooks with two graphics cards):
$ nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Could you please provide some advice on how to solve this? I have been trying to fix it for days, without success.
Things I did:
Installed following the steps in https://gist.github.com/Mahedi-61/2a2f1579d4271717d421065168ce6a73#file-cuda_10-0_installation_on_ubuntu_18-04, but for CUDA 10.2
Disabled Secure Boot
Updated the initrd (sudo update-initramfs -u)
Additional info:
$ dkms status
bbswitch, 0.8, 5.0.0-36-generic, x86_64: installed
nvidia, 440.33.01: added
nvidia-bug-report.log.gz (130 KB)
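In the dkms status output above, nvidia is listed as "added" rather than "installed". In DKMS terms that normally means the module source is registered but has not been built and installed for the running kernel, which would be consistent with nvidia-smi failing to reach the driver. A minimal sketch for spotting such entries follows; dkms_not_installed is a hypothetical helper name, and the sample text is copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads `dkms status` text on stdin and prints every
# line whose state (the part after the last ": ") is not "installed".
dkms_not_installed() {
    awk -F': ' '$NF !~ /^installed/ { print }'
}

# Sample copied from the dkms status output in this post.
sample='bbswitch, 0.8, 5.0.0-36-generic, x86_64: installed
nvidia, 440.33.01: added'

printf '%s\n' "$sample" | dkms_not_installed
```

On a live system the same filter could be fed directly, e.g. `dkms status | dkms_not_installed`.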
generix
December 24, 2019, 9:25am
Please try this:
remove the .run installer driver using the --uninstall option
purge anything nvidia/cuda
delete /etc/X11/xorg.conf
add the ubuntu graphics ppa https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
install the driver from there (sudo apt install nvidia-driver-440)
make sure nvidia-prime is installed (sudo apt install --reinstall nvidia-prime)
switch to nvidia (sudo prime-select nvidia)
remove stray blacklist files (sudo rm /lib/modprobe.d/blacklist-nvidia.conf /etc/modprobe.d/blacklist-nvidia.conf)
update the initrd (sudo update-initramfs -u)
reboot
Afterwards, don’t install the ‘cuda’ package; install only the toolkit (cuda-toolkit-10-2), or use the CUDA .run installer and skip its driver installation.
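The steps above, from deleting xorg.conf through updating the initrd, can be sketched as a dry-run script that only prints the commands. This is an illustration under assumptions, not a tested installer: run and DRY_RUN are hypothetical helpers, ppa:graphics-drivers/ppa is the short name assumed for the PPA linked above, and the `apt update` line plus the `-y` and `-f` flags are additions for non-interactive use. The uninstall, purge, and reboot steps are left out because the post gives no exact commands for them.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it.
# DRY_RUN and run() are hypothetical helpers, not part of any NVIDIA/Ubuntu tool.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

switch_to_ppa_driver() {
    run sudo rm -f /etc/X11/xorg.conf
    # Assumed short name of the graphics-drivers PPA linked in the post.
    run sudo add-apt-repository -y ppa:graphics-drivers/ppa
    run sudo apt update
    run sudo apt install -y nvidia-driver-440
    run sudo apt install -y --reinstall nvidia-prime
    run sudo prime-select nvidia
    run sudo rm -f /lib/modprobe.d/blacklist-nvidia.conf /etc/modprobe.d/blacklist-nvidia.conf
    run sudo update-initramfs -u
}

switch_to_ppa_driver
```

Setting DRY_RUN=0 would execute the commands for real, so it is worth reading the printed list first.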
Hey @generix, everything is working flawlessly now. Many thanks!
I wasn’t able to remove the .run installer driver using the --uninstall option, but everything worked after running the following commands:
sudo apt-get purge nvidia*
sudo apt remove nvidia-*
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*
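One caveat about the unquoted nvidia* pattern in the commands above: if the current directory contains a matching file, such as an nvidia-bug-report.log.gz like the one attached earlier in the thread, the shell expands the pattern before apt-get sees it. The sketch below only demonstrates that expansion with echo in a scratch directory; nothing is installed or removed.

```shell
#!/bin/sh
# Demonstrates shell glob expansion in a temporary directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch nvidia-bug-report.log.gz   # stand-in for a matching file in the working directory

unquoted=$(echo apt-get purge nvidia*)    # the shell replaces the pattern with the file name
quoted=$(echo apt-get purge 'nvidia*')    # the pattern is passed through unchanged

echo "$unquoted"   # prints: apt-get purge nvidia-bug-report.log.gz
echo "$quoted"     # prints: apt-get purge nvidia*

cd / && rm -rf "$tmp"
```

Quoting the pattern, as in `sudo apt-get purge 'nvidia*'`, avoids depending on what happens to be in the working directory.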