I have a desktop running Ubuntu 18.04.5 LTS (dual boot with Windows 10); uname -r reports 5.4.0-48-generic.
I have two TITAN RTX GPUs for TensorFlow,
and the motherboard's Intel GPU, which I use for display (confirmed by prime-select query = intel).
A year ago I followed an AI/ML article and installed the NVIDIA drivers from the PPA. All worked well, until it didn’t. I think it may be related to apt-get dist-upgrade, but I'm not sure.
I have tried many solutions from this thread, but no luck.
Current status:
lshw -C display shows both TITANs as UNCLAIMED
Secure Boot is not enabled
apt install nvidia-driver-450 = nvidia-driver-450 is already the newest version (450.51.06-0ubuntu1)
nvidia-settings = ERROR: NVIDIA driver is not loaded
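With the package reported as installed but the driver "not loaded", a first step could be to confirm whether the nvidia kernel module is actually loaded for the running kernel. Below is a minimal sketch; the helper name module_loaded is made up for illustration, and the follow-up commands in the comments are suggestions, not a confirmed fix for this system.

```shell
#!/bin/sh
# module_loaded NAME [MODULES_FILE]
# Hypothetical helper: succeeds if NAME appears in the first column of
# MODULES_FILE (defaults to /proc/modules, one loaded module per line).
module_loaded() {
    name=$1
    file=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file" 2>/dev/null
}

if module_loaded nvidia; then
    echo "nvidia module is loaded"
else
    echo "nvidia module is NOT loaded for kernel $(uname -r)"
    # Checks worth running by hand afterwards:
    #   dkms status            # was a module built for this kernel?
    #   sudo modprobe nvidia   # try loading it, then read the kernel log
fi
```

If the module is missing only for the newest kernel, that would be consistent with the suspicion that a dist-upgrade pulled in a kernel the driver module was never built for.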
Same problem on my Acer Predator PH317 with GeForce RTX 2070, on Ubuntu 18.
One thing which could be of interest: Secure Boot was not disabled when I installed Ubuntu; I only disabled it after reading this topic and checking with nvidia-bug-report. I tried basically all of the tricks given here, and tried installing with both apt and the installer from NVIDIA’s website.
Some errors I see when I grep for "error" in nvidia-bug-report.log:
‘/usr/src/nvidia-450.80.02/nvidia/nvlink_errors.h’ (No such file or directory)
nvidia-bug-report.log.gz (86.0 KB)
I’m also experiencing a similar issue to the original post.
I have dual boot with Windows and Secure Boot disabled. Some time ago I installed the NVIDIA drivers and they worked normally. nvidia-smi was working.
This week I noticed that nvidia drivers weren’t loaded.
Right now I have driver version 455.28. nvidia-bug-report.log.gz (115.8 KB)
Nov 17 17:13:17 linux-desktop kernel: [ 370.584621] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
Nov 17 17:13:17 linux-desktop kernel: [ 370.585991] NVRM: Can’t find an IRQ for your NVIDIA card!
Nov 17 17:13:17 linux-desktop kernel: [ 370.585991] NVRM: Please check your BIOS settings.
Nov 17 17:13:17 linux-desktop kernel: [ 370.585992] NVRM: [Plug & Play OS] should be set to NO
Nov 17 17:13:17 linux-desktop kernel: [ 370.585992] NVRM: [Assign IRQ to VGA] should be set to YES
Nov 17 17:13:17 linux-desktop kernel: [ 370.586007] NVRM: The NVIDIA probe routine failed for 1 device(s).
Nov 17 17:13:17 linux-desktop kernel: [ 370.586007] NVRM: None of the NVIDIA devices were initialized.
Nov 17 17:13:17 linux-desktop kernel: [ 370.586649] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
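In this log the module does start (the nvlink core initializes) but the probe then fails because no IRQ could be assigned, and the driver itself points at BIOS settings. To pull just these lines out of a noisy kernel log, something like the following could help. It is a sketch only; nvrm_lines and irq_probe_failed are hypothetical helper names, and both read the log from standard input.

```shell
#!/bin/sh
# nvrm_lines: keep only NVIDIA driver messages from a kernel log on stdin.
nvrm_lines() {
    grep -E 'NVRM:|nvidia-nvlink:' || true
}

# irq_probe_failed: succeeds if the log on stdin contains the
# "find an IRQ for your NVIDIA card" message quoted above.
irq_probe_failed() {
    grep -q 'find an IRQ for your NVIDIA card'
}

# Typical use on a live system (either source of kernel messages works):
#   sudo dmesg | nvrm_lines
#   journalctl -k -b | irq_probe_failed && echo "IRQ assignment problem"
```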
One other item: the nvidia-persistenced service will not start with sudo systemctl start . I get this message:
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Verbose syslog connection opened
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Now running with user ID 126 and group ID 136
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Started (46805)
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 126 has read and write permissions for those files.
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46801]: nvidia-persistenced failed to initialize. Check syslog for more details.
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: PID file unlocked.
Dec 10 21:24:27 user-GT60-2PC systemd[1]: nvidia-persistenced.service: Control process exited, code=exited, status=1/FAILURE
~$ sudo ls -lsa /dev/nvidia*
0 crw-rw-rw- 1 root root 195, 254 Dec 10 19:10 /dev/nvidia-modeset
0 crw-rw-rw- 1 root root 236, 0 Dec 10 19:10 /dev/nvidia-uvm
0 crw-rw-rw- 1 root root 236, 1 Dec 10 19:10 /dev/nvidia-uvm-tools
~$ id 126
uid=126(nvidia-persistenced) gid=136(nvidia-persistenced) groups=136(nvidia-persistenced)
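Note that the listing above contains nvidia-modeset and the nvidia-uvm nodes, but not /dev/nvidiactl or /dev/nvidia0, which are the nodes the driver normally exposes once the main kernel module has loaded and found a GPU. That would fit the "Failed to query NVIDIA devices" message better than a permissions problem. A small sketch to report which of those nodes are missing (the function name missing_nvidia_nodes is made up for illustration):

```shell
#!/bin/sh
# missing_nvidia_nodes [DEV_DIR]
# Hypothetical helper: prints each expected device node that does not
# exist under DEV_DIR (defaults to /dev). Checks only the control node
# and the first GPU; additional GPUs would be nvidia1, nvidia2, ...
missing_nvidia_nodes() {
    dir=${1:-/dev}
    for node in nvidiactl nvidia0; do
        [ -e "$dir/$node" ] || echo "$dir/$node"
    done
}

missing=$(missing_nvidia_nodes)
if [ -n "$missing" ]; then
    echo "Missing device nodes:"
    echo "$missing"
else
    echo "Control node and first GPU node are present"
fi
```

If both are missing, the kernel log right after boot is probably the place to look, rather than the nvidia-persistenced user's permissions.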
The issue persisted for some days. nvidia-smi was not working and lshw showed the display as unclaimed. I tried this procedure and it worked.
According to xorg.conf, it was generated by driver 450 while I have 460, so it was created earlier and never updated.
That brings up another issue: CUDA and other installations leave files behind.
I've got to do a cleanup somehow.
A newbie learning by destroying.
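For anyone wanting to check for the same kind of stale xorg.conf, here is a rough sketch. It assumes the file was written by nvidia-xconfig or nvidia-settings and carries a header comment of the form "# nvidia-xconfig:  version 450.51.06"; if your file has no such header, the function prints nothing. The function name xorg_conf_driver_version is hypothetical.

```shell
#!/bin/sh
# xorg_conf_driver_version [FILE]
# Hypothetical helper: prints the driver version recorded in the header
# comment of an NVIDIA-generated xorg.conf (defaults to /etc/X11/xorg.conf).
# Assumes a header line like "# nvidia-xconfig:  version 450.51.06".
xorg_conf_driver_version() {
    file=${1:-/etc/X11/xorg.conf}
    [ -r "$file" ] || return 0
    sed -n 's/^# nvidia-[a-z]*: *version \([0-9.]*\).*/\1/p' "$file" | head -n 1
}

# Compare by hand with the installed kernel module, for example:
#   xorg_conf_driver_version
#   modinfo -F version nvidia
```

A mismatch does not prove the file is the cause, but it is a hint that the file predates the current driver and may be worth moving aside for a test boot.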
Your GPU is turned off and the error is flooding the logs, so the cause is unknown. Please check whether you have bbswitch installed and, if so, uninstall it. Please create a new nvidia-bug-report.log immediately after a fresh boot.
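A quick way to see whether bbswitch is active is to read its proc file, which (when the module is loaded) contains the PCI address followed by ON or OFF. The sketch below assumes that format; bbswitch_state is a made-up helper name, and the package name in the comments is the one commonly used on Ubuntu, so adjust if yours differs.

```shell
#!/bin/sh
# bbswitch_state [FILE]
# Hypothetical helper: prints ON or OFF as reported by bbswitch, or
# "absent" if the proc file does not exist (module not loaded).
# Assumes the file holds a line like "0000:01:00.0 OFF".
bbswitch_state() {
    file=${1:-/proc/acpi/bbswitch}
    [ -r "$file" ] || { echo "absent"; return 0; }
    awk 'NR == 1 { print $2 }' "$file"
}

echo "bbswitch: $(bbswitch_state)"
# To check for and remove the package by hand:
#   dpkg -s bbswitch-dkms
#   sudo apt purge bbswitch-dkms
```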
Hello, I have the same problem as others on Ubuntu 18.04.
I’ve read a lot of the advice above, made sure Secure Boot is off, removed all NVIDIA packages and installed the driver again, and updated the kernel to the latest, but I still get an error when installing and running the driver, and I have no idea why.
The errors look like others’, for example:
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
So please, can you help me to find what I’m doing wrong?
At first the cuda install failed with a message like “cuda depending on cuda” (sorry it’s vague…). That was a conflict between cuda packages. I fixed this with:
sudo apt autoclean
Hopefully this way of installing will be more stable for next Ubuntu updates…
Same issue on a new Dell laptop with Ubuntu 18.04 installed by Dell. The NVIDIA driver worked for a few weeks, then stopped. I suspect a recent Ubuntu update messed things up.
$ nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
No /etc/modprobe.d/blacklist-nvidia.conf
Tried removing the existing NVIDIA driver, rebooting, reinstalling the driver, and rebooting again
Tried drivers 440, 450 and 460. Nothing worked.
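Regarding the blacklist check above: a blacklist entry for nvidia does not have to live in a file with that exact name, so it may be worth scanning every file in the modprobe.d directories. A minimal sketch, with find_nvidia_blacklists as a hypothetical helper name and the default directory list as an assumption about a typical Ubuntu layout:

```shell
#!/bin/sh
# find_nvidia_blacklists [DIR...]
# Hypothetical helper: prints every file under the given directories that
# contains an uncommented "blacklist nvidia..." line. With no arguments it
# scans the usual modprobe.d locations (assumed, adjust for your system).
find_nvidia_blacklists() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/modprobe.d /lib/modprobe.d /usr/lib/modprobe.d
    fi
    grep -rlE '^[[:space:]]*blacklist[[:space:]]+nvidia' "$@" 2>/dev/null || true
}

hits=$(find_nvidia_blacklists)
if [ -n "$hits" ]; then
    echo "Files blacklisting nvidia modules:"
    echo "$hits"
else
    echo "No nvidia blacklist entries found"
fi
```

If a file turns up, check which package or tool wrote it before deleting it, since it may be recreated on the next update.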