NVIDIA driver is not loaded. Ubuntu 18.10

I have a desktop running Ubuntu 18.04.5 LTS (with Win10 dual boot)
uname -r = 5.4.0-48-generic
I have two TITAN RTX GPUs for Tensorflow
and the motherboard Intel GPU, which I use for display, confirmed by prime-select query = intel

A year ago I followed an AI/ML article and installed the NVIDIA drivers from the PPA. All worked well, until it didn’t. I think it may be related to apt-get dist-upgrade, but I’m not sure.
I have tried many solutions from this thread, but no luck.

Current status:
lshw -C display shows both TITANs as UNCLAIMED
secure boot is not enabled
apt install nvidia-driver-450 = nvidia-driver-450 is already the newest version (450.51.06-0ubuntu1)
nvidia-settings = ERROR: NVIDIA driver is not loaded
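
For anyone comparing symptoms: GPUs shown as UNCLAIMED in lshw mean that no kernel driver is bound to them, so a first step is to confirm whether the nvidia kernel module is loaded at all. A minimal sketch, assuming a standard Ubuntu install; the function name nvidia_module_loaded is made up for illustration:

```shell
# Hypothetical helper (not part of any NVIDIA tool): reads `lsmod` output on
# stdin and succeeds only if a module named exactly "nvidia" is listed.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Usage on a real system:
#   lsmod | nvidia_module_loaded && echo "nvidia module loaded" \
#                                || echo "nvidia module NOT loaded"
#   sudo modprobe nvidia     # try to load it; failures are reported here
#   dmesg | grep -i nvrm     # kernel messages from the NVIDIA module
```

If the module is not loaded and modprobe fails, the dmesg output is usually the most informative part to post.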

Any advice would be greatly appreciated…

EDIT: Not smart enough to know where to ‘hover the mouse’ to find the upload paper clip…
Here is a link to the bug report: https://drive.google.com/file/d/1EKsrdwnh2YMgjhTVOcHDQhhbe4ZaRGvA/view?usp=sharing

I’m having the same problem: the NVIDIA driver is not loaded.

nvidia-bug-report.log.gz (113.8 KB)

Same problem on my Acer Predator PH317 with GeForce RTX 2070, on Ubuntu 18.

One thing that could be of interest: Secure Boot was not disabled when I installed Ubuntu; I only disabled it after reading this topic and checking with nvidia-bug-report. I tried basically every trick given here, and tried installing with both apt and the installer from NVIDIA’s website.

Some errors I see when grep error nvidia-bug-report.log:

‘/usr/src/nvidia-450.80.02/nvidia/nvlink_errors.h’ (No such file or directory)

nvidia-gpu 0000:01:00.3: i2c timeout error e0000000

alx 0000:07:00.0: AER: device [1969:e0b1] error status/mask=00000080/00002000
pcieport 0000:00:1d.0: AER: Corrected error received: 0000:07:00.0

nvidia-bug-report.log.gz (86.0 KB)
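
The bug report is normally written as a gzip-compressed file, so the grep above needs the log unpacked first. A small sketch that searches either form; bug_report_errors is a hypothetical name, and the case-insensitive match on "error" is only a starting point, since many matching lines (such as corrected PCIe AER errors) may be unrelated to the driver:

```shell
# Hypothetical helper: prints every line containing "error" (any case) from a
# bug report, whether it is still gzip-compressed or already unpacked.
bug_report_errors() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac | grep -i 'error'
}

# Usage:
#   bug_report_errors nvidia-bug-report.log.gz | less
```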

@generix, please help. I’ve done every procedure on this topic, but no luck
nvidia-bug-report.log.gz (81.2 KB)

I’m also experiencing a similar issue to the original post.
I have a dual boot with Windows, and Secure Boot is disabled. Some time ago I installed the NVIDIA drivers and they worked normally; nvidia-smi was working.

This week I noticed that nvidia drivers weren’t loaded.
Right now I have driver version 455.28.
nvidia-bug-report.log.gz (115.8 KB)

I have the same issue. Can somebody help me?

The error is this:

Nov 17 17:13:17 linux-desktop kernel: [ 370.584621] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
Nov 17 17:13:17 linux-desktop kernel: [ 370.585991] NVRM: Can’t find an IRQ for your NVIDIA card!
Nov 17 17:13:17 linux-desktop kernel: [ 370.585991] NVRM: Please check your BIOS settings.
Nov 17 17:13:17 linux-desktop kernel: [ 370.585992] NVRM: [Plug & Play OS] should be set to NO
Nov 17 17:13:17 linux-desktop kernel: [ 370.585992] NVRM: [Assign IRQ to VGA] should be set to YES
Nov 17 17:13:17 linux-desktop kernel: [ 370.586007] NVRM: The NVIDIA probe routine failed for 1 device(s).
Nov 17 17:13:17 linux-desktop kernel: [ 370.586007] NVRM: None of the NVIDIA devices were initialized.
Nov 17 17:13:17 linux-desktop kernel: [ 370.586649] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237

Hi, I’m facing this issue on Ubuntu 20.10. I am using NVIDIA driver 455.
nvidia-bug-report.log.gz (113.3 KB)

I have the same problem. I am running Ubuntu 20.04 (Kubuntu).

Basically, I can only switch drivers using sudo prime-select nvidia (or intel). Also, I cannot switch to the Nouveau driver because I get this message.

I cannot use nvidia-settings. I get these messages, like everyone else:

~$ inxi -G
Graphics:  Device-1: Intel 4th Gen Core Processor Integrated Graphics driver: N/A 
           Device-2: NVIDIA GK104M [GeForce GTX 870M] driver: N/A 
           Display: x11 server: X.Org 1.20.8 driver: fbdev unloaded: modesetting,vesa resolution: 2880x1620~91Hz 
           OpenGL: renderer: llvmpipe (LLVM 10.0.0 256 bits) v: 3.3 Mesa 20.0.8 


~$ sudo prime-select query
nvidia

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.


~$ nvidia-settings

ERROR: NVIDIA driver is not loaded


ERROR: Unable to load info from any available system

Gtk-Message: 21:10:17.277: Failed to load module "appmenu-gtk-module"

(nvidia-settings:42363): GLib-GObject-CRITICAL **: 21:10:17.292: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
** Message: 21:10:17.299: PRIME: No offloading required. Abort
** Message: 21:10:17.299: PRIME: is it supported? no

One other item: the nvidia-persistenced service will not start with sudo systemctl start. I get this message:

Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Verbose syslog connection opened
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Now running with user ID 126 and group ID 136
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Started (46805)
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 126 has read and write permissions for those files.
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46801]: nvidia-persistenced failed to initialize. Check syslog for more details.
Dec 10 21:24:27 user-GT60-2PC nvidia-persistenced[46805]: PID file unlocked.
Dec 10 21:24:27 user-GT60-2PC systemd[1]: nvidia-persistenced.service: Control process exited, code=exited, status=1/FAILURE


~$ sudo ls -lsa /dev/nvidia*
0 crw-rw-rw- 1 root root 195, 254 Dec 10 19:10 /dev/nvidia-modeset
0 crw-rw-rw- 1 root root 236,   0 Dec 10 19:10 /dev/nvidia-uvm
0 crw-rw-rw- 1 root root 236,   1 Dec 10 19:10 /dev/nvidia-uvm-tools


~$ id 126
uid=126(nvidia-persistenced) gid=136(nvidia-persistenced) groups=136(nvidia-persistenced)
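
Note that the listing above shows only nvidia-modeset, nvidia-uvm and nvidia-uvm-tools. When the driver has loaded and initialized a GPU, /dev/nvidiactl and /dev/nvidia0 normally exist as well, which would be consistent with the "Failed to query NVIDIA devices" message. A minimal sketch for checking this; missing_nvidia_nodes is a hypothetical helper, and the directory argument exists only so it can be tried against a test directory:

```shell
# Hypothetical helper: prints the core NVIDIA device nodes that are missing
# under the given directory (default /dev). Prints nothing if both exist.
missing_nvidia_nodes() {
    dev_dir="${1:-/dev}"
    for node in nvidiactl nvidia0; do
        [ -e "$dev_dir/$node" ] || echo "$dev_dir/$node"
    done
}

# Usage:
#   missing_nvidia_nodes          # checks /dev
```

If both nodes are reported missing, the permissions of user 126 are probably not the problem; the kernel module itself has not initialized the GPU.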

The issue persisted for some days: nvidia-smi was not working and lshw showed the display as unclaimed. I tried this procedure and it worked.
According to xorg.conf, the file was created by driver 450, while I now have 460, so it was created earlier and never updated.
That raises another issue: CUDA and other installations leave files behind.
I’ll have to do a cleanup somehow.
A newbie learning by destroying.

Please help me. NVIDIA Settings is showing, but when I run nvidia-smi I get this error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

The log file is attached:

nvidia-bug-report.log.gz (1005.3 KB)

Your GPU is turned off and the error is flooding the logs, so the cause is unknown. Please check whether you have bbswitch installed and, if so, uninstall it. Then please create a new nvidia-bug-report.log immediately after a fresh boot.
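
A rough sketch of how the bbswitch check could be done. When the bbswitch module is loaded it exposes the GPU power state in /proc/acpi/bbswitch; the helper name bbswitch_state is made up, and the package name bbswitch-dkms in the comments is an assumption based on how Ubuntu packages it:

```shell
# Hypothetical helper: prints ON or OFF as reported by bbswitch, or "absent"
# if the state file does not exist (i.e. the bbswitch module is not loaded).
bbswitch_state() {
    state_file="${1:-/proc/acpi/bbswitch}"
    if [ -r "$state_file" ]; then
        awk '{ print $2 }' "$state_file"
    else
        echo "absent"
    fi
}

# Usage:
#   bbswitch_state                          # ON, OFF, or absent
#   dpkg -l | grep -i bbswitch              # is a bbswitch package installed?
#   sudo apt remove --purge bbswitch-dkms   # assumed package name
#   sudo reboot
#   sudo nvidia-bug-report.sh               # run right after the fresh boot
```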

Follow-up to my first message (Oct 15, 2020): I’ve successfully installed the NVIDIA and CUDA drivers!

I reinstalled Ubuntu with Secure Boot disabled (I forgot to do it the first time), then followed the instructions found here: nvidia - How do I Install CUDA on Ubuntu 18.04? - Ask Ubuntu

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo ubuntu-drivers autoinstall
sudo reboot

And after reboot:
sudo apt install nvidia-cuda-toolkit gcc-6

Cheers!

Hello, I have the same problem as others, on Ubuntu 18.04.

I’ve read a lot of the advice above, ensured that Secure Boot is off, removed all the NVIDIA packages and installed the driver again, and updated the kernel to the latest version, but the driver still fails to install and run, and I have no idea why.

The errors look like the others’, for example:

$ nvidia-smi 
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

So please, can you help me to find what I’m doing wrong?

nvidia-bug-report.log.gz (126.8 KB)

You have your system compiler set to clang/llvm but gcc 7.5 is needed. Please set your cc back to gcc-7.5 using update-alternatives.
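
A sketch of how the compiler setting could be checked before rebuilding the driver. It assumes the Debian/Ubuntu convention that /usr/bin/cc is a symlink managed by update-alternatives; cc_resolves_to_gcc is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the given compiler path (default
# /usr/bin/cc) resolves, through any chain of symlinks, to a binary whose
# name looks like gcc (e.g. gcc, gcc-7, x86_64-linux-gnu-gcc-7).
cc_resolves_to_gcc() {
    target=$(readlink -f "${1:-/usr/bin/cc}") || return 1
    case "$(basename "$target")" in
        gcc*|*-gcc*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage:
#   cc_resolves_to_gcc || echo "cc does not point to gcc"
#   sudo update-alternatives --config cc   # pick the gcc entry interactively
#   cc --version                           # confirm the result
```

After switching back to gcc, the kernel module has to be rebuilt (for example by reinstalling the driver package) before the change has any effect.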

Thank you very much for the help! It finally works!

Re-feedback… some Ubuntu updates broke my NVIDIA/CUDA installation.

I removed all the components:

sudo apt remove --purge 'nvidia*' 'cuda*'
sudo apt autoremove --purge
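
Before reinstalling, it may be worth confirming that nothing was left behind by the purge. A minimal sketch that filters dpkg -l output; leftover_gpu_packages is a made-up helper name, and matching on "nvidia" or "cuda" in the package name is only a heuristic:

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and prints the names
# of NVIDIA- or CUDA-related packages that are still installed ("ii") or
# removed with configuration files remaining ("rc").
leftover_gpu_packages() {
    awk '($1 == "ii" || $1 == "rc") && $2 ~ /nvidia|cuda/ { print $2 }'
}

# Usage:
#   dpkg -l | leftover_gpu_packages
```

An empty result suggests the package side is clean; files installed outside of apt (for example by the .run installer) would not show up here.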

Then I followed the official install steps here (for “deb (local)”): CUDA Toolkit 11.7 Update 1 Downloads | NVIDIA Developer

 wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
 sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
 wget https://developer.download.nvidia.com/compute/cuda/11.2.1/local_installers/cuda-repo-ubuntu1804-11-2-local_11.2.1-460.32.03-1_amd64.deb
 sudo dpkg -i cuda-repo-ubuntu1804-11-2-local_11.2.1-460.32.03-1_amd64.deb
 sudo apt-key add /var/cuda-repo-ubuntu1804-11-2-local/7fa2af80.pub
 sudo apt-get update
 sudo apt-get -y install cuda

At first the install of cuda failed with a message along the lines of “cuda depending on cuda” (sorry, it’s vague…). It was a conflict between cuda packages. I fixed it with:

sudo apt autoclean

Hopefully this way of installing will be more stable across future Ubuntu updates…

Hello,

Looks like this is one heck of a problem. Like most of the posters above, I have:

  • disabled Wayland,
  • blacklisted Nouveau,
  • disabled Secure Boot,
  • updated the kernel to the latest stable,
  • installed the driver using Software & Updates,

and I still have the same problem.
I am using an MSI GF65 laptop running Ubuntu 18.04.
Here is the bug report.

nvidia-bug-report.log.gz (1.1 MB)
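
The checklist above (Wayland, Nouveau, Secure Boot) can be verified from a terminal rather than assumed. A rough sketch; both helper names are made up for illustration, and they only interpret the output of mokutil --sb-state and lsmod fed to them on stdin:

```shell
# Hypothetical helper: succeeds if `mokutil --sb-state` output on stdin
# reports that Secure Boot is disabled.
secure_boot_disabled() { grep -qi 'SecureBoot disabled'; }

# Hypothetical helper: succeeds if `lsmod` output on stdin does NOT list
# the nouveau module.
nouveau_absent() { ! grep -q '^nouveau '; }

# Usage:
#   mokutil --sb-state | secure_boot_disabled && echo "Secure Boot is off"
#   lsmod | nouveau_absent && echo "nouveau is not loaded"
#   echo "$XDG_SESSION_TYPE"   # "x11" rather than "wayland" if Wayland is off
```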

See:
https://forums.developer.nvidia.com/t/rtx-3060-mobile-linux-driver/169531/7?u=generix

Same issue on a new Dell laptop with Ubuntu 18.04 installed by Dell. The NVIDIA driver worked for a few weeks, then stopped. I suspect a recent Ubuntu update messed things up.

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

  • No /etc/modprobe.d/blacklist-nvidia.conf

  • Tried removing the existing NVIDIA driver, rebooting, reinstalling the driver, and rebooting again

  • Tried drivers 440, 450 and 460. Nothing worked.

nvidia-bug-report.log.gz (163.6 KB)
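
When a driver stops working after a system update, one thing worth ruling out is that a new kernel was installed and the NVIDIA module was never built for it. A minimal sketch, assuming the driver was installed as a DKMS package (with Ubuntu's prebuilt linux-modules-nvidia packages, dkms status may show nothing); nvidia_dkms_installed_for is a hypothetical helper name:

```shell
# Hypothetical helper: reads `dkms status` output on stdin and succeeds if an
# nvidia module is reported as installed for the given kernel release.
nvidia_dkms_installed_for() {
    kernel="$1"
    grep -i '^nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}

# Usage:
#   dkms status | nvidia_dkms_installed_for "$(uname -r)" \
#       || echo "no nvidia module built for $(uname -r)"
```

If the check fails, installing the headers for the running kernel and reinstalling the driver package is the usual next step, though the bug report remains the better source for the exact cause.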