lab2022
February 28, 2022, 3:14pm
1
Hi,
I recently updated my NVIDIA driver (version 510) on my Ubuntu 18.04 laptop. Since then, running nvidia-smi gives this error:
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
As suggested on forums, I have tried the following, without success:
checking that Secure Boot is disabled
reinstalling other driver versions
removing some blacklist files from /etc/modprobe.d
updating the initramfs
running prime-select nvidia
rebooting after each of the steps above
Can anyone help me fix this error, please?
nvidia-bug-report.log.gz (243.6 KB)
generix
February 28, 2022, 3:25pm
2
Please try to load it manually
sudo modprobe nvidia
and post the errors that are given.
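If modprobe only prints a short error, the kernel log usually carries more detail. A minimal sketch for pulling the relevant lines out of it; the helper name nvidia_kernel_messages is made up for illustration, and the pattern assumes the driver logs under "nvidia" or "NVRM":

```shell
# Hypothetical helper: keep only kernel log lines that mention the NVIDIA driver.
# Reads log text on stdin; matching is case-insensitive.
nvidia_kernel_messages() {
    grep -iE 'nvidia|nvrm'
}

# Example use right after a failed "sudo modprobe nvidia":
#   sudo dmesg | nvidia_kernel_messages | tail -n 20
```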
lab2022
February 28, 2022, 3:29pm
3
Thanks for your quick response! Here is the output:
modprobe: ERROR: could not insert 'nvidia': Invalid argument
generix
February 28, 2022, 3:55pm
4
You switched your system compiler to gcc 8.4, but it should be gcc 7.5. Please switch back to the correct gcc version, then purge and reinstall the driver.
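One way to check for this kind of problem yourself, assuming it comes from a mismatch between the compiler that built the running kernel and the one currently selected, is to compare the version recorded in /proc/version with the current gcc. A rough sketch; the function name is hypothetical, and the parsing assumes the older "gcc version X.Y.Z" wording, which newer kernels phrase differently:

```shell
# Hypothetical helper: extract the compiler version from a /proc/version-style string.
# Assumes the text contains "gcc version X.Y.Z"; prints nothing if it does not.
kernel_gcc_version() {
    printf '%s\n' "$1" | sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# Example use on the affected machine:
#   kernel_gcc_version "$(cat /proc/version)"   # compiler the kernel was built with
#   gcc --version | head -n 1                   # compiler currently selected
```

If the two versions differ, switching gcc back before rebuilding the module is the safer route.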
lab2022
February 28, 2022, 4:21pm
5
You are right. I have now switched gcc back to the older version:
$ gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Then I ran the following steps:
sudo apt-get purge 'nvidia-*'
sudo apt-get update
sudo apt-get autoremove
sudo update-initramfs -u
sudo reboot
sudo apt install nvidia-driver-510
sudo reboot
Unfortunately, I still get the same error with nvidia-smi. Here is the new bug report:
nvidia-bug-report.log.gz (245.4 KB)
generix
February 28, 2022, 4:35pm
6
According to the logs, the module loaded once after being compiled:
Feb 28 16:52:21 laptop-monnier kernel: [ 589.888979] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Feb 28 16:52:21 laptop-monnier kernel: [ 589.888980] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
Feb 28 16:52:21 laptop-monnier kernel: [ 589.890758] [drm] [nvidia-drm] [GPU ID 0x00000100] Unloading driver
Do you still get the same error when loading it manually?
lab2022
February 28, 2022, 4:40pm
7
Excuse me, by "loading it manually," do you mean installing the driver with a .run file?
generix
February 28, 2022, 4:40pm
8
Also, please post the output of:
grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*
generix
February 28, 2022, 4:41pm
9
By loading manually, I mean:
sudo modprobe nvidia
lab2022
February 28, 2022, 4:42pm
10
Thank you. I still get the same output:
$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': Invalid argument
$ grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*
/etc/modprobe.d/nsight-privilege.conf:options nvidia "NVreg_RestrictProfilingToAdminUsers=0
/lib/modprobe.d/nvidia-kms.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1
generix
February 28, 2022, 8:20pm
11
Please blacklist nvidiafb
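For reference, blacklisting a module usually means adding a "blacklist <module>" line to a .conf file under /etc/modprobe.d. A minimal sketch; the function name and the file name blacklist-nvidiafb.conf are hypothetical choices, not something prescribed in this thread:

```shell
# Hypothetical helper: write a "blacklist <module>" entry into <dir>/blacklist-<module>.conf.
# <dir> defaults to /etc/modprobe.d, which requires root to write to.
write_blacklist() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}

# Example use (as root), followed by an initramfs rebuild and a reboot:
#   write_blacklist nvidiafb
#   update-initramfs -u
```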
generix
February 28, 2022, 8:23pm
12
Furthermore, the file /etc/modprobe.d/nsight-privilege.conf seems to be corrupt; please remove it.
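In the grep output above, the options line from that file opens a double quote and never closes it. Assuming that is the corruption meant here, a quick way to spot such lines is to look for an odd number of double quotes. A small sketch with a hypothetical function name:

```shell
# Hypothetical helper: print every line of the given files that contains
# an odd number of double-quote characters, prefixed with the file name.
find_unbalanced_quotes() {
    awk '{ n = gsub(/"/, "\"") } n % 2 == 1 { print FILENAME ": " $0 }' "$@"
}

# Example use:
#   find_unbalanced_quotes /etc/modprobe.d/*.conf
```

This only flags suspicious lines; whether to repair or delete the file is still a manual decision.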
lab2022
February 28, 2022, 9:21pm
13
nvidia-smi is working now!
So the error came from:
an incorrect version of gcc
a corrupted file and a missing blacklist entry in the /etc/modprobe.d folder
Thank you so much for your great help and advice!