NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running

I would like to know how to solve this problem. Could anybody help me?

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post. You will have to rename the file ending to something else since the forum software doesn’t accept .gz files (nifty!).

nvidia-bug-report.log (75.4 KB)

thx!

You have changed the system compiler to gcc 4.8, but gcc 7.5 is needed. Please run

sudo update-alternatives --set gcc "/usr/bin/gcc-7.5"
sudo update-alternatives --set g++ "/usr/bin/g++-7.5"
sudo update-alternatives --set cc /usr/bin/gcc
sudo update-alternatives --set c++ /usr/bin/g++
sudo dkms install nvidia/440.82 --all

After rebooting, please post the output of
dkms status
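
In case it helps when reading the result: as far as I know, dkms reports each module as added, built, or installed, and only installed means the module is actually available to the kernel. Below is a small sketch for pulling the state out of a status line. dkms_state is a made-up helper name, the second example line only illustrates the general format, and the sketch assumes the state is the first word after the last colon.

```shell
#!/bin/sh
# dkms_state LINE
# Prints the state word (e.g. added, built, installed) from one line of
# `dkms status` output. Hypothetical helper, not part of dkms; it assumes
# the state is the first word after the last colon on the line.
dkms_state() {
    printf '%s\n' "$1" | sed -e 's/.*: *//' -e 's/ .*//'
}

# Example status lines typed in by hand (not read from a live system).
dkms_state "nvidia, 440.82: added"
dkms_state "nvidia, 440.82, KERNELVER, x86_64: installed"
```

If the state is anything other than installed, the module was registered with dkms but not built and installed for the running kernel.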

hi,

the status is

nvidia, 440.82: added

Please create a new nvidia-bug-report.log and post the output of
cc --version

nvidia-bug-report.log (82.0 KB)

The output of cc --version:
cc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The driver is compiled but seems to be blacklisted. Please run
sudo rm /lib/modprobe.d/blacklist-nvidia.conf /etc/modprobe.d/blacklist-nvidia.conf
sudo update-initramfs -u
and reboot, then post the output of
lsmod |grep nvidia
and
grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*

After running sudo rm /lib/modprobe.d/blacklist-nvidia.conf /etc/modprobe.d/blacklist-nvidia.conf, the console outputs
rm: cannot remove ‘/etc/modprobe.d/blacklist-nvidia.conf’: No such file or directory

I think there is nothing to remove, and I had already tried sudo update-initramfs -u before, so I didn't reboot.
The output of lsmod |grep nvidia:
i2c_nvidia_gpu 16384 0

Output of grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*:
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/virtualgl.conf:options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=1001 NVreg_DeviceFileMode=0660
/lib/modprobe.d/nvidia-kms.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1

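If there had been nothing to remove, two errors would have been displayed, one per file. Only one appeared, so /lib/modprobe.d/blacklist-nvidia.conf did exist and has now been removed. Please run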
sudo update-initramfs -u
Also, please add the kernel parameter
nogpumanager
and reboot.
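
If you are unsure how to add a kernel parameter: on an Ubuntu system that boots with GRUB, the usual way is to append it to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub and then run sudo update-grub. Here is a minimal sketch of that edit, assuming that file layout and GNU sed. add_kernel_param is just an illustrative helper name, and it is safest to try it on a copy of the file first.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..." line in FILE unless
# it is already there. Hypothetical helper; assumes the line exists, uses
# double quotes, and contains no '|' or '&' characters.
add_kernel_param() {
    file=$1
    param=$2
    # Current contents between the quotes, e.g. "quiet splash".
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    # Skip the edit if the parameter is already present as a whole word.
    case " $current " in
        *" $param "*) return 0 ;;
    esac
    # Avoid a leading space when the line was empty.
    new="${current:+$current }$param"
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" "$file"
}
```

On the real system this would be run as root against /etc/default/grub (after making a backup), followed by sudo update-grub and a reboot. Afterwards, cat /proc/cmdline shows whether the parameter took effect.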

Sorry, my mistake.

It seems to work. But when I ran nvidia-smi, the output was
Failed to initialize NVML: Insufficient Permissions
When I ran sudo nvidia-smi, it succeeded.

Now the output of lsmod |grep nvidia is
nvidia_uvm 942080 0
nvidia_drm 49152 4
nvidia_modeset 1114112 2 nvidia_drm
nvidia 20463616 163 nvidia_uvm,nvidia_modeset
drm_kms_helper 180224 2 nvidia_drm,i915
drm 491520 8 drm_kms_helper,nvidia_drm,i915
ipmi_msghandler 102400 2 ipmi_devintf,nvidia
i2c_nvidia_gpu 16384 0

grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*:

/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/virtualgl.conf:options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=1001 NVreg_DeviceFileMode=0660
/lib/modprobe.d/nvidia-kms.conf:# This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1

Thank you for your help. I appreciate it.

You have to run nvidia-smi as root once because that loads the nvidia-uvm kernel module needed for CUDA, and only root can load it. To get around this, please install the package
nvidia-modprobe
which lets normal users load nvidia-uvm.

Hi, I have installed nvidia-modprobe, but I still get this output when running nvidia-smi: Failed to initialize NVML: Insufficient Permissions.

Odd. Does
nvidia-modprobe -u && nvidia-smi
work when run as an unprivileged user?

I have tried it, but I got the same result: Failed to initialize NVML: Insufficient Permissions.

Please check which group is set at the nvidia /dev nodes:
ls -l /dev/nvid*
and make sure your user is in the same group.

crw-rw---- 1 root vglusers 195, 0 Jun 5 15:34 /dev/nvidia0
crw-rw---- 1 root vglusers 195, 1 Jun 5 15:34 /dev/nvidia1
crw-rw---- 1 root vglusers 195, 255 Jun 5 15:34 /dev/nvidiactl
crw-rw---- 1 root vglusers 195, 254 Jun 5 15:34 /dev/nvidia-modeset
crw-rw-rw- 1 root root 239, 0 Jun 5 15:34 /dev/nvidia-uvm
crw-rw-rw- 1 root root 239, 1 Jun 5 15:34 /dev/nvidia-uvm-tools

The /dev/nvidia-uvm* nodes have their group set to root. Is that the reason?

No, the nvidia-uvm nodes are readable and writable by all users, but the other nodes are not. Your user needs to be in the vglusers group to have access to the NVIDIA GPU.
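
To check this, compare the group on the device nodes with the groups your user is in. Here is a rough sketch, assuming GNU stat and that /dev/nvidiactl is a representative node. user_in_group is just an illustrative helper name.

```shell
#!/bin/sh
# user_in_group GROUP [GROUP_LIST]
# Succeeds if GROUP appears in GROUP_LIST, a space-separated list of group
# names. Without GROUP_LIST it uses the current user's groups from `id -nG`.
# Hypothetical helper, not part of any package.
user_in_group() {
    wanted=$1
    if [ "$#" -ge 2 ]; then list=$2; else list=$(id -nG); fi
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Only meaningful on a machine that has the NVIDIA device nodes.
if [ -e /dev/nvidiactl ]; then
    dev_group=$(stat -c %G /dev/nvidiactl)
    if user_in_group "$dev_group"; then
        echo "current user is in $dev_group"
    else
        echo "current user is NOT in $dev_group"
    fi
fi
```

Keep in mind that a newly added group only shows up in id -nG after logging out and back in.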

Is the command “usermod -a -G vglusers username” right? After running it, I rebooted, and when I tried to log in, the computer rejected me and the monitor disconnected automatically. By the way, I connect to the computer via remote desktop.

Could I change vglusers to root?