Nvidia driver does not load on Ubuntu 18.04 with Geforce RTX 3060

Hi,

I am having problems with the NVIDIA driver on Ubuntu 18.04, with a GeForce RTX 3060 GPU and an Intel Core i7 CPU, on a dual-boot machine (Windows 10 and Ubuntu). I have installed the driver using every method I could find online, but with all of them I get the following after running nvidia-smi:

“NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running”

Right now I am trying with the driver 460, but I have also tried with 470.

I don’t have the file: “/lib/modprobe.d/blacklist-nvidia.conf”

If I run sudo prime-select nvidia, I get: “Info: the nvidia profile is already set”

I attach my bug report. If anyone knows what is going on, I would really appreciate the help.
nvidia-bug-report.log.gz (1.0 MB)

Thanks.

The reason you can’t install the latest driver, or any other one, is mentioned early in the log: you already have an nvidia driver loaded, and it can’t be removed. One possible reason is that it’s in use; perhaps you have X running? The loaded driver is the NVIDIA UNIX x86_64 Kernel Module 460.91.03 Fri Jul 2 06:04:10 UTC 2021.
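
The version string quoted above is what a loaded nvidia kernel module reports in /proc/driver/nvidia/version. A minimal shell sketch for extracting just the version number follows; the helper name nvidia_module_version is made up for illustration, and it reads text on stdin so it can also be run against a saved log.

```shell
# Hypothetical helper: print the first dotted version number found on stdin.
nvidia_module_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# On a system where the nvidia module is loaded:
#   nvidia_module_version < /proc/driver/nvidia/version
```

If the printed version differs from the driver package you just installed, an older module is probably still loaded.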

Thank you for the fast answer, ghphille.

I keep having the same issue. I have tried the following:

Trying to stop the display manager (in case it was using the nvidia drivers):
#systemctl isolate multi-user.target
#modprobe -r nvidia-drm

I also tried to kill the display manager through this:
#sudo service gdm stop
#sudo init 3

I also purged all the nvidia drivers:
#sudo apt-get remove --purge '^nvidia-.*'

Finally, I installed driver 460 through the GUI (Software & Updates / Additional Drivers) and rebooted, but the problem persists.

I installed Linux two days ago and I haven’t installed CUDA or anything else yet.

I don’t know what else to do. Can anyone help me fix this, please?

I attach the latest bug report.
nvidia-bug-report.log.gz (1.5 MB)

Thanks.

There are probably three nvidia modules to be removed, but nvidia-installer should take care of that.

Won’t the “init 3” restart X?

Anyway, before running nvidia-installer, check using

lsmod | grep nvid

If you are able to remove all nvidia modules, the installer will be able to as well.
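
A small sketch of that check, assuming the usual module names nvidia_drm, nvidia_modeset, nvidia_uvm and nvidia (your system may have a different set). The helper list_nvidia_modules is a hypothetical name; it filters lsmod-style text from stdin, so it can be tried on saved output as well.

```shell
# Hypothetical helper: print the names of loaded modules starting with "nvidia".
# Expects lsmod-style text on stdin (the module name is the first column).
list_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }'
}

# On a live system, with the display manager stopped:
#   lsmod | list_nvidia_modules
#   sudo modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia
#   lsmod | list_nvidia_modules    # should now print nothing
```

Note that the filter also matches nvidiafb, so an unexpected framebuffer module would show up in the same list.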

I have removed all nvidia modules and reinstalled the driver, and it still does not work. I have also tried purging everything related to nvidia and reinstalling afterwards, to no avail.

The command sudo gpu-manager outputs the following:

last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can’t access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/5.4.0-80-generic/updates/dkms
Found nvidia module: nvidia-uvm.ko
Looking for amdgpu modules in /lib/modules/5.4.0-80-generic/updates/dkms
Is nvidia loaded? yes
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is nvidia kernel module available? yes
Is amdgpu kernel module available? no
Vendor/Device Id: 8086:9bc4
BusID “PCI:0@0:2:0”
Is boot vga? yes
Vendor/Device Id: 10de:2520
BusID “PCI:1@0:0:0”
Is boot vga? no
Error: can’t access /sys/bus/pci/devices/0000:01:00.0/driver
The device is not bound to any driver.
Skipping “/dev/dri/card0”, driven by “i915”
Skipping “/dev/dri/card0”, driven by “i915”
Skipping “/dev/dri/card0”, driven by “i915”
Found “/dev/dri/card0”, driven by “i915”
output 0:
card0-eDP-1
Number of connected outputs for /dev/dri/card0: 1
Does it require offloading? yes
last cards number = 2
Has amd? no
Has intel? yes
Has nvidia? yes
How many cards? 2
Has the system changed? No
Intel IGP detected
Intel hybrid system
Creating /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf
Setting power control to “on” in /sys/bus/pci/devices/0000:01:00.0/power/control

Does anyone know how to fix this?

Thanks for your time.
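
The gpu-manager output above reports that 0000:01:00.0 is not bound to any driver. The same thing can be checked directly in sysfs, as in this sketch; bound_driver is a hypothetical helper name, and the device path is the one from the log.

```shell
# Hypothetical helper: print the kernel driver bound to a PCI device,
# given its sysfs directory, or "none" if there is no driver symlink.
bound_driver() {
    if [ -L "$1/driver" ]; then
        basename "$(readlink "$1/driver")"
    else
        echo "none"
    fi
}

# On the system from this thread:
#   bound_driver /sys/bus/pci/devices/0000:01:00.0
#   lspci -k -s 01:00.0    # also lists "Kernel driver in use", if any
```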

[ 3.694617] nvidiafb 0000:01:00.0: enabling device (0000 -> 0003)

The nvidia frame buffer driver is loaded and takes ownership of the device. You need to blacklist “nvidiafb”.

Create a modprobe blacklist file with the content: blacklist nvidiafb

EDIT:

Maybe better, do it the way it is on my Mint 19.3 (based on Ubuntu 18.04):

grep -r nvidiafb /etc/modprobe.d/
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
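
A hedged sketch of how the blacklist entry could be added and verified. The helper is_blacklisted and the file name blacklist-nvidiafb.conf are assumptions for illustration; rebuilding the initramfs is included on the assumption that the module might otherwise be loaded from there.

```shell
# Hypothetical helper: succeed if module $1 has a "blacklist" line
# in any file under directory $2 (default: /etc/modprobe.d).
is_blacklisted() {
    grep -rqsE "^[[:space:]]*blacklist[[:space:]]+$1([[:space:]]|\$)" "${2:-/etc/modprobe.d}"
}

# Possible steps on a live system (the file name is only an example):
#   echo "blacklist nvidiafb" | sudo tee /etc/modprobe.d/blacklist-nvidiafb.conf
#   sudo update-initramfs -u
#   is_blacklisted nvidiafb && echo "nvidiafb is blacklisted"
```

A reboot is needed afterwards so that nvidiafb is not loaded at boot.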

Thanks for your help Mart. Unfortunately, it is still not working; nvidia-smi cannot communicate with the driver.

I blacklisted nvidiafb:

#grep -r nvidiafb /etc/modprobe.d/
/etc/modprobe.d/blacklist.conf:blacklist nvidiafb

I attach a new nvidia bug report:
nvidia-bug-report.log.gz (2.9 MB)

Does someone know how to fix this?

Thanks for your time

[ 1259.563911] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[ 1259.564323] NVRM: request_mem_region failed for 0M @ 0x0. This can
NVRM: occur when a driver such as rivatv is loaded and claims
NVRM: ownership of the device’s registers.
[ 1259.564325] nvidia: probe of 0000:01:00.0 failed with error -1

Your log is full of these messages.
Something is claiming the card, so the nvidia driver cannot load.
The dmesg buffer fills up, so I can’t even see what is loading before the nvidia driver, and I will not look through 600000 lines of log to find that.
I’d maybe just do a sudo apt purge nvidia* libnvidia*, run the .run file installer again with the --uninstall option, and then reinstall the driver through the Ubuntu driver installation. Never use the .run file installer unless you know what you are doing. Use the distro-packaged driver!
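
To avoid reading the whole log by hand, the kernel messages can be filtered down to lines that mention the card but are not the repeated probe failures. This is only a sketch; device_claims is a hypothetical helper, and whether journalctl still holds the early boot messages depends on the system's journal settings.

```shell
# Hypothetical helper: from kernel log text on stdin, print lines mentioning
# PCI address $1, skipping the repeated "nvidia: probe of ..." failures.
device_claims() {
    grep -F "$1" | grep -v 'nvidia: probe of'
}

# On a live system:
#   sudo dmesg | device_claims 0000:01:00.0 | head
#   journalctl -k -b | device_claims 0000:01:00.0 | head
```

Any driver name other than nvidia in the remaining lines is a candidate for whatever is claiming the card.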