510.54-RTX3090-Ubuntu 20.04 Unable to load the ‘nvidia-drm’ kernel module

I have been trying for months to install a driver for my RTX 3090 card on Ubuntu 20.04 but have run into a confusing problem. The following error appears during installation, although I can still open nvidia-settings successfully.

Unable to load the ‘nvidia-drm’ kernel module modprobe: ERROR: ../libkmod/libkmod-module.c:838 kmod_module_insert_module() could not find module by name='off
modprobe: ERROR: could not insert 'off': Unknown symbol in module, or unknown parameter (see dmesg)'

However, I still cannot use nvidia-smi as expected:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

I have tried everything, including modprobe, nvidia-prime, acpi, pci=noacpi, … and nothing works. dkms seems to be working correctly.
When I install through a PPA or apt, the lightdm service fails.
I really hope someone can help me; I have tried everything I know.
The install log is attached.
nvidia-installer.log (94.3 KB)

The message points to a blacklist file being installed.
Please run

sudo prime-select nvidia
then run
grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*

to find a file containing

blacklist nvidia

and remove it,
then run

sudo update-initramfs -u

and reboot. If the problem persists, please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post.
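
The check above could be narrowed with a small helper; this is only a sketch. It assumes that the 'off' in the modprobe error comes from an alias line that maps an nvidia module to off, so it looks for those as well as for uncommented blacklist lines. The function name find_nvidia_blocks is made up for illustration, and nvidiafb (a separate framebuffer driver) is deliberately not matched.

```shell
# Print active (uncommented) modprobe config lines that blacklist an
# nvidia kernel module or alias one to "off".
# Arguments: the config files to scan.
find_nvidia_blocks() {
    # -H always prints the file name, -E enables extended regular expressions.
    # The pattern matches e.g. "blacklist nvidia", "blacklist nvidia-drm" and
    # "alias nvidia-drm off", but not "blacklist nvidiafb" or comment lines.
    grep -HE '^[[:space:]]*(blacklist[[:space:]]+nvidia([-_][a-z]+)?|alias[[:space:]]+nvidia([-_][a-z]+)?[[:space:]]+off)[[:space:]]*$' "$@" 2>/dev/null
}

# Typical call, using the same directories as the grep command above:
# find_nvidia_blocks /etc/modprobe.d/* /lib/modprobe.d/*
```

If this prints nothing, no active blacklist or alias line of that shape was found in the files given.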

Thank you very much for your reply! I just ran sudo prime-select nvidia in tty1 and found that nvidia-smi worked correctly!
However, after a reboot lightdm was dead, so I had to run sudo prime-select intel and reboot to make lightdm work again.
So now, every time I want to use the NVIDIA card, I have to select the Intel card first, then go to tty1 and run sudo prime-select nvidia to make nvidia-smi work.
In addition, I did not find a file containing blacklist nvidia, but I did find blacklist nvidiafb in blacklist-framebuffer.conf. I did not remove it. Should I remove blacklist nvidiafb from this file?
I ran

grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*

and got

/etc/modprobe.d/blacklist.conf:#blacklist nvidiafb
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/lib/modprobe.d/nvidia-kms.conf:#This file was generated by nvidia-prime
/lib/modprobe.d/nvidia-kms.conf:options nvidia-drm modeset=1

Finally, attached is the nvidia-bug-report.
nvidia-bug-report.log.gz (330.7 KB)

Your kernel is too old for your Intel iGPU; please switch to the HWE kernel:
https://wiki.ubuntu.com/Kernel/LTSEnablementStack
sudo apt install --install-recommends linux-generic-hwe-20.04
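
After rebooting, the running kernel version can be compared against whatever minimum the hardware needs. The sketch below is an illustration only: the helper name version_at_least is made up, the 5.13 in the usage comment is a placeholder rather than a confirmed requirement, and it relies on sort -V from GNU coreutils.

```shell
# Succeeds (exit status 0) if version $1 is at least version $2.
version_at_least() {
    # sort -V orders version strings; if the minimum ($2) comes out first,
    # then $1 is equal to or newer than $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example with a placeholder threshold:
# version_at_least "$(uname -r)" 5.13 && echo "kernel is new enough"
```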

Sorry for the late reply. I have updated my kernel to 5.13.0-30-generic and installed the driver again, but things seem to have gotten worse (•︵•). The installer reported success, and lightdm was active and running after sudo service lightdm restart, but I got a black screen immediately. I tried many things, such as rebooting, running sudo prime-select intel and rebooting (after which the computer goes dead), and reinstalling the driver, but nothing works.
Attached are the nvidia-bug-report and the install log. Looking forward to your reply.
nvidia-bug-report.log.gz (278.1 KB)
nvidia-installer.log (36.1 KB)

You installed the driver from the runfile using pointless options like no-opengl-files.
Please uninstall it and install the driver using the Software & Updates application.

I have tried to install with

sudo ubuntu-drivers autoinstall

and got the 510 driver again. However, if I use sudo prime-select nvidia, I get a black screen (just a small line like - at the top left of the screen, although lightdm seems to be active and running).
If I use sudo prime-select intel, I get my desktop, but nvidia-smi does not work. However, I can open nvidia-settings, which shows only PRIME Profiles, where I can select the GPU to use.
Here is my nvidia-bug-report, thank you very much!
nvidia-bug-report.log.gz (263.7 KB)

Sorry, your kernel is still too old for 12th gen Intel; please upgrade to the latest kernel:
https://launchpad.net/~damentz/+archive/ubuntu/liquorix

I have updated to linux-headers-5.16.0-11.1-liquorix-amd64 with

sudo apt install linux-headers-5.16.0-11.1-liquorix-amd64
sudo apt install linux-headers-liquorix-amd64
sudo apt install linux-image-5.16.0-11.1-liquorix-amd64
sudo apt install linux-image-liquorix-amd64

and tried to install the NVIDIA driver both with sudo ubuntu-drivers autoinstall and from the run file.
However, in the first case, nvidia-smi does not run correctly even after sudo prime-select nvidia and a reboot. In the second case, I can use nvidia-smi only in tty1, with a black screen, and never get my desktop (•︵•), so in the end I ran sudo Nvidiaxxx.run -uninstall in recovery mode.
Here are my nvidia-bug-report and the install log.
nvidia-bug-report.log.gz (99.1 KB)
nvidia-installer.log (36.6 KB)

At least the Intel GPU is running correctly now. The log shows that parts of a 470 driver are still installed, so the 510 driver failed to run properly. Please post the output of
dpkg -l |grep nvidia

Oh yes (^▽^), I did find some lines containing nvidia-470 drivers. Here are the results:

ii  libnvidia-cfg1-510:amd64                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-510                       510.47.03-0ubuntu0.20.04.1            all          Shared files used by the NVIDIA libraries
rc  libnvidia-compute-390:i386                 390.144-0ubuntu0.20.04.1              i386         NVIDIA libcompute package
ii  libnvidia-compute-510:amd64                510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA libcompute package
ii  libnvidia-compute-510:i386                 510.47.03-0ubuntu0.20.04.1            i386         NVIDIA libcompute package
ii  libnvidia-decode-510:amd64                 510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-510:i386                  510.47.03-0ubuntu0.20.04.1            i386         NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-510:amd64                 510.47.03-0ubuntu0.20.04.1            amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-510:i386                  510.47.03-0ubuntu0.20.04.1            i386         NVENC Video Encoding runtime library
ii  libnvidia-extra-510:amd64                  510.47.03-0ubuntu0.20.04.1            amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-510:amd64                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-510:i386                    510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-510:amd64                     510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-510:i386                      510.47.03-0ubuntu0.20.04.1            i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  linux-modules-nvidia-510-5.13.0-30-generic 5.13.0-30.33~20.04.1                  amd64        Linux kernel nvidia modules for version 5.13.0-30
ii  linux-modules-nvidia-510-generic-hwe-20.04 5.13.0-30.33~20.04.1                  amd64        Extra drivers for nvidia-510 for the generic-hwe-20.04 flavour
ii  linux-objects-nvidia-510-5.13.0-30-generic 5.13.0-30.33~20.04.1                  amd64        Linux kernel nvidia modules for version 5.13.0-30 (objects)
ii  linux-signatures-nvidia-5.13.0-30-generic  5.13.0-30.33~20.04.1                  amd64        Linux kernel signatures for nvidia modules for version 5.13.0-30-generic
ii  nvidia-compute-utils-510                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA compute utilities
ii  nvidia-driver-510                          510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-510                   510.47.03-0ubuntu0.20.04.1            amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-510                   510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA kernel source package
ii  nvidia-prime                               0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                            470.57.01-0ubuntu0.20.04.3            amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-510                           510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA driver support binaries
ii  screen-resolution-extra                    0.18build1                            all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-510              510.47.03-0ubuntu0.20.04.1            amd64        NVIDIA binary Xorg driver
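
A long listing like this can be filtered for the entries that stand out. The sketch below is an illustration, not an official tool: the helper name flag_odd_nvidia_packages is made up, and it assumes that a three-digit leading version component is a driver branch (510, 470, 390), so kernel-module packages and nvidia-prime are not compared against the branch. A flagged line is only a hint, not proof of a conflict.

```shell
# Read `dpkg -l` output on stdin and print nvidia packages that are either
# not fully installed (state other than "ii") or carry a driver-style
# version from a branch other than the expected one given as $1.
flag_odd_nvidia_packages() {
    awk -v branch="$1" '
        $2 ~ /nvidia/ {
            split($3, v, "[.]")    # v[1] is the leading version component
            looks_like_driver = (v[1] ~ /^[0-9][0-9][0-9]$/)
            if ($1 != "ii" || (looks_like_driver && v[1] != branch))
                print $1, $2, $3
        }'
}

# Typical call:
# dpkg -l | flag_odd_nvidia_packages 510
```

Run against the listing above, this would single out the rc entry for libnvidia-compute-390:i386 and the 470-versioned nvidia-settings package.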

It’s a bit odd: there’s one 390 package installed, but that shouldn’t really matter. More importantly, this seems to be a -no-dkms driver, so please remove all nvidia packages

sudo apt remove "*nvidia*"

then reinstall the driver using Software&Updates.

It works!!! (。・∀・)ノ゙ There really were some old drivers. I used the following commands to remove them completely. (PS: If I run only the first line, some packages still remain.)

sudo apt remove "*nvidia*"
sudo apt-get purge "*nvidia*"
sudo apt-get purge libnvidia-compute-390
sudo dpkg --purge libnvidia-compute-390:i386
sudo apt remove libnvidia-fbc1-510:i386
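
The leftovers after a plain remove are typically packages in the rc state, meaning removed but with configuration files still present, which is why a purge is needed. A small sketch for listing them follows; the helper name residual_nvidia_packages is made up for illustration.

```shell
# Read `dpkg -l` output on stdin and print the names of nvidia packages
# that are in the "rc" state (removed, configuration files remaining).
residual_nvidia_packages() {
    awk '$1 == "rc" && $2 ~ /nvidia/ { print $2 }'
}

# Typical call, which only lists the names and removes nothing:
# dpkg -l | residual_nvidia_packages
```

Each name printed can then be passed to sudo dpkg --purge, as in the commands above.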

In addition, I noticed a small display problem near the cursor, but I cannot capture it with a screenshot. Is this related to the current driver? Thank you!


This might be either a sync or a display-panel PM issue. Please set the kernel parameter
nvidia-drm.modeset=1
and reboot. If the issue is still present, please additionally set the kernel parameter
i915.enable_psr=0
and reboot.

I have added these 2 parameters to /etc/default/grub

GRUB_CMDLINE_LINUX="nouveau.modeset=0 pci=nommconf nvidia-drm.modeset=1 i915.enable_psr=0"

then ran sudo update-grub and rebooted. However, everything seems just as it was; nothing changed. Did I add these 2 parameters in the wrong place? Thank you!
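
One way to confirm that the parameters reached the running kernel is to look at /proc/cmdline, which holds the command line the kernel was booted with. The helper below is a minimal sketch; the name has_kernel_param is made up for illustration.

```shell
# Succeeds (exit status 0) if kernel parameter $1 appears as a whole,
# space-separated word in the command line string $2.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical calls after a reboot:
# has_kernel_param nvidia-drm.modeset=1 "$(cat /proc/cmdline)" && echo "modeset is set"
# has_kernel_param i915.enable_psr=0 "$(cat /proc/cmdline)" && echo "psr is disabled"
```

If a parameter is missing there, the grub change did not take effect for the current boot.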

They should be correctly inserted, but why have you set pci=nommconf? Setting that is not really recommended.

I used to see a lot of these lines when booting with the old kernel:

 AER PCIe BUS Error: severity=corrected, type=Physical layer

Someone said this setting would help, so I added pci=nommconf to grub.
But now I see no difference whether I add it or not, and the same goes for this issue.
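
To check whether those messages still occur with a given setting, the kernel log can be counted rather than eyeballed. This is only a sketch: the helper name count_corrected_aer is made up, and the match string assumes the usual "PCIe Bus Error: severity=Corrected" wording, compared case-insensitively.

```shell
# Read a kernel log on stdin and print how many lines report a corrected
# PCIe bus error. Always exits 0, even when the count is 0.
count_corrected_aer() {
    grep -ci 'PCIe Bus Error: severity=Corrected' || true
}

# Typical call (reading the kernel log may require root):
# sudo dmesg | count_corrected_aer
```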

OK, that’s common (bad) advice; just set
pci=noaer
nommconf also disables AER, but it changes other things as well.
Please create and attach a new nvidia-bug-report.log

Thank you for your kind advice! I changed the grub setting to pci=noaer and got the bug report.
nvidia-bug-report.log.gz (305.5 KB)

There’s some firmware missing; I don’t know whether the 21.10 package could be used to fix that:
https://packages.ubuntu.com/de/impish-updates/linux-firmware
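
To see which firmware files the kernel failed to load, the log can be filtered for load failures. The sketch below assumes the common "Direct firmware load for NAME failed" message format; the helper name missing_firmware is made up for illustration.

```shell
# Read a kernel log on stdin and print the unique names of firmware files
# that the kernel reported it could not load.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed.*/\1/p' | sort -u
}

# Typical call (reading the kernel log may require root):
# sudo dmesg | missing_firmware
```

Each name printed can be compared with the files under /lib/firmware to see whether the installed linux-firmware package provides it.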