Fedora 35: NVIDIA kernel module missing. Falling back to nouveau

Hi. I’ve read some forums on the topic and checked some common problems, but it was not enough.

It isn’t Secure Boot.

[crocodile@localhost ~]$ /sbin/lspci | grep -e VGA
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750] (rev a2)
[crocodile@localhost ~]$ modinfo -F version nvidia
470.103.01
[crocodile@localhost ~]$ sudo akmods --force
Checking kmods exist for 5.16.19-200.fc35.x86_64           [  OK  ]
[crocodile@localhost ~]$ nvidia-settings

ERROR: Unable to find display on any available system

The only suspicious lines I was able to find (Ctrl-F “fail”) in the nvidia-bug-report.log are

[    3.314562] nvidia: loading out-of-tree module taints kernel.
[    3.314571] nvidia: module license 'NVIDIA' taints kernel.
[    3.314572] Disabling lock debugging due to kernel taint
[    3.322443] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    3.331462] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000000c0d52f8, val ffffffffc12adf9a

But as far as I understand, this is kind of OK? A third-party kernel module taints the kernel as it should; it is not an error. Or is it?
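A quick way to separate the expected taint notices from the messages worth a closer look might be a small filter like the one below. This is only a sketch: the function name `classify_nvidia_lines` is made up for illustration, and it assumes the kernel log text is piped in on stdin.

```shell
# Sketch: label nvidia/module kernel messages as taint notices or "check".
# classify_nvidia_lines is a hypothetical helper name, not an NVIDIA tool.
classify_nvidia_lines() {
    # Keep only lines that mention the nvidia module or the module loader.
    grep -E 'nvidia|module:' | while IFS= read -r line; do
        case "$line" in
            *taint*) printf 'taint-only: %s\n' "$line" ;;  # expected for out-of-tree modules
            *)       printf 'check: %s\n' "$line" ;;       # anything else deserves a look
        esac
    done
}

# Usage on a live system (assumption: dmesg may need sudo):
#   sudo dmesg | classify_nvidia_lines
```

With the log excerpt above, only the “Skipping invalid relocation target” line would be labelled `check`.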

nvidia-bug-report.log.gz (55.0 KB)

You need to blacklist nouveau in order to load the nvidia driver.

Thank you for answering!

But I am in doubt. I thought that “NVIDIA kernel module missing. Falling back to nouveau” means it tried the NVIDIA driver first, so there is no need to forbid nouveau… And is there a reason why this step is not mentioned in this guide: Howto/NVIDIA - RPM Fusion?

On to practical matters: how should I do this?

I found something about blacklisting nouveau here: Fedora 36/35/34 NVIDIA [510.60.02 / 470.103.01 / 390.147 / 340.108] Drivers Install Guide – If Not True Then False (clause 2.6 Disable nouveau). But this way is kind of extreme; if something goes wrong it would be hard for me to repair the system. Is there a better way?

You are right, nouveau is already blacklisted but the nvidia modules fail to load.

[    3.331462] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000000c0d52f8, val ffffffffc12adf9a

Please try removing/reinstalling kernel headers.
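For anyone wanting to confirm the blacklist part themselves, a minimal sketch follows. The helper name `cmdline_blacklists_nouveau` is hypothetical, and it only recognises the exact parameters `modprobe.blacklist=nouveau` and `rd.driver.blacklist=nouveau` (not comma-separated lists).

```shell
# Sketch: succeed if a kernel command line blacklists nouveau.
# Only exact single-value parameters are matched (an assumption of this sketch).
cmdline_blacklists_nouveau() {
    case " $1 " in
        *" modprobe.blacklist=nouveau "*|*" rd.driver.blacklist=nouveau "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system:
#   cmdline_blacklists_nouveau "$(cat /proc/cmdline)" && echo "nouveau is blacklisted"
#   lsmod | grep -E '^(nouveau|nvidia)'   # shows which module is actually loaded
```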

Two thoughts came into my mind.

  • I have two graphics cards (integrated Intel and NVIDIA GeForce) and two monitors connected to them. Could this be the source of the problem? Perhaps the NVIDIA drivers are unable to work with the Intel graphics card and the OS needs them both… I’ve done some experiments. I don’t think they are of any importance, but the results are below.

  • (the important one) Actually, I don’t need the GeForce as a working graphics card. I can connect both monitors to the Intel integrated card; they work OK this way. I need the GeForce only for CUDA. Maybe there is a way to do that without meddling with nouveau and the Linux graphics system? It would solve all my problems.

Experiments
I reinstalled the kernel-headers package and disabled Wayland (I read on the Internet that it works badly with the nvidia drivers), but with no success.

I am ashamed to say that in my initial configuration (previous posts) both monitors were connected to the integrated graphics card. It might explain something… I’d completely forgotten about it.

I’ve played with it for some time: left only one monitor, connected it to the GeForce, changed some BIOS settings (Initiate Graphic Adapter), and reinstalled the nvidia drivers two or three times in the process. But the only results I got were slightly more interesting nvidia-bug-report logs.

I have a handful of them now, with different errors and failures. There are lines like these:

Apr 18 08:47:30 localhost.localdomain /usr/libexec/gdm-x-session[2571]: (WW) xf86CloseConsole: KDSETMODE failed: Input/output error
Apr 18 08:47:30 localhost.localdomain /usr/libexec/gdm-x-session[2571]: (WW) xf86CloseConsole: VT_GETMODE failed: Input/output error

and

Apr 18 08:48:55 localhost.localdomain /usr/libexec/gdm-x-session[2576]: Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.16.19-200.fc35.x86_64 root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
Apr 18 08:48:59 localhost.localdomain nvidia-settings-470xx-user.desktop[3188]: ERROR: NVIDIA driver is not loaded
Apr 18 08:48:59 localhost.localdomain nvidia-settings-470xx-user.desktop[3188]: ERROR: Unable to load info from any available system

and

Apr 18 08:45:40 localhost.localdomain /usr/libexec/gdm-x-session[2571]: Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.16.19-200.fc35.x86_64 root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
Apr 18 08:45:43 localhost.localdomain nvidia-settings-470xx-user.desktop[3160]: ERROR: NVIDIA driver is not loaded
Apr 18 08:45:43 localhost.localdomain nvidia-settings-470xx-user.desktop[3160]: ERROR: Unable to load info from any available system

and

Apr 18 08:48:38 localhost.localdomain /usr/libexec/gdm-x-session[1557]: xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
Apr 18 08:48:38 localhost.localdomain /usr/libexec/gdm-x-session[1557]: (II) modeset(0): using drv /dev/dri/card1
Apr 18 08:48:38 localhost.localdomain /usr/libexec/gdm-x-session[1557]: (WW) Falling back to old probe method for fbdev

and I have no idea what these are all about. I can upload the last one if you think it could help.

You need to disable secure boot verification in your bios/EFI.

With f36+ kmodtool/akmods, it will be possible to automatically sign the rebuilt kmod, so there will be no need to disable secure boot.
You can fetch kmodtool and akmods from f36 by using:
dnf update --releasever=36 kmodtool akmods

You will still have to import your self-generated key into your BIOS…
See also /usr/share/doc/akmods/README.secureboot

@kwizart secure boot is disabled and if that was the problem, then the nvidia driver wouldn’t load at all instead of bailing out with “invalid relocation target”

There are some lines about secure boot in the nvidia-bug-report.log. It is properly disabled.
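Besides reading the bug report, the Secure Boot state can usually be queried directly with `mokutil --sb-state`. The sketch below wraps that in a small check; the helper name `secure_boot_enabled` is made up, and it assumes the tool prints a line starting with “SecureBoot enabled” or “SecureBoot disabled”.

```shell
# Sketch: succeed if the given `mokutil --sb-state` output reports Secure Boot as enabled.
# secure_boot_enabled is a hypothetical helper; the text is passed in as $1.
secure_boot_enabled() {
    printf '%s\n' "$1" | grep -qi '^SecureBoot enabled'
}

# Usage on a live system (assumption: mokutil is installed):
#   if secure_boot_enabled "$(mokutil --sb-state)"; then
#       echo "Secure Boot is on: unsigned kmods will be rejected"
#   else
#       echo "Secure Boot is off"
#   fi
```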

I thought about f36, but it is in beta now and I am not very good at solving system problems… I’ll certainly install it as soon as it is released. It might help, agreed; I keep some hope for it :)

right, but as generix said, it’s not secure boot related.

Can you reproduce this using another (older) kernel? You could also try kernel-longterm-5.15.
See also kwizart/kernel-longterm-5.15 Copr

Question: does reinstalling the headers package on fedora trigger a rebuild of the nvidia modules?

I don’t know. I’ve regularly executed “sudo dnf remove \*nvidia\*” as a robust cleaning method. If I were more skillful with Linux, I’d probably use a subtler approach and could answer your question.
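A gentler alternative to removing every nvidia package might be to check whether the installed module was built for the running kernel and to force a rebuild only if it was not. The sketch below is an assumption-laden illustration: `module_matches_kernel` is a hypothetical helper that compares the first word of the module’s vermagic string with the kernel release.

```shell
# Sketch: succeed if a module's vermagic string starts with the given kernel release.
# $1: vermagic string (e.g. from modinfo), $2: kernel release (e.g. from uname -r)
module_matches_kernel() {
    [ "${1%% *}" = "$2" ]   # "${1%% *}" keeps only the first word of the vermagic
}

# Usage on a live system (assumption: the nvidia akmod is installed):
#   kver=$(uname -r)
#   vm=$(modinfo -k "$kver" -F vermagic nvidia)
#   module_matches_kernel "$vm" "$kver" || sudo akmods --force --kernels "$kver"
```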

Thanks to all of you. I really appreciate your willingness to help me.
As far as I can see, the F36 release is promised in a week or two, so I’ll wait and try again on the new platform.

Fedora 36 is due on the 3rd of May at the earliest, so there is still time to report the issue rather than expecting it to be fixed by someone else (unfortunately, that is not always the case).

Fedora 36 may have a 5.17.4 kernel, but reproducing on another kernel would still be useful.

I thought the situation was too messy and specific to turn into something reproducible and reportable. And my hardware is quite outdated: the GeForce 750 is 10 years old.