Nvidia Driver 510.68.02 fails boot with kernel 5.17.13

This is on Fedora 36 with an XFCE4 desktop. All worked with kernel 5.17.12.

@FredK
Please share nvidia bug report from repro state for analysis.

I’m not clear on what “Please share nvidia bug report from repro state for analysis.” is requesting, if anything. Perhaps you could clarify? Thanks,
Fred

Dear FredK,

What does it mean in detail that your system fails to boot?
Is it just that the X server does not come up, leaving you stuck on the command line?
Which OS are you using, etc.?
Some more information is required to help.

It is not a general issue, because I am currently using 510.68.02 with kernel 5.17.13 on Fedora 36.

It prints a lot of output on console 1, which ends with

NVIDIA kernel module missing. Falling back to nouveau

And from there I can get to console 2 and reboot. No input other than F2 works, except with F2 replaced by F3, F4, etc.
I’ve tried startx from console 2 and it does not work. If I boot from kernel 5.17.12 instead of 5.17.13 all works fine.

I have files containing the output from running dmesg, from the end of /var/log/messages, and from .xsession-errors. Sizes are as follows:

89663 Jun 15 10:20 tmp.dmesg
4473 Jun 15 10:27 tmp.var.log.messages
19882 Jun 15 10:33 tmp.xsession-errors

Let me know if you want these files.

My original post indicated that I’m running Fedora 36, on an x86_64 system with xfce4 as the desktop.

If this is not enough, let me know what else is needed.
Thanks,
Fred

Which driver package do you use?
RPMfusion or the binary driver directly from nvidia?

It looks as if dkms is not active, which is required to get the kernel module recompiled every time a new kernel is installed.

Could you try the following:
Boot into kernel 5.17.13 and switch to terminal console 2.
There, run:
sudo nvidia-bug-report.sh

Reboot into kernel 5.17.12 and upload the generated bug report. You should find it in your home directory.

It appears that kmod-nvidia-5.17.13-3100.fc36.x86_64-nvidia-kernel modules for 5.17.13-300.fc36.x86_64 is using the repository @system, which references http://www.nvidia.com/less

Do you have a preferred way for me to post nvidia-bug-report…log.gz, or should I email it to linux-bugs@nvidia.com?
Thanks for pursuing this,
Fred

What do you mean by this?
How have you installed the driver? Could you please explain in detail, thanks.

You can upload it directly here.
In the editor window, when writing your post, you have several symbols.

To upload, click on the upload symbol, the 7th symbol from the left:
“Quote”, “Bold”, “Italic”, “hyperlink”, “Blockquote”, “Preformatted Text”, “UPLOAD”,…

nvidia-bug-report.log.gz (89.7 KB)

I use the fedora package manager. I count on them for the proper driver. The driver I mentioned is listed as installed.

Fedora itself does not provide the proprietary Nvidia driver. So when you say you used the fedora package manager, this is only half of the information.
You have added a repository. The question is: which one? RPMfusion or negativo17?
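
A rough way to answer this from a terminal is to look at the enabled repositories. This is only a sketch: `nvidia_repo_source` is a hypothetical helper name, and the patterns it searches for (`rpmfusion-nonfree`, `negativo17`) are assumptions about how those repositories appear in the output of `dnf repolist`.

```shell
# Hypothetical helper: guess which third-party repository is enabled
# by scanning the output of `dnf repolist`.
# Assumption: RPM Fusion appears as "rpmfusion-nonfree..." and the
# negativo17 repository mentions "negativo17" in its id or name.
nvidia_repo_source() {
    repos=$(dnf repolist 2>/dev/null)
    if printf '%s\n' "$repos" | grep -qi 'rpmfusion-nonfree'; then
        echo "rpmfusion"
    elif printf '%s\n' "$repos" | grep -qi 'negativo17'; then
        echo "negativo17"
    else
        echo "unknown"
    fi
}
```

After defining the function, running `nvidia_repo_source` prints one of the three words. Looking at the raw `dnf repolist` output by eye works just as well.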

Sorry, I don’t understand how all this works. In the list of repositories, I see rpmfusion-free, rpmfusion-free-updates, rpmfusion-nonfree, rpmfusion-nonfree-updates. I see no mention of negativo17. I think this should answer your question – I hope!
Thanks.

Thanks, yes, this answers it: you are using rpmfusion.

I suggest the following:
boot into kernel 5.17.13 and run

sudo akmods --force
sudo dracut /boot/initramfs-$(uname -r).img $(uname -r) --force

reboot again into kernel 5.17.13 and hope that it works again.
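
Before rebooting, it may help to confirm that a module was actually built for the kernel in question. The following is a minimal sketch, assuming `modinfo` is available; `nvidia_module_version` is a hypothetical helper name, not part of the driver or of akmods.

```shell
# Hypothetical helper: print the version of the nvidia kernel module built
# for a given kernel release (default: the running kernel).
# Prints nothing and returns non-zero if no module exists for that kernel.
nvidia_module_version() {
    kernel="${1:-$(uname -r)}"
    # -k selects the kernel release to inspect, -F prints a single field.
    modinfo -k "$kernel" -F version nvidia 2>/dev/null
}
```

For example, `nvidia_module_version 5.17.13-300.fc36.x86_64` should print the driver version if the rebuild succeeded, and nothing otherwise.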

If it does not, I suggest uninstalling the nvidia driver by running:
sudo dnf remove xorg-x11-drv-nvidia\*

reboot again and install the driver by running

sudo dnf update
sudo dnf install akmod-nvidia

Do NOT reboot instantly. It takes some time until the kernel modules are built.
Wait until…

modinfo -F version nvidia

…returns 510.68.02. If so, you can reboot.
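
Instead of re-running the command by hand, the wait can be scripted. This is a sketch under assumptions: `wait_for_nvidia_module` is a hypothetical helper name, and the default of 60 attempts with 10 seconds between them is an arbitrary choice.

```shell
# Hypothetical helper: poll until `modinfo` reports the expected nvidia
# module version, or give up after a number of attempts.
# Arguments: expected version, max attempts (default 60, arbitrary),
# seconds between attempts (default 10, arbitrary).
wait_for_nvidia_module() {
    expected="$1"
    attempts="${2:-60}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(modinfo -F version nvidia 2>/dev/null)" = "$expected" ]; then
            echo "nvidia module $expected is ready"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "nvidia module $expected not found after $attempts attempts" >&2
    return 1
}
```

Usage: `wait_for_nvidia_module 510.68.02 && echo "safe to reboot"`.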

O.k. the akmods --force and the dracut … failed to work. Then

sudo dnf remove xorg-x11-drv-nvidia*

Removed 45 packages and I was left wondering how big a mess this was going to leave me with.

I then rebooted and it just worked!

Then I tried to run nvidia-settings and had to install it, which included 38 packages. Rebooted and nvidia-settings works but it will not let me change the frequency from auto to 60 Hz. Something is used to allow.
(No big deal.)
Note that I did not need to install akmod-nvidia again. I find this a bit confusing, but am happy to have it working.

If you have some explanation for what actually was causing the problem and why it got fixed the way it was, I’d love to know.
Thanks.

You should not be surprised that certain things do not work as expected when you uninstall the nvidia driver.

It looks like you are not quite familiar with the linux console. Thus, I suggest following the instructions strictly.
First, read until the end before running commands you may not completely understand.

Explanation:
When you ran
sudo dnf remove xorg-x11-drv-nvidia\*
you uninstalled the proprietary nvidia driver. This also removes the blacklisting of the nouveau open source driver, which is included in the kernel. So when you rebooted, the nouveau open source driver was loaded, since it was the only driver available for the graphics chip.
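
To see which of the two drivers the kernel actually loaded, the module list can be inspected. A minimal sketch, assuming `lsmod` and `awk` are available; `loaded_gpu_driver` is a hypothetical helper name.

```shell
# Hypothetical helper: report which GPU kernel driver is currently loaded
# by looking at the first column of `lsmod` (the module name).
loaded_gpu_driver() {
    modules=$(lsmod 2>/dev/null | awk '{print $1}')
    # grep -x matches whole lines only, so "nvidia_drm" alone does not count.
    if printf '%s\n' "$modules" | grep -qx 'nvidia'; then
        echo "nvidia"
    elif printf '%s\n' "$modules" | grep -qx 'nouveau'; then
        echo "nouveau"
    else
        echo "none"
    fi
}
```

If `loaded_gpu_driver` prints `nouveau` after a reboot, the desktop is running on the open source driver, not the proprietary one.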

You said you wanted to start nvidia-settings. By removing the proprietary nvidia driver, the nvidia-settings tool was also removed, since it is part of the proprietary nvidia driver.
When you tried to install just nvidia-settings, some dependent packages were also installed. But since you did not install the driver as suggested by running sudo dnf install akmod-nvidia, you are missing certain parts of the driver. The next time a new kernel version or sub-version is released for Fedora 36, you will run into the exact same problem. DKMS is required to automatically generate new kernel modules for the nvidia driver as soon as a new kernel version gets installed, for example through updates (which happened quite recently).
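
Whether the package that triggers those rebuilds is present can be checked directly. This is a sketch only: the two function names are hypothetical, and it relies on `rpm -q` exiting non-zero when a package is not installed.

```shell
# Hypothetical helper: succeed only if the akmod-nvidia package is installed.
# `rpm -q` exits non-zero when the package is not installed.
akmod_nvidia_installed() {
    rpm -q akmod-nvidia >/dev/null 2>&1
}

# Hypothetical helper: print a short human-readable status line.
report_akmod_nvidia() {
    if akmod_nvidia_installed; then
        echo "akmod-nvidia is installed"
    else
        echo "akmod-nvidia is missing; run: sudo dnf install akmod-nvidia"
    fi
}
```

Running `report_akmod_nvidia` before the next kernel update shows whether the reinstall step is still needed.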

The only correct way to install the proprietary nvidia driver via rpmfusion is by running

sudo dnf install akmod-nvidia

So please, once again (this time you do not need to boot into the virtual terminal, you can do it from within XFCE):

sudo dnf remove xorg-x11-drv-nvidia\*
reboot
sudo dnf install akmod-nvidia

Do NOT reboot instantly. It takes some time until the kernel modules and dkms related stuff are built.
Wait until…

modinfo -F version nvidia

…returns 510.68.02. If so, you can reboot.

EDIT:
Maybe you are lucky, because installing nvidia-settings may also have pulled in akmod-nvidia and all its dependencies.
But I do not guarantee that it works as expected.

I went through this process again, and frankly I think this gave no change. I.e. I did it just fine the last time. nvidia-settings still will not save the configuration change from auto to 60. But xorg.conf has “metamodes” “4096x2160_60 +0+0” which at least suggests to me that it is set at 60 Hz. All is fine with me, until perhaps the next big kernel upgrade. Many thanks for the help.
Fred
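
For anyone who wants to read the refresh rate out of such a metamodes value without opening the file, here is a small sketch. It assumes the value has the form WIDTHxHEIGHT_RATE as in the post above; `metamode_refresh_rate` is a hypothetical helper name.

```shell
# Hypothetical helper: extract the refresh rate from a metamodes value of
# the form WIDTHxHEIGHT_RATE, e.g. "4096x2160_60 +0+0".
# Prints nothing if the value carries no "_RATE" suffix.
metamode_refresh_rate() {
    printf '%s\n' "$1" |
        sed -n 's/.*[0-9]x[0-9][0-9]*_\([0-9][0-9.]*\).*/\1/p'
}
```

Called as `metamode_refresh_rate "4096x2160_60 +0+0"`, it prints the part after the underscore. A value without that suffix produces no output.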