Driver version 555.58.02 failed to probe with kernel 6.10.3-200.fc40.x86_64

nvidia-bug-report.log.gz (965.4 KB)

Hello,
I am running Fedora 40 with kernel 6.10.3.

I have installed the driver with

sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda

nvidia-smi output

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

dmesg output

[ 5162.299763] nvidia: loading out-of-tree module taints kernel.
[ 5162.299770] nvidia: module license 'NVIDIA' taints kernel.
[ 5162.299771] Disabling lock debugging due to kernel taint
[ 5162.299773] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 5162.299774] nvidia: module license taints kernel.
[ 5162.440647] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5162.441497] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ 5162.441516] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5162.441521] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5162.441538] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5162.441538] NVRM: None of the NVIDIA devices were initialized.
[ 5162.441687] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5163.546855] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5163.547732] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5163.547751] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5163.547756] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5163.547773] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5163.547774] NVRM: None of the NVIDIA devices were initialized.
[ 5163.547909] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5163.599602] intel_tcc_cooling: TCC Offset locked
[ 5537.001578] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5537.002352] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5537.002371] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5537.002376] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5537.002394] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5537.002395] NVRM: None of the NVIDIA devices were initialized.
[ 5537.002508] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5538.335990] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5538.336737] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5538.336755] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5538.336761] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5538.336777] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5538.336777] NVRM: None of the NVIDIA devices were initialized.
[ 5538.336896] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5539.692989] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5539.693769] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5539.693788] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5539.693793] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5539.693810] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5539.693811] NVRM: None of the NVIDIA devices were initialized.
[ 5539.693955] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5764.850422] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5764.851204] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5764.851223] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5764.851228] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5764.851248] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5764.851248] NVRM: None of the NVIDIA devices were initialized.
[ 5764.851375] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5766.521425] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5766.522421] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5766.522440] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5766.522446] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5766.522473] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5766.522474] NVRM: None of the NVIDIA devices were initialized.
[ 5766.522689] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5768.147306] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5768.148123] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5768.148149] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5768.148154] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5768.148172] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5768.148173] NVRM: None of the NVIDIA devices were initialized.
[ 5768.148299] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5769.785464] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5769.786304] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5769.786323] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5769.786329] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5769.786350] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5769.786351] NVRM: None of the NVIDIA devices were initialized.
[ 5769.786547] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5771.381261] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5771.382085] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5771.382104] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5771.382109] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5771.382129] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5771.382130] NVRM: None of the NVIDIA devices were initialized.
[ 5771.382273] nvidia-nvlink: Unregistered Nvlink Core, major device number 509

Secure Boot is disabled.

As far as I remember, the driver worked fine with 6.8.x kernels.
Can you help me debug this and figure out what goes wrong with kernel 6.10.3?
Thanks!

Same issue here.
Also reported it: 555.58.02 - Not working after kernel version 6.9.7 (Fedora 40)

Did you happen to find a solution?

I haven’t seen anything come back from support or anyone on the forum.

Please upgrade the driver; 555.58 has a memory leak problem. The fixed version is reportedly 555.107 (I’m not certain, as that comes from another post). Also, if your GPU is supported by nvidia-open, consider moving to it.
If you would rather not test several versions, you can try the nvidia-open 560 beta; it works well with my RTX 4060 laptop GPU on Arch Linux.

I haven’t found a solution yet. My laptop is also a Razer Blade 15.

Try booting with the kernel parameter pcie_port_pm=off
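
For anyone who wants to try this on Fedora, here is a rough sketch of one way to set the parameter persistently. It assumes the system manages boot entries with grubby (the Fedora default); the helper names has_kernel_param and add_kernel_param are made up for illustration and are not part of any tool.

```shell
# Sketch only: helper names are hypothetical, and grubby is assumed to be
# the boot-entry manager (the default on Fedora).

# Return success if the given parameter appears on a kernel command line.
# $1: parameter, e.g. pcie_port_pm=off
# $2: file to read (defaults to the running kernel's /proc/cmdline)
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # One parameter per line, then match the whole line literally.
    tr ' ' '\n' < "$file" | grep -Fqx -- "$param"
}

# Append a parameter to every installed kernel's boot entry.
# Takes effect on the next boot; undo with:
#   sudo grubby --update-kernel=ALL --remove-args="pcie_port_pm=off"
add_kernel_param() {
    sudo grubby --update-kernel=ALL --args="$1"
}

# Typical use:
#   add_kernel_param pcie_port_pm=off
#   (reboot)
#   has_kernel_param pcie_port_pm=off && echo "parameter is active"
```

For a one-off test without changing any configuration, the parameter can also be appended to the linux line from the GRUB menu editor (press e on the boot entry) before booting.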

Thanks! It works.

Great!
It seems to be a kernel regression related to PCIe power management that only occurs on our Razer laptops.

I found the same thing happening with the nouveau driver.
It tries to switch the device from the D0 to the D3 power state, and then it fails to read the chip identifier.
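
If someone wants to check what power state the GPU is in when the probe fails, something along these lines might help. It is only a sketch: the function name gpu_power_info is made up, the default address 0000:01:00.0 is taken from the dmesg output earlier in the thread, and the power_state attribute may not exist on older kernels.

```shell
# Sketch only: print the PCI power-management attributes of a device.
# $1: PCI address (defaults to the GPU address seen in the dmesg output)
# $2: sysfs base directory (defaults to /sys/bus/pci/devices)
gpu_power_info() {
    dev="${1:-0000:01:00.0}"
    base="${2:-/sys/bus/pci/devices}"
    # power_state:           current PCI power state (e.g. D0, D3hot, D3cold)
    # power/runtime_status:  runtime PM status (e.g. active, suspended)
    # power/control:         "auto" allows runtime suspend, "on" forbids it
    for attr in power_state power/runtime_status power/control; do
        f="$base/$dev/$attr"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$f")"
        else
            printf '%s: (not available)\n' "$attr"
        fi
    done
}

# Typical use:
#   gpu_power_info
#   gpu_power_info 0000:00:01.0   # assumed address of the upstream PCIe port; check with lspci -t
```

Comparing the output with and without pcie_port_pm=off, for both the GPU and the port above it, should show whether the device is being put into D3 before the driver probes it.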