Driver version 555.58.02 failed to probe with kernel 6.10.3-200.fc40.x86_64

nvidia-bug-report.log.gz (965.4 KB)

Hello,
I'm running Fedora 40 with kernel 6.10.3.

I installed the driver with:

sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda

nvidia-smi output

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

dmesg output

[ 5162.299763] nvidia: loading out-of-tree module taints kernel.
[ 5162.299770] nvidia: module license 'NVIDIA' taints kernel.
[ 5162.299771] Disabling lock debugging due to kernel taint
[ 5162.299773] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 5162.299774] nvidia: module license taints kernel.
[ 5162.440647] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5162.441497] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ 5162.441516] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5162.441521] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5162.441538] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5162.441538] NVRM: None of the NVIDIA devices were initialized.
[ 5162.441687] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5163.546855] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5163.547732] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5163.547751] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5163.547756] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5163.547773] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5163.547774] NVRM: None of the NVIDIA devices were initialized.
[ 5163.547909] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5163.599602] intel_tcc_cooling: TCC Offset locked
[ 5537.001578] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5537.002352] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5537.002371] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5537.002376] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5537.002394] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5537.002395] NVRM: None of the NVIDIA devices were initialized.
[ 5537.002508] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5538.335990] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5538.336737] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5538.336755] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5538.336761] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5538.336777] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5538.336777] NVRM: None of the NVIDIA devices were initialized.
[ 5538.336896] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5539.692989] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5539.693769] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5539.693788] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5539.693793] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5539.693810] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5539.693811] NVRM: None of the NVIDIA devices were initialized.
[ 5539.693955] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5764.850422] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5764.851204] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5764.851223] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5764.851228] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5764.851248] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5764.851248] NVRM: None of the NVIDIA devices were initialized.
[ 5764.851375] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5766.521425] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5766.522421] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5766.522440] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5766.522446] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5766.522473] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5766.522474] NVRM: None of the NVIDIA devices were initialized.
[ 5766.522689] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5768.147306] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5768.148123] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5768.148149] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5768.148154] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5768.148172] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5768.148173] NVRM: None of the NVIDIA devices were initialized.
[ 5768.148299] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5769.785464] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5769.786304] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5769.786323] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5769.786329] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5769.786350] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5769.786351] NVRM: None of the NVIDIA devices were initialized.
[ 5769.786547] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[ 5771.381261] nvidia-nvlink: Nvlink Core is being initialized, major device number 509

[ 5771.382085] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[ 5771.382104] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:249d) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 5771.382109] nvidia 0000:01:00.0: probe with driver nvidia failed with error -1
[ 5771.382129] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 5771.382130] NVRM: None of the NVIDIA devices were initialized.
[ 5771.382273] nvidia-nvlink: Unregistered Nvlink Core, major device number 509

Secure Boot is disabled.

As far as I remember, it worked fine with kernel versions 6.8.x.
Can you help me debug this and figure out what is going wrong with kernel 6.10.3?
Thanks!

Same issue here.
I also reported it in another thread: 555.58.02 - Not working after kernel version 6.9.7 (Fedora 40)

Did you happen to find a solution?

I haven’t seen anything come back from support or anyone on the forum.

Please upgrade the driver; 555.58 has a memory leak problem. The fixed version should be 555.107 (but I'm not sure, as I saw that in another post). Also, if your GPU is supported by nvidia-open, move to it.
If you don't want to test multiple versions, you can try the nvidia-open 560 beta. It works well with my RTX 4060 laptop GPU on Arch Linux.

I haven’t found a solution yet. My laptop is also a Razer Blade 15.

Try booting with pcie_port_pm=off
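
In case it helps, here is a rough sketch of how the parameter could be made persistent and then verified after a reboot. It assumes Fedora, where `grubby` manages the boot entries; on Mint or Ubuntu you would instead add the parameter to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` and run `sudo update-grub`. The helper names `has_kernel_param` and `add_kernel_param` are made up for this example.

```shell
#!/bin/sh
# Return 0 if the exact parameter appears in a kernel command-line file.
# Defaults to /proc/cmdline, the command line of the running kernel.
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # One parameter per line, then match the whole line literally.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Add a parameter to every installed kernel's boot entry.
# Assumption: Fedora with grubby available; needs root.
add_kernel_param() {
    sudo grubby --update-kernel=ALL --args="$1"
}

# Typical flow (nothing below runs automatically):
#   add_kernel_param pcie_port_pm=off
#   sudo reboot
#   has_kernel_param pcie_port_pm=off && echo "pcie_port_pm=off is active"
```

To undo the change on Fedora, `sudo grubby --update-kernel=ALL --remove-args="pcie_port_pm=off"` should remove the parameter again.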

Thanks! It works.

Great!
It seems to be a kernel regression related to PCIe power management that only occurs on our Razer laptops.

I found the same thing happening when using the nouveau driver.
It tries to switch the GPU from D0 to D3 and then fails to read the chip identifier.
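
If anyone wants to check the power state on their own machine, something like the following might do. It is only a sketch: it reads the standard PCI sysfs attributes `power_state`, `power/runtime_status` and `power/control`, and `power_state` may be missing on older kernels. The function name `show_pci_power` is made up, and the address `0000:01:00.0` is the one from the dmesg output above, so yours may differ.

```shell
#!/bin/sh
# Print the power-related sysfs attributes of one PCI device directory.
# Attributes that are missing or unreadable are reported instead of failing.
show_pci_power() {
    dev_dir=$1
    for attr in power_state power/runtime_status power/control; do
        if [ -r "$dev_dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dev_dir/$attr")"
        else
            printf '%s: (not available)\n' "$attr"
        fi
    done
}

# Example (not run automatically):
#   show_pci_power /sys/bus/pci/devices/0000:01:00.0
```

Comparing the output with and without `pcie_port_pm=off` could show whether the GPU is being put into D3 before the driver probes it.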

For anyone who needs it: I had the same problem on Linux Mint 21.3 with kernel 6.8, and booting with pcie_port_pm=off fixed it for me.