Fedora 39: With Kernel versions after 6.6.2-201 the nvidia driver doesn't work properly. [Failed to allocate NvKmsKapiDevice]

Hello,
As the title says, with kernel versions after 6.6.2-201 the nvidia driver doesn’t work properly, and I get these errors in the logs:

[drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
fedora kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed!
❯ nvidia-smi
Mon May  6 14:05:26 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3070 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   37C    P8             16W /  115W |       1MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

❯ lspci | grep NVIDIA
0000:01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [Geforce RTX 3070 Ti Laptop GPU] (rev a1)
0000:01:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
❯ sudo dmesg | grep -i nvidia\\\|nvrm
[sudo] password for vnm_rzv: 
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.8.8-200.fc39.x86_64 root=UUID=e281fdfb-17d3-4104-904b-8d787dacd632 ro rootflags=subvol=root rd.driver.blacklist=nouveau modprobe.blacklist=nouveau initcall_blacklist=simpledrm_platform_driver_init rhgb quiet initcall_blacklist=simpledrm_platform_driver_init nvidia-drm.modeset=1 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
[    0.043324] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.8.8-200.fc39.x86_64 root=UUID=e281fdfb-17d3-4104-904b-8d787dacd632 ro rootflags=subvol=root rd.driver.blacklist=nouveau modprobe.blacklist=nouveau initcall_blacklist=simpledrm_platform_driver_init rhgb quiet initcall_blacklist=simpledrm_platform_driver_init nvidia-drm.modeset=1 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
[    7.042330] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input23
[    7.042404] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input24
[    7.042452] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input25
[    7.042504] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input26
[    7.652688] nvidia: loading out-of-tree module taints kernel.
[    7.652692] nvidia: module license 'NVIDIA' taints kernel.
[    7.652694] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    7.652694] nvidia: module license taints kernel.
[    7.775535] nvidia-nvlink: Nvlink Core is being initialized, major device number 510
[    7.776153] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
[    7.776258] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[    7.823534] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  550.78  Sun Apr 14 06:35:45 UTC 2024
[    7.875522] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[    7.943308] nvidia-uvm: Loaded the UVM driver, major device number 508.
[    7.978179] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  550.78  Sun Apr 14 06:23:31 UTC 2024
[    7.982712] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[    8.415869] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x72:1556)
[    8.415897] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    8.416017] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[    8.416110] [drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
❯ uname -a
Linux fedora 6.8.8-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Apr 27 17:42:13 UTC 2024 x86_64 GNU/Linux

Thank you!
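For anyone comparing kernels, here is a minimal sketch for checking whether a given boot hit the same failure. It only greps kernel log text for the error strings quoted in the post above. The function name `has_nvidia_init_failure` is made up for illustration and is not part of any NVIDIA or Fedora tooling.

```shell
#!/usr/bin/env bash
# Sketch: detect the NVIDIA init failure signatures quoted in this thread.
# has_nvidia_init_failure is a hypothetical helper name, not a real tool.

# Reads kernel log text on stdin.
# Exit status 0 if any failure signature is present, 1 otherwise.
has_nvidia_init_failure() {
    grep -Eq 'RmInitAdapter failed|rm_init_adapter failed|Failed to allocate NvKmsKapiDevice'
}

# Usage on a live system (reading the kernel log usually needs root):
#   sudo dmesg | has_nvidia_init_failure && echo "driver init failed on $(uname -r)"
```

Running this after each boot into a different kernel gives a quick yes/no answer without reading the whole dmesg output.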

Is that a Lenovo laptop? I have the same graphics card and I get the same error. Only kernels older than 6.5 show no issue. You’re not the only one, and I wonder what the culprit could be. Obviously something kernel related…

Yes, I have a Lenovo Legion 5 Pro 16IAH7H. For me, the latest kernel that works is 6.6.2-201.

Exactly the same laptop here. Now it happens to work even with the latest kernel (tested just yesterday), but I tried to benchmark Cyberpunk 2077 and the fps are 50% lower than on the 6.4.x kernel.

If you use the Nvidia card exclusively (no Optimus), benchmarks are about the same as on older kernels.

I am using Optimus, but for me the nvidia driver doesn’t work properly with the later kernels. Which kernel are you using right now that works?

In my case something strange happened when I installed the latest kernel. After the reboot, nvidia-smi gave me “No running processes found”. I then deleted the kmod for that kernel and used akmods --force to rebuild the modules, and it worked after another reboot. But after I shut down the laptop and started it again, it didn’t work anymore. That is when I posted the problem here, after testing so many kernels that don’t work. I tried deleting the kmod and running akmods --force once more, but it still doesn’t work.
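For reference, the "delete the kmod, then rebuild with akmods" steps can be sketched roughly like this. It assumes the RPM Fusion packaging, where the module built by akmods is installed as a package named `kmod-nvidia-<kernel release>`; that package name is an assumption, so check with `rpm -qa 'kmod-nvidia*'` first. `DRY_RUN`, `run` and `rebuild_nvidia_kmod` are names made up for this sketch. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the "remove the kmod, then rebuild with akmods" steps.
# Assumption: RPM Fusion packaging, where the built module is installed
# as kmod-nvidia-<kernel release>. Verify the name before removing anything.
# DRY_RUN, run and rebuild_nvidia_kmod are hypothetical names for this sketch.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the built kmod for one kernel and force akmods to rebuild it.
# Argument: kernel release (defaults to the running kernel).
rebuild_nvidia_kmod() {
    local kver="${1:-$(uname -r)}"
    run sudo dnf remove -y "kmod-nvidia-${kver}"
    run sudo akmods --force --kernels "${kver}"
}
```

Calling `rebuild_nvidia_kmod 6.8.8-200.fc39.x86_64` prints the two commands for that kernel. Setting `DRY_RUN=0` runs them for real, followed by a reboot as described above.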

The only kernel that works decently with fairly good framerates is 6.4.14 (it works fine with Optimus), and that’s the one I’m using right now.

I have now tried the latest kernel, 6.8.9-200.fc39.x86_64, and it seems to work properly, at least for Steam games. I still need to test the HDMI monitor, because that didn’t work either when the nvidia driver stopped working. I will need to keep using this kernel to be sure: at first it was OK, after a shutdown the nvidia driver stopped working, and now after another shutdown it’s back on track. So far this is the first kernel after 6.6.2-201 that has worked for me; all the others didn’t work at all. Fingers crossed. :))

Yes, it works, but the frame rate is not on par with 6.4 (at least for me). Did you run any benchmarks to see if performance is OK?

No, I didn’t benchmark it. For me, no kernel besides 6.6.2-201 worked at all; I mean the nvidia driver didn’t work.

OK, but you said you tested some games (with Steam). What are your impressions?

I don’t sense any performance difference right now. I noticed a drop in frame rate in Dota 2 a few days ago that has persisted until now (something like from ~110 fps to ~90 fps), but that was on kernel 6.6.2-201, and it’s the same on this new one. I don’t know why it happened, or whether it’s only in my mind. I’m using Driver Version: 550.78.