Ubuntu 21.10 with GeForce 1650: nvidia-drm fails to allocate NvKmsKapiDevice, and fails to register device

user10368 · October 26, 2021, 6:58pm

This computer is a Dell XPS 15 7590 with an OLED screen running Ubuntu-Mate 21.10; it has an Intel UHD 630 as its on-chip graphics card, with a GeForce GTX 1650 Mobile / Max-Q in addition.
At present, all drivers are the auto-installed ones, and the active NVIDIA driver is 470.

uname -a

Linux psyche 5.13.0-20-generic #20-Ubuntu SMP Fri Oct 15 14:21:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Between the GRUB menu and my login screen, the following errors occur:

[drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice

[drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device

Oddly, booting proceeds normally from there, though a little reading through the bug report indicates a repeated

NVRM: GPU 0000:01:00.0: RmInitAdapter failed!

All functionality not dependent on the NVIDIA GPU is still present, and I have decent functionality for everything but gaming and, of course, scientific applications of the GPU.
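
To pull just these messages out of the kernel log, a small filter along the following lines can help. It is only a sketch: `nvidia_errors` is a made-up helper name, and the patterns are taken from the messages quoted above rather than from any NVIDIA documentation.

```shell
#!/bin/sh
# nvidia_errors is a hypothetical helper, not part of any NVIDIA tool.
# It reads a kernel log on standard input and keeps only the lines that carry
# the NVRM or [nvidia-drm] tags seen in the errors quoted above.
nvidia_errors() {
    grep -E 'NVRM:|\[nvidia-drm\]'
}

# Possible use on the affected machine (reading the kernel log may need root):
#   sudo dmesg | nvidia_errors
```

Filtering on the tags rather than on the full message text keeps the helper useful if the exact wording differs between driver versions.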

Output of lshw -c video:

  *-display                 
       description: 3D controller
       product: TU117M [GeForce GTX 1650 Mobile / Max-Q]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:16 memory:ec000000-ecffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:3000(size=128) memory:ed000000-ed07ffff
  *-display
       description: VGA compatible controller
       product: CoffeeLake-H GT2 [UHD Graphics 630]
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 02
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list
       configuration: driver=i915 latency=0

output of lspci | grep NVIDIA

01:00.0 3D controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1)

I note that I am able to run nvidia-settings both from the command line and through the GUI, but all I am able to do is choose the profile (Intel, NVIDIA performance mode, NVIDIA on-demand), with no other options. Notably, the profile is set to NVIDIA performance mode!

nvidia-smi returns No devices were found, unsurprisingly.

It escapes me, somewhat, as to what’s going on here; the computer is aware of the GPU on a physical level, it would seem, and is running the appropriate driver… but still can’t actually find the device.

Note: this doesn’t seem like a hardware issue, since on Windows (dual boot) everything runs without a hitch.
Simple fixes, like manually installing the latest driver or disabling nouveau, do nothing.

nvidia-bug-report.log.gz (185.6 KB)

generix · October 27, 2021, 9:40am

I don’t really know what’s going on, since Windows is not affected. Also, there is no general incompatibility known with your notebook model, so it should work. The only thing that comes to my mind is trying to reset the mainboard, which requires detaching the battery (and pressing and holding the power button for 20 sec. to discharge).
Sidenote: please delete xorg.conf because once the driver is working, you’ll run into a black screen due to it.

user10368 · October 27, 2021, 2:46pm

The only thing about the model I know that could be causing problems is that the screen is an OLED; even in 20.04 LTS, the screen had trouble, esp. with changing the brightness.
Two questions, so that I understand your propositions:
1: What about the error suggests a problem in the mainboard?
2: What is problematic in xorg.conf? The entries that correspond to the NVIDIA GPU seem fairly non-troublesome:

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection

From answers to some similar problems, I note that my config doesn’t seem to specify the PCI bus ID of the device, but this is the autogenerated config from running nvidia-xconfig.

Your help is very appreciated, Generix. Thanks!

generix · October 27, 2021, 2:52pm

The RmInitAdapter failed message points to a low-level bus problem; from experience I know that resetting the mainboard sometimes helps with inexplicable errors.
Your display is connected to the Intel iGPU, but the xorg.conf sets up an NVIDIA-only config. So if the driver worked, you would get no output on the internal screen.

Did the nvidia gpu work with 20.04? Then this might also be a kernel issue.
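
For the xorg.conf point, moving the file aside is safer than deleting it outright. A minimal sketch, assuming the file sits at /etc/X11/xorg.conf (the usual nvidia-xconfig output location, but check your system); `backup_xorg_conf` is a hypothetical helper name:

```shell
#!/bin/sh
# backup_xorg_conf is a hypothetical helper, not an existing tool.
# It renames an X configuration file to <name>.bak so that the X server
# falls back to autodetection while the old file stays recoverable.
# The default path is an assumption; pass another path as the first argument.
backup_xorg_conf() {
    conf="${1:-/etc/X11/xorg.conf}"
    if [ -f "$conf" ]; then
        mv -- "$conf" "$conf.bak"
        echo "moved $conf to $conf.bak"
    else
        echo "no file at $conf, nothing to do"
    fi
}

# On a real system the file is owned by root, so the equivalent one-liner is:
#   sudo mv /etc/X11/xorg.conf /etc/X11/xorg.conf.bak
```

Keeping a .bak copy means the NVIDIA-only setup can be restored later if it turns out to be wanted, for example with an external monitor wired to the NVIDIA GPU.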

user10368 · October 27, 2021, 3:03pm

Thanks for the explanation!
The screen issues – brightness and tearing – kept me from ever even bothering to see if it was working in 20.04. I wouldn’t say it’s ever fully worked on any Linux distro I’ve tried; there was a brief moment when I could use it for scientific computing on elementary OS 5, which is built on Ubuntu 18.04 LTS, but I’ve never had both CUDA support and reasonable graphics, no matter the kernel.

generix · October 27, 2021, 3:44pm

Digging a bit into the matter taught me that brightness control for OLED displays only works in kernels 5.12 and up.
You could try to upgrade to a 5.14 kernel; you will need four packages:
linux-headers-XX
linux-headers-XX-generic
linux-image-unsigned-XX-generic
linux-modules-XX-generic
from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.15/
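
Once the four .deb files are downloaded into one directory, a quick check that none is missing can save a half-installed kernel. This is only a sketch: `check_kernel_debs` is a hypothetical helper, and the file-name patterns (`_all.deb` for the common headers, `-generic_` for the rest) are an assumption about how the mainline builds are named, so adjust them to what you actually downloaded.

```shell
#!/bin/sh
# check_kernel_debs is a hypothetical helper, not an Ubuntu tool.
# It verifies that a directory holds at least one .deb matching each of the
# four package families listed above. The glob patterns are assumptions
# about the mainline build file names.
check_kernel_debs() {
    dir="$1"
    missing=0
    for pattern in \
        'linux-headers-*_all.deb' \
        'linux-headers-*-generic_*.deb' \
        'linux-image-unsigned-*-generic_*.deb' \
        'linux-modules-*-generic_*.deb'
    do
        # Expand the pattern inside the directory; if nothing matches, the
        # pattern stays literal and the existence test below fails.
        set -- "$dir"/$pattern
        if [ ! -e "$1" ]; then
            echo "missing: $pattern" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use, with ~/kernel-5.14 as an example download directory:
#   check_kernel_debs ~/kernel-5.14 && sudo dpkg -i ~/kernel-5.14/*.deb
```

Installing all four in a single dpkg call lets dpkg resolve the dependencies between them in one pass.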

smougel · November 1, 2021, 10:24pm

Hello!
Same problem for me: Dell XPS 9700 with RTX 2060 Max-Q / Intel iGPU
OS: Ubuntu 21.10 with kernel 5.13.0-20-generic
NVIDIA driver:
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 495.44 Fri Oct 22 06:05:22 UTC 2021

Looking at the dmesg logs:

[    4.020528] nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[    4.020533] ucsi_ccg 0-0008: i2c_transfer failed -110
[    4.020536] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[    4.020541] ucsi_ccg: probe of 0-0008 failed with error -110
....
[    4.291675] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    4.291709] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    4.291783] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[    4.291894] [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device

I’m not a kernel expert, but early in the boot the logs show:
[ 0.632595] pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window

This is the full log for that device:

$ sudo dmesg | grep 0000:01:00.0
[    0.525579] pci 0000:01:00.0: [10de:1f12] type 00 class 0x030000
[    0.525601] pci 0000:01:00.0: reg 0x10: [mem 0x72000000-0x72ffffff]
[    0.525619] pci 0000:01:00.0: reg 0x14: [mem 0x60000000-0x6fffffff 64bit pref]
[    0.525638] pci 0000:01:00.0: reg 0x1c: [mem 0x70000000-0x71ffffff 64bit pref]
[    0.525650] pci 0000:01:00.0: reg 0x24: [io  0x3000-0x307f]
[    0.525662] pci 0000:01:00.0: reg 0x30: [mem 0xfff80000-0xffffffff pref]
[    0.525746] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    0.525803] pci 0000:01:00.0: 63.008 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x8 link at 0000:00:01.0 (capable of 126.016 Gb/s with 8.0 GT/s PCIe x16 link)
[    0.576515] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.576515] pci 0000:01:00.0: vgaarb: bridge control possible
[    0.632595] pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff80000-0xffffffff pref]: no compatible bridge window
[    0.633184] pci 0000:01:00.0: BAR 6: assigned [mem 0x73080000-0x730fffff pref]
[    0.634251] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
[    0.634331] pci 0000:01:00.2: D0 power state depends on 0000:01:00.0
[    0.634646] pci 0000:01:00.3: D0 power state depends on 0000:01:00.0
[    0.636372] pci 0000:01:00.0: Adding to iommu group 1
[    3.608761] nvidia 0000:01:00.0: enabling device (0002 -> 0003)
[    3.608958] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    4.291675] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    4.291709] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    7.606617] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    7.606683] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    7.717250] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    7.717300] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    7.824112] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    7.824149] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[    7.955013] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0xffff:1433)
[    7.959563] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

$ nvidia-smi 
No devices were found
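
The repeated failures in a log like the one above can be condensed into a count per error code. A rough sketch, where `rminit_summary` is a hypothetical helper and the message format is assumed to match the lines quoted here:

```shell
#!/bin/sh
# rminit_summary is a hypothetical helper, not an NVIDIA tool.
# It reads a kernel log on standard input, extracts every
# "RmInitAdapter failed! (...)" message, and prints how often each
# distinct error code occurs.
rminit_summary() {
    grep -o 'RmInitAdapter failed! ([^)]*)' | sort | uniq -c
}

# Possible use:
#   sudo dmesg | rminit_summary
```

A single code repeated on every attempt, as in the log above, suggests the same failure on each retry rather than several unrelated ones.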

Could someone help us?

smougel · November 1, 2021, 10:30pm

Dell told me to apply this firmware:
https://www.dell.com/support/home/en-eu/drivers/driversdetails?driverid=3v8m8&driverid=3v8m8&lwp=rt

Unfortunately, I can’t apply this firmware from Linux/Ubuntu. And I don’t know if it could solve the issue…

generix · November 2, 2021, 11:49am

The “BAR 6” issue is a red herring; it’s a very common BIOS bug with no bad effects.
Looks like the VBIOS update is fixing some critical bug:
https://www.dell.com/community/XPS/RTX-2060-keeps-disappearing-from-device-manager/td-p/8034428
So it’s worth a shot.

user10368 · November 2, 2021, 6:35pm

Two updates:
I was wrong about it working on Windows. I can’t get any valuable info out of Windows, unsurprisingly, but there’s a failed detection going on there too that is just being obscured from me.
I tried the method of taking out the battery for a mainboard reset. It seemed to do… something, since on the first startup, I saw a flurry of information. The device turned off due to low battery before reaching login (guess I hadn’t checked that). On plugging in and restarting…
Same error. No changes.
Happy to provide any new information or try to dig through Windows for it, now that I know the premise of my post was flawed.
As always, Generix, much is owed you.

generix · November 2, 2021, 7:54pm

Doesn’t sound good. You should check the BIOS settings for the graphics used; maybe the NVIDIA GPU got disabled during battery removal.
If not, please check Windows’ Device Manager. If the NVIDIA GPU reports Code 43 there, it’s simply broken.

user10368 · November 2, 2021, 8:58pm

It does indeed report Code 43 :(
Lucky for me the NVIDIA warranty outlasts the Dell warranty.

smougel · November 3, 2021, 10:20pm

After many calls to the Dell support team, re-installing Windows, latest drivers, firmware etc… Error 43 in the Device Manager.
My laptop is under warranty and a tech will change the motherboard + GPU.
Dell is very silent about this issue… but I think a lot of people will be affected.

About the firmware update:
Maybe it was too late for the firmware patch to have an effect. I don’t understand what exactly this hardware issue is, but there are a lot of complaints about this error 43. NVIDIA’s fault? The motherboard assembler’s?

Hoping this is the end of the nightmare for me and maybe this post will help other people.

marietto2008 · February 3, 2023, 6:54pm

Same error for me, too.
