NVIDIA RTX 5090 Not Detected by nvidia-smi on Ubuntu Server 24.04

I recently purchased a ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY GPU, but it is not recognized by the nvidia-smi command.

System Configuration

  • Motherboard: ASUS Pro WS WRX90E-SAGE SE
  • BIOS Version: 0803 (Released 2024/10/15)
  • CPU: Ryzen Threadripper PRO 7965WX
  • GPU: ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY
  • Operating System: Ubuntu Server 24.04.2 LTS
  • Kernel: 6.8.0-55-generic
  • Driver: Linux 64-bit 570.124.04

BIOS Settings

  • Secure Boot: Disabled
  • CSM: Enabled
  • Resize BAR Support: Disabled (Enabling requires CSM Disabled)
  • Above 4G Decoding: Option not available in BIOS settings
$ lspci -nn | grep -i nvidia
c1:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
c1:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
$ sudo nvidia-smi
No devices were found

Terminal Logs

$ lspci | grep -i nvidia
c1:00.0 VGA compatible controller: NVIDIA Corporation GB202 [GeForce RTX 5090] (rev a1)
c1:00.1 Audio device: NVIDIA Corporation Device 22e8 (rev a1)
$ lsmod | grep -E 'nouveau|nvidia'
nvidia_uvm           2162688  0
nvidia_drm            131072  0
nvidia_modeset       1724416  1 nvidia_drm
nvidia              11636736  2 nvidia_uvm,nvidia_modeset
video                  77824  2 asus_wmi,nvidia_modeset
ecc                    45056  1 nvidia
$ sudo modprobe nvidia
$ sudo nvidia-smi
No devices were found
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
$ tail -3 /etc/modprobe.d/blacklist.conf
blacklist nvidiafb
blacklist nouveau

$ sudo dmesg | grep -i nvidia
[    4.847877] nvidia: loading out-of-tree module taints kernel.
[    4.847887] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    4.904072] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[    4.905849] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[    4.905853] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[    4.905855] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[    4.905856] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[    4.905863] nvidia 0000:c1:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
[    4.920567] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64  570.124.04  Release Build  (dvs-builder@U22-I3-AF04-14-5)  Tue Feb 25 03:49:44 UTC 2025
[    4.950401] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  570.124.04  Release Build  (dvs-builder@U22-I3-AF04-14-5)  Tue Feb 25 03:39:40 UTC 2025
[    4.951730] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input6
[    4.951842] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input7
[    4.951942] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input8
[    4.952057] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input9
[    4.953871] [drm] [nvidia-drm] [GPU ID 0x0000c100] Loading driver
[    4.953874] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:c1:00.0 on minor 1

$ lspci -nn | grep -i nvidia
c1:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
c1:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)

May I ask where you got this driver and which commands you used to install it?
By the way, it’s probably not a solution, but as far as I can see the current version of the NVIDIA driver for CUDA on Ubuntu 24.04 is 570.124.06.

Have you tried enabling Resize BAR?

Thank you for providing the new driver version 570.124.06!

I installed this driver, but unfortunately, it did not resolve the issue.

Steps Performed

sudo apt remove --purge nvidia*
sudo apt autoremove
sudo dpkg -i nvidia-driver-local-repo-ubuntu2404-570.124.06_1.0-1_amd64.deb
sudo cp /var/nvidia-driver-local-repo-ubuntu2404-570.124.06/nvidia-driver-local-D67F55A1-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt install nvidia-driver-570
sudo update-initramfs -u
sudo reboot
$ sudo nvidia-smi
No devices were found
$ sudo dpkg -l | grep nvidia-driver
ii  nvidia-driver-570                              570.124.06-0ubuntu1                     amd64        NVIDIA driver metapackage
ii  nvidia-driver-local-repo-ubuntu2404-570.124.06 1.0-1                                   amd64        nvidia-driver-local repository configuration files
$ lspci -nn | grep -i nvidia
c1:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
c1:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
$ lsmod | grep -E 'nouveau|nvidia'
nvidia_uvm           2088960  0
nvidia_drm            131072  0
nvidia_modeset       1548288  1 nvidia_drm
nvidia              89858048  2 nvidia_uvm,nvidia_modeset
video                  77824  2 asus_wmi,nvidia_modeset

Thank you! But I also tried enabling Resize BAR and disabling CSM, but the issue persisted.

As someone who doesn’t use Ubuntu, I’m probably not much help, but the only other thing I have spotted is that the instructions for the local repo install (which is what you appear to have done) contain a third step:

Add pin file to prioritize CUDA repository:

wget Index of /compute/cuda/repos//cuda-.pin
sudo mv cuda-.pin /etc/apt/preferences.d/cuda-repository-pin-600

Heaven only knows what that is about, and sorry if it doesn’t help (even sorrier if it makes it worse!).

Oh, for heaven’s sake! Read this:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu

Sorry this is so strung out, but this might help too. There are pre-installation instructions further up that you might want to check.

Thank you! I’ll give it a try!

Any luck? After booting are the drivers actually loaded in the kernel? If that isn’t happening then nvidia-smi won’t show the device.

egrep nvidia /proc/modules
nvidia_uvm 2088960 0 - Live 0x0000000000000000 (POE)
nvidia 89821184 6 nvidia_uvm, Live 0x0000000000000000 (POE)

thanks!
It’s still not working correctly!

I checked the logs and found that PCI BAR assignment had failed. I resolved this error by adjusting the following BIOS settings:

  • Memory Interleaving set to disabled
  • NUMA nodes changed from Auto to NPS1
[    6.165712] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:c1:00.0)
[    6.167463] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR2 is 0M @ 0x0 (PCI:0000:c1:00.0)
[    6.167465] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR3 is 0M @ 0x0 (PCI:0000:c1:00.0)
[    6.167466] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR4 is 0M @ 0x0 (PCI:0000:c1:00.0)
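
For anyone checking their own logs for the same symptom: "0M @ 0x0" means the driver saw no address assigned to that BAR. Below is a minimal sketch that lists the affected BARs from dmesg text. `nvrm_invalid_bars` is a hypothetical helper name, and the pattern assumes the message format shown in the log above.

```shell
# Print each BAR that NVRM reported as unassigned ("0M @ 0x0"), one per line.
# Reads dmesg text on stdin.
nvrm_invalid_bars() {
    sed -n 's/.*NVRM: \(BAR[0-9][0-9]*\) is 0M @ 0x0 .*/\1/p' | sort -u
}
```

Usage would look like `sudo dmesg | nvrm_invalid_bars`; empty output means no BAR was reported in this way.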

Next, another error occurred, so I disabled IOMMU and adjusted the GRUB settings:

[    6.564824] nvidia 0000:c1:00.0: Using 47-bit DMA addresses
[    6.564988] NVRM: osIovaMap: failed to map allocation (status = 0x59)
[    6.565405] NVRM: osInitNvMapping: *** Cannot attach gpu
[    6.565407] NVRM: RmInitAdapter: osInitNvMapping failed, bailing out of RmInitAdapter
[    6.565415] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x22:0x59:742)

Updated GRUB configuration:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc pci=nocrs amd_iommu=off iommu=off"
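
For anyone repeating this step, here is a sketch of how one might confirm that the running kernel actually received these parameters. On Ubuntu the line above normally lives in /etc/default/grub and takes effect after `sudo update-grub` and a reboot. `cmdline_has` and `missing_params` are hypothetical helper names, not part of any tool.

```shell
# After editing GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:
#   sudo update-grub && sudo reboot
# Then check the command line the running kernel booted with.

# Return 0 if parameter $2 appears as a whole word in command line $1.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print every requested parameter that is missing from command line $1.
missing_params() {
    cmdline=$1; shift
    for p in "$@"; do
        cmdline_has "$cmdline" "$p" || echo "$p"
    done
}
```

For example, `missing_params "$(cat /proc/cmdline)" pci=realloc pci=nocrs amd_iommu=off iommu=off` prints nothing when all four parameters are active. Matching on whole words keeps `iommu=off` from being satisfied by `amd_iommu=off`.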

However, now I’m encountering the following error:

NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)

I ran the following command:

$ egrep nvidia /proc/modules
nvidia_drm 131072 0 - Live 0x0000000000000000 (OE)
nvidia_modeset 1724416 1 nvidia_drm, Live 0x0000000000000000 (OE)
nvidia 11636736 1 nvidia_modeset, Live 0x0000000000000000 (OE)
video 77824 2 nvidia_modeset,asus_wmi, Live 0x0000000000000000
ecc 45056 1 nvidia, Live 0x0000000000000000
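
A side note on reading this output: the flags in parentheses are kernel taint flags, and `P` marks a module with a proprietary license. The 4090 output earlier shows (POE), while the output here shows (OE), which is what the open kernel modules would produce. NVIDIA's driver documentation says Blackwell GPUs need the open kernel modules, so this is worth checking after any reinstall. A minimal sketch, with `nvidia_module_flavor` as a hypothetical helper name:

```shell
# Classify the loaded nvidia module from /proc/modules-style input on stdin.
# A "P" among the taint flags in the last field marks a proprietary module.
nvidia_module_flavor() {
    awk '$1 == "nvidia" {
        found = 1
        if ($NF ~ /^\(.*P.*\)$/) print "proprietary"; else print "open"
    }
    END { if (!found) print "not loaded" }'
}
```

It could be run as `nvidia_module_flavor < /proc/modules`.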

Does the command “which nvidia-smi” show /usr/bin/nvidia-smi?
What is the “ls -l” date on that file?
nvidia-smi --version # This will show if the smi you are running matches your driver version.
NVIDIA-SMI version : 570.86.15
NVML version : 570.86
DRIVER version : 570.86.15
CUDA Version : 12.8

Mine is Jan 23rd. NOTE: I only have a 4090 and have been unable to get a 5090 to figure this out for myself. The other thing I’d check is “sudo dmesg”, to see if there are any errors related to the device. Also look through /var/log/syslog. Lastly, run:
ls -l /dev/nvidia* # Check if the devices were created and are all read/write
crw-rw-rw- 1 root root 195, 0 Mar 18 09:55 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Mar 18 09:55 /dev/nvidiactl
crw-rw-rw- 1 root root 507, 0 Mar 18 09:56 /dev/nvidia-uvm
crw-rw-rw- 1 root root 507, 1 Mar 18 09:56 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
cr-------- 1 root root 238, 1 Mar 18 09:56 nvidia-cap1
cr--r--r-- 1 root root 238, 2 Mar 18 09:56 nvidia-cap2
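
The version check suggested above can also be scripted. This is a rough sketch that compares the two version lines in `nvidia-smi --version` output, assuming the "NVIDIA-SMI version" and "DRIVER version" labels shown in this thread. `smi_versions_match` is a hypothetical helper name.

```shell
# Read `nvidia-smi --version` output on stdin and compare the tool's own
# version with the driver version it reports.
# Exit status: 0 = match, 1 = mismatch, 2 = a version line was missing.
smi_versions_match() {
    awk -F'[ \t]*:[ \t]*' '
        $1 == "NVIDIA-SMI version" { smi = $2 }
        $1 == "DRIVER version"     { drv = $2 }
        END {
            if (smi == "" || drv == "") { print "incomplete output"; exit 2 }
            if (smi == drv) { print "match: " smi; exit 0 }
            print "mismatch: nvidia-smi=" smi " driver=" drv; exit 1
        }'
}
```

Typical use would be `nvidia-smi --version | smi_versions_match`.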

I ran the following commands:

$ which nvidia-smi
/usr/bin/nvidia-smi
$ nvidia-smi --version
NVIDIA-SMI version  : 570.124.06
NVML version        : 570.124
DRIVER version      : 570.124.06
CUDA Version        : 12.8
$ ls -l /usr/bin/nvidia-smi
-rwxr-xr-x 1 root root 1137440 Feb 26 01:42 /usr/bin/nvidia-smi
$ /dev/nvidia-caps
bash: /dev/nvidia-caps: Is a directory
$ ls -l /dev/nvidia-caps
total 0
cr-------- 1 root root 237, 1 Mar 19 03:37 nvidia-cap1
cr--r--r-- 1 root root 237, 2 Mar 19 03:37 nvidia-cap2
$ sudo ls -l /dev/nvidia
ls: cannot access '/dev/nvidia': No such file or directory
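
Note that the last command above omits the `*` from the suggested `ls -l /dev/nvidia*`, so it looks for a single file literally named /dev/nvidia. As an alternative to the glob, here is a small sketch that reports which of the device nodes from the earlier listing are absent. `missing_nvidia_nodes` is a hypothetical helper, and the node list is only the subset shown in this thread.

```shell
# Print the expected NVIDIA device nodes that do not exist under a
# directory (default: /dev), one per line.
missing_nvidia_nodes() {
    dir=${1:-/dev}
    for node in nvidia0 nvidiactl nvidia-uvm; do
        [ -e "$dir/$node" ] || echo "$node"
    done
}
```

Running `missing_nvidia_nodes` with no argument checks /dev; empty output means all three nodes are present.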

And the dmesg output:

$ sudo dmesg | grep -i -C 5 'c1:00.0'
[    1.791472] pci 0000:c0:07.0: [1022:149f] type 00 class 0x060000 conventional PCI endpoint
[    1.791525] pci 0000:c0:07.1: [1022:14a7] type 01 class 0x060400 PCIe Root Port
[    1.791537] pci 0000:c0:07.1: PCI bridge to [bus c2]
[    1.791549] pci 0000:c0:07.1: enabling Extended Tags
[    1.791587] pci 0000:c0:07.1: PME# supported from D0 D3hot D3cold
[    1.791738] pci 0000:c1:00.0: [10de:2b85] type 00 class 0x030000 PCIe Legacy Endpoint
[    1.791747] pci 0000:c1:00.0: BAR 0 [mem 0xb4000000-0xb7ffffff]
[    1.791756] pci 0000:c1:00.0: BAR 1 [mem 0x10800000000-0x10fffffffff 64bit pref]
[    1.791764] pci 0000:c1:00.0: BAR 3 [mem 0x11000000000-0x11001ffffff 64bit pref]
[    1.791770] pci 0000:c1:00.0: BAR 5 [io  0x3000-0x307f]
[    1.791776] pci 0000:c1:00.0: ROM [mem 0xb8000000-0xb807ffff pref]
[    1.791799] pci 0000:c1:00.0: Enabling HDA controller
[    1.791849] pci 0000:c1:00.0: PME# supported from D0 D3hot
[    1.791878] pci 0000:c1:00.0: VF BAR 0 [mem 0x00000000-0x0003ffff 64bit pref]
[    1.791879] pci 0000:c1:00.0: VF BAR 0 [mem 0x00000000-0x0003ffff 64bit pref]: contains BAR 0 for 1 VFs
[    1.791887] pci 0000:c1:00.0: VF BAR 2 [mem 0x00000000-0x0fffffff 64bit pref]
[    1.791888] pci 0000:c1:00.0: VF BAR 2 [mem 0x00000000-0x0fffffff 64bit pref]: contains BAR 2 for 1 VFs
[    1.791896] pci 0000:c1:00.0: VF BAR 4 [mem 0x00000000-0x01ffffff 64bit pref]
[    1.791897] pci 0000:c1:00.0: VF BAR 4 [mem 0x00000000-0x01ffffff 64bit pref]: contains BAR 4 for 1 VFs
[    1.791969] pci 0000:c1:00.0: 252.048 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x16 link at 0000:c0:01.1 (capable of 504.112 Gb/s with 32.0 GT/s PCIe x16 link)
[    1.792034] pci 0000:c1:00.1: [10de:22e8] type 00 class 0x040300 PCIe Endpoint
[    1.792042] pci 0000:c1:00.1: BAR 0 [mem 0xb8080000-0xb8083fff]
[    1.792180] pci 0000:c0:01.1: PCI bridge to [bus c1]
[    1.792257] pci 0000:c2:00.0: [1022:14ac] type 00 class 0x130000 PCIe Endpoint
[    1.792279] pci 0000:c2:00.0: enabling Extended Tags
--
[    1.941088] e820: reserve RAM buffer [mem 0x5d92e000-0x5fffffff]
[    1.941089] e820: reserve RAM buffer [mem 0x61722000-0x63ffffff]
[    1.941090] e820: reserve RAM buffer [mem 0x6fff3000-0x6fffffff]
[    1.941091] e820: reserve RAM buffer [mem 0x6fffe000-0x6fffffff]
[    1.941092] e820: reserve RAM buffer [mem 0x807dbc0000-0x807fffffff]
[    1.941145] pci 0000:c1:00.0: vgaarb: setting as boot VGA device
[    1.941145] pci 0000:c1:00.0: vgaarb: bridge control possible
[    1.941145] pci 0000:c1:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    1.941145] vgaarb: loaded
[    1.941145] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    1.941145] hpet0: 3 comparators, 32-bit 14.318180 MHz counter
[    1.943427] clocksource: Switched to clocksource tsc-early
[    1.944093] VFS: Disk quotas dquot_6.6.0
--
[    1.970970] pci_bus 0000:e0: resource 4 [io  0x0000-0xffff]
[    1.970971] pci_bus 0000:e0: resource 5 [mem 0x00000000-0xfffffffffffff]
[    1.970973] pci_bus 0000:e1: resource 1 [mem 0xf0100000-0xf01fffff]
[    1.970974] pci_bus 0000:e4: resource 1 [mem 0xf0000000-0xf00fffff]
[    1.971028] pci_bus 0000:c0: max bus depth: 1 pci_try_num: 2
[    1.971032] pci 0000:c1:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: can't assign; no space
[    1.971034] pci 0000:c1:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: failed to assign
[    1.971036] pci 0000:c1:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: can't assign; no space
[    1.971038] pci 0000:c1:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: failed to assign
[    1.971040] pci 0000:c1:00.0: VF BAR 0 [mem 0xb80c0000-0xb80fffff 64bit pref]: assigned
[    1.971043] pci 0000:c0:01.1: PCI bridge to [bus c1]
[    1.971045] pci 0000:c0:01.1:   bridge window [io  0x3000-0x3fff]
[    1.971048] pci 0000:c0:01.1:   bridge window [mem 0xb4000000-0xb80fffff]
[    1.971050] pci 0000:c0:01.1:   bridge window [mem 0x10800000000-0x11001ffffff 64bit pref]
[    1.971065] pci 0000:c0:07.1: PCI bridge to [bus c2]
--
[    1.971078] release child resource [mem 0x10800000000-0x10fffffffff 64bit pref]
[    1.971079] release child resource [mem 0x11000000000-0x11001ffffff 64bit pref]
[    1.971081] pci 0000:c0:01.1: resource 15 [mem 0x10800000000-0x11001ffffff 64bit pref] released
[    1.971082] pci 0000:c0:01.1: PCI bridge to [bus c1]
[    1.971097] pci 0000:c0:01.1: bridge window [mem 0x8400000000-0x8fffffffff 64bit pref]: assigned
[    1.971099] pci 0000:c1:00.0: BAR 1 [mem 0x8800000000-0x8fffffffff 64bit pref]: assigned
[    1.971106] pci 0000:c1:00.0: VF BAR 2 [mem 0x8400000000-0x840fffffff 64bit pref]: assigned
[    1.971109] pci 0000:c1:00.0: BAR 3 [mem 0x8410000000-0x8411ffffff 64bit pref]: assigned
[    1.971115] pci 0000:c1:00.0: VF BAR 4 [mem 0x8412000000-0x8413ffffff 64bit pref]: assigned
[    1.971118] pci 0000:c0:01.1: PCI bridge to [bus c1]
[    1.971120] pci 0000:c0:01.1:   bridge window [io  0x3000-0x3fff]
[    1.971122] pci 0000:c0:01.1:   bridge window [mem 0xb4000000-0xb80fffff]
[    1.971125] pci 0000:c0:01.1:   bridge window [mem 0x8400000000-0x8fffffffff 64bit pref]
[    1.971137] pci 0000:c0:07.1: PCI bridge to [bus c2]
--
[    1.971588] pci_bus 0000:8a: resource 1 [mem 0xb0900000-0xb0afffff]
[    1.971589] pci_bus 0000:8b: resource 1 [mem 0xb0900000-0xb0afffff]
[    1.971591] pci_bus 0000:8c: resource 1 [mem 0xb0a00000-0xb0afffff]
[    1.971592] pci_bus 0000:8d: resource 1 [mem 0xb0900000-0xb09fffff]
[    1.971825] pci 0000:c1:00.1: extending delay after power-on from D3hot to 20 msec
[    1.971848] pci 0000:c1:00.1: D0 power state depends on 0000:c1:00.0
[    1.972243] ACPI Warning: \_SB.PCI1.GPPC.UP00.DP60._PRT: Return Package has no elements (empty) (20230628/nsprepkg-94)
[    1.972290] ACPI Warning: \_SB.PCI1.GPPC.UP00.DP60._PRT: Return Package has no elements (empty) (20230628/nsprepkg-94)
[    1.972318] PCI: CLS 64 bytes, default 64
[    1.972370] Trying to unpack rootfs image as initramfs...
[    1.972398] LVT offset 0 assigned for vector 0x400
--
[    6.306634] snd_hda_intel 0000:02:00.7: enabling device (0000 -> 0002)
[    6.315963] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer
[    6.315967] RAPL PMU: hw unit of domain package 2^-16 Joules
[    6.348518] nvidia-nvlink: Nvlink Core is being initialized, major device number 236

[    6.350832] nvidia 0000:c1:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[    6.350889] asus_wmi: ASUS WMI generic driver loaded
[    6.351104] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input5
[    6.351163] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input6
[    6.351210] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input7
[    6.351255] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:c0/0000:c0:01.1/0000:c1:00.1/sound/card0/input8
--
[    7.245953] NVRM: nvAssertFailedNoLog: Assertion failed: 0 @ rmapi.c:935
[    7.246736] NVRM: rmapiReportLeakedDevices: Device object leak: (0xc1e00003, 0xcaf00000). Please file a bug against RM-core.
[    7.246738] NVRM: nvAssertFailedNoLog: Assertion failed: 0 @ rmapi.c:935
[    7.249025] NVRM: nvAssertFailedNoLog: Assertion failed: listCount(&pKernelBus->virtualBar2[gfid].usedMapList) == 0 @ kern_bus_vbar2.c:346
[    7.249266] NVOC: __nvoc_objDelete: Child class KernelVideoEngine not freed from parent class OBJGPU.NVRM: iovaspaceDestruct_IMPL: 4 left-over mappings in IOVAS 0xc100
[    7.249287] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x24:0x72:1100)
[    7.250768] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
[    7.250944] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x0000c100] Failed to allocate NvKmsKapiDevice
[    7.251097] [drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x0000c100] Failed to register device
[    7.255076] fbcon: Taking over console
[    7.320072] systemd-journald[852]: Time jumped backwards, rotating.
[   21.130969] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[   21.130974] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[   21.130986] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[   21.133459] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[   21.135028] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
[   21.179855] nvidia-uvm: Loaded the UVM driver, major device number 234.
[   22.327865] kauditd_printk_skb: 126 callbacks suppressed
[   22.327872] audit: type=1400 audit(1742355453.066:138): apparmor="DENIED" operation="open" class="file" profile="snap.docker.nvidia-container-toolkit" name="/var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libc.so.6" pid=2289 comm="nvidia-ctk" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[   22.422600] audit: type=1400 audit(1742355453.160:139): apparmor="DENIED" operation="open" class="file" profile="snap.docker.nvidia-container-toolkit" name="/var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libpthread.so.0" pid=2289 comm="nvidia-ctk" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[   22.423066] audit: type=1400 audit(1742355453.161:140): apparmor="DENIED" operation="open" class="file" profile="snap.docker.nvidia-container-toolkit" name="/var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libm.so.6" pid=2289 comm="nvidia-ctk" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
--
[   22.446131] audit: type=1400 audit(1742355453.184:146): apparmor="DENIED" operation="unlink" class="file" profile="snap.docker.nvidia-container-toolkit" name="/dev/char/195:255" pid=2289 comm="nvidia-ctk" requested_mask="d" denied_mask="d" fsuid=0 ouid=0
[   22.446158] audit: type=1400 audit(1742355453.184:147): apparmor="DENIED" operation="capable" class="cap" profile="snap.docker.nvidia-container-toolkit" pid=2289 comm="nvidia-ctk" capability=21  capname="sys_admin"
[   22.495297] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[   22.495301] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[   22.495321] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[   22.497422] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[   22.498758] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
[   22.550919] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[   22.550925] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[   22.550951] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[   22.553640] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[   22.555124] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
[   22.647494] overlayfs: missing 'lowerdir'
[   25.061217] evm: overlay not supported
[   25.182269] Initializing XFRM netlink socket
[  354.091162] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[  354.091167] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[  354.091184] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[  354.094208] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[  354.095809] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
[  534.431873] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[  534.431878] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[  534.431896] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[  534.434467] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[  534.436051] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
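
The log above contains two different RmInitAdapter failure codes: one at about 7.2 s and a different one that repeats from 21 s onward. That may mean the later "WPR2 already up" errors are follow-on effects of the first failed attempt. A small sketch for tallying the codes, with `rm_init_failures` as a hypothetical helper name:

```shell
# Count each distinct RmInitAdapter failure code in dmesg text on stdin.
# Output: "<count> <code>" per line, most frequent first.
rm_init_failures() {
    sed -n 's/.*RmInitAdapter failed! (\([^)]*\)).*/\1/p' |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

It could be used as `sudo dmesg | rm_init_failures`.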

I have an almost identical system, if you’d like me to check anything specific let me know:

System Configuration

  • Motherboard: ASUS Pro WS WRX90E-SAGE SE
  • BIOS Version: 0803 (Released 2024/10/15)
  • CPU: Ryzen Threadripper PRO 7965WX
  • GPU: 2x Nvidia RTX 5090 Founder’s Edition
  • Operating System: Ubuntu 24.10 (oracular)
  • Kernel: 6.13.5-061305-generic
  • Driver: Linux 64-bit 570.124.06

BIOS Settings

  • Secure Boot: Disabled
  • CSM: Disabled
  • Resize BAR Support: Disabled
  • IOMMU: Enabled

So far I’ve found that my second GPU does not work properly with Resizable BAR enabled: only one GPU is detected and works fine. With it disabled, both are detected and work.

I was seeing errors similar to these when ReBAR was enabled; they went away when I disabled it:

[ 354.091162] NVRM: _kgspBootGspRm: unexpected WPR2 already up, cannot proceed with booting GSP
[ 354.091167] NVRM: _kgspBootGspRm: (the GPU is likely in a bad state and may need to be reset)
[ 354.091184] NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 354.094208] NVRM: GPU 0000:c1:00.0: RmInitAdapter failed! (0x62:0x40:1860)
[ 354.095809] NVRM: GPU 0000:c1:00.0: rm_init_adapter failed, device minor number 0
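
One way to see whether Resizable BAR is in effect on a given boot is to look at the size of BAR 1 in dmesg. In the original poster's log, BAR 1 spans 0x10800000000-0x10fffffffff, which is 32 GiB. Without Resizable BAR, BAR 1 is commonly 256 MiB, although that figure is an assumption rather than something shown in this thread. A minimal sketch, with `bar1_range` and `bar_size_mib` as hypothetical helper names:

```shell
# Extract the first "BAR 1" memory range logged for PCI address $1
# (for example c1:00.0) from dmesg text on stdin.
bar1_range() {
    sed -n "s/.*pci 0000:$1: BAR 1 \[mem \(0x[0-9a-f]*-0x[0-9a-f]*\) .*/\1/p" |
        head -n 1
}

# Print the size in MiB of a range written as "0xSTART-0xEND".
bar_size_mib() {
    start=${1%-*}
    end=${1#*-}
    echo $(( ($end - $start + 1) / 1048576 ))
}
```

For example, `bar_size_mib "$(sudo dmesg | bar1_range c1:00.0)"` prints the BAR 1 size for the device at c1:00.0.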

austin@ripper:~$ nvidia-smi --version
NVIDIA-SMI version : 570.124.06
NVML version : 570.124
DRIVER version : 570.124.06
CUDA Version : 12.8

austin@ripper:~$ egrep nvidia /proc/modules
nvidia_uvm 2158592 0 - Live 0x0000000000000000 (OE)
nvidia_drm 131072 48 - Live 0x0000000000000000 (OE)
nvidia_modeset 1724416 12 nvidia_drm, Live 0x0000000000000000 (OE)
nvidia 11620352 286 nvidia_uvm,nvidia_modeset, Live 0x0000000000000000 (OE)
drm_ttm_helper 16384 1 nvidia_drm, Live 0x0000000000000000
video 77824 2 nvidia_modeset,asus_wmi, Live 0x0000000000000000

Thank you! That’s really helpful.
I’ll update the kernel and give it a try!

  • Operating System: Ubuntu 24.10 (oracular)
  • Kernel: 6.13.5-061305-generic

After updating the OS and adjusting the BIOS settings, it was recognized!
Thank you so much! Big thanks to everyone! :)

$ lspci -nn | grep -i nvidia
e1:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)
e1:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
$ nvidia-smi --version
NVIDIA-SMI version  : 570.133.07
NVML version        : 570.133
DRIVER version      : 570.133.07
CUDA Version        : 12.8
$ uname -a
Linux marimo 6.11.0-19-generic #19-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 12 21:43:43 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.10"
NAME="Ubuntu"
VERSION_ID="24.10"
VERSION="24.10 (Oracular Oriole)"
VERSION_CODENAME=oracular
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=oracular
LOGO=ubuntu-logo
$ nvidia-smi
Thu Mar 20 10:22:38 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        Off |   00000000:E1:00.0 Off |                  N/A |
|  0%   37C    P8              9W /  600W |       2MiB /  32607MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Hi, what changes did you make to the BIOS?

It now works with these settings! :)

BIOS

BIOS Version: 0803 (Released 2024/10/15)
Configured from default settings.
Secure Boot: Disabled
CSM: Disabled
Resize BAR Support: Disabled
IOMMU: Enabled

Updated OS to 24.10

Operating System: Ubuntu 24.10 (oracular)
Kernel: 6.11.0-19-generic

NVIDIA Driver

Linux x64 (AMD64/EM64T) Display Driver 570.133.07 | Linux 64-bit

System Configuration

  • Motherboard: ASUS Pro WS WRX90E-SAGE SE
  • BIOS Version: 0803 (Released 2024/10/15)
  • CPU: Ryzen Threadripper PRO 7965WX
  • GPU: ZOTAC GAMING GeForce RTX 5090 AMP Extreme INFINITY
  • Operating System: Ubuntu 24.10 (oracular)
  • Kernel: 6.11.0-19-generic
  • Driver: Linux 64-bit 570.133.07

Hello,

I hope you are doing well. I’m reaching out about the same issue you recently had with the RTX 5090 on Ubuntu. We have exactly the same motherboard, with 2x GeForce RTX 5090 SOLID DC Edition ZOTAC Gaming 32 GB.

We tried the same setup as you: we changed the BIOS settings and installed the latest drivers. It worked with one GPU, but when we added the second GPU we got an error while booting Ubuntu, which I will attach to this message.

However, after removing one of the RTX 5090s, and even after reinstalling Ubuntu, we are unable to make it work: the driver doesn’t work. I can confirm that it’s not a hardware defect, as the card works properly in Windows 11.

We tried the latest version of GCC 14, and kernels 6.11 and 6.13.5, and that didn’t change anything. We even installed 32-bit Steam, as that seems to help, but it didn’t. We updated the BIOS, tested, and rolled back, with no success. The same goes for the RAM: we had 256 GB and tried with 128 GB as well.

Did you make any other changes in the BIOS? Let me know if you have any recommendations.

Thank you