Hi everyone,
I’m having issues with my second NVIDIA RTX PRO 6000 GPU failing to initialize after it started showing errors. Looking for advice on next steps.
System Configuration:
- Motherboard: ASUS Pro WS TRX50-SAGE WIFI
- CPU: AMD Ryzen Threadripper 7970X 32-Core
- RAM: 256GB (4x64GB) DDR5-4800 ECC
- GPUs: 2x NVIDIA RTX PRO 6000 (identical cards)
- Driver: NVIDIA 580.65.06 (Ubuntu)
- OS: Ubuntu
Timeline of the issue:
- Both GPUs were working normally
- Second GPU (PCI 83:00.0) started showing “ERR!” status in nvidia-smi across all fields (temp, power, memory usage, etc.)
- After reboot, the second GPU completely disappeared from nvidia-smi output
- First GPU (PCI 41:00.0) continues working
Current Status:
- First GPU: Working normally, shows in nvidia-smi
- Second GPU: Shows the BIOS screen during boot but goes blank once Ubuntu loads. It is detected by the system (lspci shows both cards) but fails driver initialization
Error Messages (repeated in logs):
NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
What I’ve tried:
- Complete driver reinstallation (purged all NVIDIA packages, then a clean install), plus power cycling the machine (power off, then on)
- Tested different driver versions (570-open, 580-open)
Current lspci output shows both GPUs:
41:00.0 VGA compatible controller: NVIDIA Corporation Device 2bb1 (rev a1)
83:00.0 VGA compatible controller: NVIDIA Corporation Device 2bb1 (rev a1)
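To help rule out a PCIe slot or link problem, here is a minimal Python sketch that reads the standard Linux sysfs link attributes for both cards so they can be compared side by side. The helper names (`read_link_status`, `width_reduced`) and the `sysfs_root` parameter are my own, not part of any NVIDIA tool. A lower current link speed on its own is not necessarily a fault, since an idle GPU can downshift the link to save power, so the sketch only flags a reduced lane width.

```python
from pathlib import Path

# Standard Linux sysfs attributes describing a PCIe device's link state.
LINK_ATTRS = (
    "current_link_speed",
    "max_link_speed",
    "current_link_width",
    "max_link_width",
)


def read_link_status(address, sysfs_root="/sys/bus/pci/devices"):
    """Return the link attributes for one PCI device.

    Attributes that cannot be read (missing device, no permission)
    are reported as None instead of raising.
    """
    device_dir = Path(sysfs_root) / address
    status = {}
    for attr in LINK_ATTRS:
        try:
            status[attr] = (device_dir / attr).read_text().strip()
        except OSError:
            status[attr] = None
    return status


def width_reduced(status):
    """True if fewer lanes were negotiated than the device supports.

    Returns None when either width is unknown.
    """
    current = status["current_link_width"]
    maximum = status["max_link_width"]
    if current is None or maximum is None:
        return None
    return int(current) < int(maximum)


if __name__ == "__main__":
    # The two addresses from the lspci output above, with the PCI domain added.
    for address in ("0000:41:00.0", "0000:83:00.0"):
        status = read_link_status(address)
        print(address, status, "width reduced:", width_reduced(status))
```

If the failing card negotiates fewer lanes than the working one, reseating it or swapping the two cards between slots would be a reasonable next test.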
Questions:
- Is GSP firmware error 0xb indicative of hardware failure, or could this be a driver/firmware corruption issue?
- Since the card shows the BIOS screen but goes blank when Ubuntu loads, is this a driver initialization problem rather than complete hardware failure?
- Are there any other troubleshooting steps for GSP firmware boot failures?
- Could this be related to dual-GPU configuration or PCIe slot issues?
The fact that the GPU can still display the BIOS screen gives me hope it’s not completely dead, but something is preventing the NVIDIA driver from properly initializing it once the OS loads.
Any guidance would be greatly appreciated!
Thanks in advance.
kernel: NVRM: kgspInitRm_IMPL: Initial shift, 4, is larger than max allowed [0, 3]. Modulo applied
kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
kernel: NVRM: kgspInitRm_IMPL: Max GSP-RM boot attempts exceeded: 4/4
kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1993)
kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 19.932845] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 19.932888] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 19.935717] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 19.935728] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 19.936628] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 20.227666] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 20.227703] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 20.229868] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 20.229878] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 20.230798] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 21.837236] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 21.837297] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 21.839818] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 21.839834] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 21.841088] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 21.856849] kernel: rfkill: input handler disabled
[ 22.132198] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 22.132432] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 22.134868] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 22.134882] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 22.136023] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 22.427639] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 22.429258] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 22.432223] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 22.432236] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 22.433195] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 22.727990] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 22.728754] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 22.731707] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 22.731718] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 22.732736] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 23.019646] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 23.020179] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 23.023186] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 23.023197] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 23.024123] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
[ 23.321896] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP: 0xb
[ 23.323061] kernel: NVRM: RmInitAdapter: Cannot initialize GSP firmware RM
[ 23.325682] kernel: NVRM: iovaspaceDestruct_IMPL: 1 left-over mappings in IOVAS 0x8300
[ 23.325694] kernel: NVRM: GPU 0000:83:00.0: RmInitAdapter failed! (0x62:0x55:1859)
[ 23.326634] kernel: NVRM: GPU 0000:83:00.0: rm_init_adapter failed, device minor number 0
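Since the log repeats the same few lines many times, here is a small Python sketch for condensing it: it counts the GSP boot error codes and groups the `RmInitAdapter failed!` lines by PCI address and status triple. The function name `summarize` and the regular expressions are my own, written against the line formats shown above. I do not know what the three status fields mean, so the sketch only splits them apart without interpreting them.

```python
import re
from collections import Counter

# Patterns written against the NVRM lines in the log above.
GSP_BOOT_ERROR = re.compile(
    r"NVRM: kgspBootstrap_\w+: GSP-FMC reported an error "
    r"while attempting to boot GSP: (0x[0-9a-fA-F]+)"
)
INIT_FAILED = re.compile(
    r"NVRM: GPU ([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"RmInitAdapter failed! \((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)"
)


def summarize(lines):
    """Count GSP boot error codes and RmInitAdapter failures.

    Returns two Counters:
      - GSP boot error code -> number of occurrences
      - (pci_address, field1, field2, field3) -> number of occurrences
    """
    gsp_codes = Counter()
    init_failures = Counter()
    for line in lines:
        match = GSP_BOOT_ERROR.search(line)
        if match:
            gsp_codes[match.group(1)] += 1
            continue
        match = INIT_FAILED.search(line)
        if match:
            init_failures[match.groups()] += 1
    return gsp_codes, init_failures


if __name__ == "__main__":
    # A few lines copied from the log above as sample input.
    sample = [
        "[ 19.932845] kernel: NVRM: kgspBootstrap_GH100: GSP-FMC reported "
        "an error while attempting to boot GSP: 0xb",
        "[ 19.935728] kernel: NVRM: GPU 0000:83:00.0: "
        "RmInitAdapter failed! (0x62:0x55:1859)",
        "kernel: NVRM: GPU 0000:83:00.0: "
        "RmInitAdapter failed! (0x62:0x55:1993)",
    ]
    gsp_codes, init_failures = summarize(sample)
    print("GSP boot errors:", dict(gsp_codes))
    for key, count in init_failures.items():
        print("init failure", key, "x", count)
```

To run it on a full log, pass the lines of a saved kernel log file to `summarize` instead of the sample list.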
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.65.06              Driver Version: 580.65.06      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX PRO 6000 Blac…      On  |   00000000:41:00.0 Off |                  Off |
| 30%   37C    P8             16W /  600W |    2262MiB /  97887MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2552      G   /usr/bin/gnome-shell                     16MiB |
|    0   N/A  N/A            2885      G   /usr/bin/Xwayland                        12MiB |
|    0   N/A  N/A            4007      C   python                                 2204MiB |
+-----------------------------------------------------------------------------------------+