RTX 5060 Ti (OCuLink) GPU Crashes After GSP Init on 575.51.02 – Linux Kernel 6.8 / Open Kernel Module
Hardware Setup:
GPU: NVIDIA GeForce RTX 5060 Ti (Blackwell, via OCuLink dock)
Host: Minisforum MS-01, Intel Raptor Lake CPU, 96GB DDR5
Bus: PCIe x4 via OCuLink
Power Supply: Confirmed stable 300W+ (dedicated dock PSU)
Kernel: Proxmox VE 8.4.1 with Linux 6.8.12-10-pve
Driver Versions Tried:
575.51.03 BETA (open kernel module)
570.153.02 (open kernel module)
575.51.02 BETA (open kernel module)
Symptoms:
After a clean install of any of these drivers using the --kernel-module-type=open flag:
nvidia-smi shows GPU for 10–30 seconds
GPU draws power (~25W), appears in PCIe bus and kernel modules
Then driver crashes: nvidia-smi reports “No devices were found”
dmesg shows repeated kgspRpcRecvPoll and gpuStatePreInit_IMPL stack traces
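To pin down how long the GPU stays visible before it drops out, a small polling loop can help. This is only a sketch: `gpu_visible` and `watch_gpu` are hypothetical helper names, and the probe assumes `nvidia-smi -L` prints one `GPU <n>: ...` line per device while the driver is healthy.

```shell
#!/bin/sh
# Default probe: succeeds while nvidia-smi still lists at least one GPU.
# (Assumes "nvidia-smi -L" prints lines of the form "GPU 0: ...".)
gpu_visible() {
    nvidia-smi -L 2>/dev/null | grep -q '^GPU [0-9]'
}

# Poll a probe until it fails, then report how long it stayed healthy.
# Usage: watch_gpu [INTERVAL_SECONDS] [PROBE_COMMAND...]
watch_gpu() {
    interval=${1:-1}
    if [ "$#" -gt 0 ]; then shift; fi
    # With no probe given, fall back to the nvidia-smi check above.
    if [ "$#" -eq 0 ]; then set -- gpu_visible; fi
    start=$(date +%s)
    while "$@"; do
        sleep "$interval"
    done
    echo "probe failed after $(( $(date +%s) - start ))s"
}
```

Running `watch_gpu 1` right after boot, with `dmesg -w` in a second terminal, should show whether the dropout lines up with the first stack trace.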
Logs (excerpt):
[ 49.922114] _issueRpcAndWait+0xd2/0x900 [nvidia]
[ 49.922351] rpcRmApiControl_GSP+0x76f/0x940 [nvidia]
[ 49.922738] gpumgrStatePreInitGpu+0x6b/0xa0 [nvidia]
[ 49.922845] RmInitAdapter+0x12b9/0x1da0 [nvidia]
What I’ve tried:
Full power downs (not just reboot)
Both stable and beta drivers
Kernel boot params:
nvidia.NVreg_EnableGpuFirmware=0
NVreg_Modeset=0
Verified that device remains on PCIe bus (lspci still shows GPU)
No inference or CUDA load initiated before crash
No success with the proprietary module; the 5060 Ti refuses it
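Since the stack traces still go through rpcRmApiControl_GSP even with nvidia.NVreg_EnableGpuFirmware=0 on the command line, it seems worth checking which value the loaded driver actually reports. The sketch below assumes /proc/driver/nvidia/params lists parameters as `Name: value` lines; `nv_param` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of one NVIDIA module parameter as the loaded driver sees it.
# Usage: nv_param NAME [FILE]   (FILE defaults to /proc/driver/nvidia/params)
# Returns non-zero if the parameter is not listed.
nv_param() {
    awk -F': *' -v key="$1" '
        $1 == key { print $2; found = 1 }
        END { exit !found }
    ' "${2:-/proc/driver/nvidia/params}"
}

# Only query the live system when the driver procfs entry is present.
if [ -r /proc/driver/nvidia/params ]; then
    echo "EnableGpuFirmware = $(nv_param EnableGpuFirmware)"
    # Shows which kernel module build (open or proprietary) is loaded.
    cat /proc/driver/nvidia/version
fi
```

If the reported value does not match what was passed at boot, the parameter is probably not being honored by this module build, which would explain why the GSP path is still taken.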
Current Result:
RTX 5060 Ti is briefly initialized, then drops out completely
The stack traces end in a GSP RPC wait (_kgspRpcRecvPoll) during gpuStatePreInit, so the failure is likely inside GSP firmware initialization
Request:
Is this a known firmware issue with Blackwell GPUs in Linux?
Is updated GSP firmware expected for 5060 Ti?
Any known workaround to keep the module alive post-boot?
Here is the journal grep'ed for nvidia:
May 21 20:29:46 minis02 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-10-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt nvidia.NVreg_EnableGpuFirmware=0
May 21 20:29:46 minis02 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-10-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt nvidia.NVreg_EnableGpuFirmware=0
May 21 20:29:47 minis02 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
May 21 20:29:47 minis02 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 510
May 21 20:29:47 minis02 kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
May 21 20:29:47 minis02 kernel: nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
May 21 20:29:47 minis02 kernel: NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 575.51.02 Release Build (dvs-builder@U22-I3-G01-3-2) Thu Apr 10 15:55:07 UTC 2025
May 21 20:29:47 minis02 kernel: nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 575.51.02 Release Build (dvs-builder@U22-I3-G01-3-2) Thu Apr 10 15:40:53 UTC 2025
May 21 20:29:47 minis02 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
May 21 20:29:47 minis02 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
May 21 20:29:47 minis02 kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input4
May 21 20:29:47 minis02 kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input5
May 21 20:29:47 minis02 kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input6
May 21 20:29:47 minis02 kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input7
May 21 20:29:48 minis02 kernel: audit: type=1400 audit(1747884588.051:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=768 comm="apparmor_parser"
May 21 20:29:48 minis02 kernel: audit: type=1400 audit(1747884588.051:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=768 comm="apparmor_parser"
May 21 20:30:34 minis02 kernel: os_dump_stack+0xe/0x20 [nvidia]
May 21 20:30:34 minis02 kernel: _kgspRpcRecvPoll+0x593/0x760 [nvidia]
May 21 20:30:34 minis02 kernel: _issueRpcAndWait+0xd2/0x900 [nvidia]
May 21 20:30:34 minis02 kernel: ? osGetCurrentThread+0x26/0x60 [nvidia]
May 21 20:30:34 minis02 kernel: ? os_mem_set+0x14/0x20 [nvidia]
May 21 20:30:34 minis02 kernel: rpcRmApiControl_GSP+0x76f/0x940 [nvidia]
May 21 20:30:34 minis02 kernel: kmemsysStatePreInitLocked_IMPL+0x72/0x100 [nvidia]
May 21 20:30:34 minis02 kernel: gpuStatePreInit_IMPL+0x5aa/0xac0 [nvidia]
May 21 20:30:34 minis02 kernel: gpumgrStatePreInitGpu+0x6b/0xa0 [nvidia]
May 21 20:30:34 minis02 kernel: RmInitAdapter+0x12b9/0x1da0 [nvidia]
May 21 20:30:34 minis02 kernel: ? _portMemAllocatorAlloc+0x2b/0xe0 [nvidia]
May 21 20:30:34 minis02 kernel: rm_init_adapter+0xad/0xc0 [nvidia]
May 21 20:30:34 minis02 kernel: nv_open_device+0x41d/0x9c0 [nvidia]
May 21 20:30:34 minis02 kernel: nvidia_open_deferred+0x39/0xf0 [nvidia]
May 21 20:30:34 minis02 kernel: _main_loop+0x7f/0x140 [nvidia]
May 21 20:30:34 minis02 kernel: ? __pfx__main_loop+0x10/0x10 [nvidia]
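To compare runs, the delay between the driver loading and the first GSP RPC stack trace can be pulled out of the journal. This is a rough sketch: `gsp_timeout_delay` is a hypothetical helper name, it assumes the default journalctl short format (time in the third field, as in the excerpt above), and it does not handle a boot that crosses midnight.

```shell
#!/bin/sh
# Print the seconds between the "NVRM: loading NVIDIA" line and the first
# _kgspRpcRecvPoll stack frame. Returns non-zero if no such pair is found.
# Usage: journalctl -k -b | gsp_timeout_delay
gsp_timeout_delay() {
    awk '
        # Convert an HH:MM:SS timestamp to seconds since midnight.
        function secs(t,  p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        /NVRM: loading NVIDIA/ && !have_load { load = secs($3); have_load = 1 }
        /_kgspRpcRecvPoll/ && have_load && !seen { print secs($3) - load; seen = 1 }
        END { exit !seen }
    '
}
```

For the journal above (module loaded at 20:29:47, first trace at 20:30:34) this prints 47.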