RTX 5090 (GB202): Spontaneous GSP heartbeat timeout / Xid 79 "GPU has fallen off the bus" under Vulkan load and at idle

System Information:

  • GPU: NVIDIA GeForce RTX 5090 (GB202, PCI 0000:01:00.0)

  • CPU: AMD Ryzen 9 9950X3D (AMD iGPU also present: Radeon RAPHAEL_MENDOCINO)

  • OS: Kubuntu 25.10 (Questing Quokka)

  • Kernel: 6.17.0-20-generic

  • Driver: 595.58.03 (nvidia-open-dkms, installed via .run)

  • Desktop: KDE Plasma 6 / Wayland

  • Display: 1 monitor, connected to the RTX 5090 via DisplayPort

Problem Description:

The system crashes hard: the screen goes black while audio continues for a short time before it also hangs. A hard reboot is the only way to recover.
The GPU seems to spontaneously lose its connection to the PCIe bus. This occurs under Vulkan load, both with Proton games via Steam (such as Warhammer: Darktide) and with native games (such as Dota 2). It also happens at idle (no game running) with minimal GPU activity. Crashes are sometimes hours apart, and sometimes several occur within minutes. Other applications running at the time were Firefox and Discord.

The system was completely stable before March 2026, on driver 580.126.09-0ubuntu0.25.10.1. Crashes began after a kernel + driver update around that time.

Kernel log on 590.48.01:

NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
NVRM: GSP RPC timeout

Kernel log on 595.58.03:

NVRM: GPU0 _kgspRpcRecvPoll: GSP RM heartbeat timed out
NVRM: GPU0 _kgspRpcRecvPoll: LibOS heartbeat timed out
NVRM: GPU0 GSP_LOCKDOWN_NOTICE
NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
NVRM: Xid (PCI:0000:01:00): 154, GPU recovery action changed to Node Reboot Required
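
For anyone who wants to compare their own logs against the ones above, here is a minimal Python sketch that pulls the Xid events out of kernel log text. The regular expression is only based on the line format shown in this post, and the function name find_xid_events is my own, so treat it as an assumption rather than an official log format.

```python
import re

# Pattern derived from the NVRM lines quoted in this post, e.g.:
#   NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
# Assumption: other driver versions use the same layout.
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+), (.*)")


def find_xid_events(log_text):
    """Return a list of (pci_address, xid_code, message) tuples."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            pci, code, message = match.groups()
            events.append((pci, int(code), message.strip()))
    return events


# Sample input: the 595.58.03 log lines from this post.
SAMPLE_LOG = """\
NVRM: GPU0 _kgspRpcRecvPoll: GSP RM heartbeat timed out
NVRM: GPU0 _kgspRpcRecvPoll: LibOS heartbeat timed out
NVRM: GPU0 GSP_LOCKDOWN_NOTICE
NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
NVRM: Xid (PCI:0000:01:00): 154, GPU recovery action changed to Node Reboot Required
"""

for pci, code, message in find_xid_events(SAMPLE_LOG):
    print(f"{pci}  Xid {code}: {message}")
```

To run it on real data, the text can come from the previous boot's kernel log, for example the output of `journalctl -k -b -1` saved to a file and read into a string.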

What I have tried:

  • Upgraded from 590.48.01 to 595.58.03 — problem persists on both

  • PROTON_VKD3D_HEAP=1 launch option — might reduce frequency, hard to tell. It seems to greatly reduce GPU load in Darktide at least.

  • VKD3D_CONFIG=no_upload_hvv — no effect

  • Disabled Discord hardware acceleration

  • nvidia-powerd disabled (leftover service from 590)

  • Removed old 590 library packages (libnvidia-gl-590:i386, libnvidia-common-590, etc.)

  • Disabled PCIe ASPM (pcie_aspm=off) — no effect

  • GPU power limit set to 400W — no effect

  • Removed ~/.config/kwinoutputconfig.json — addressed KWin crashes but not Xid 79
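
Since a boot parameter like pcie_aspm=off only counts as tested if it actually reached the running kernel, here is a small sketch for checking that. It parses a kernel command line string as found in /proc/cmdline. The helper name kernel_param and the sample command line are hypothetical and for illustration only; the sample is not copied from my system.

```python
def kernel_param(cmdline, name):
    """Look up a kernel command line parameter.

    Returns the value for "name=value", an empty string for a bare
    flag such as "quiet", or None if the parameter is absent.
    """
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == name:
            return value
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical example command line, not taken from the affected machine.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-6.17.0-20-generic "
    "root=/dev/sda2 ro quiet splash pcie_aspm=off"
)

value = kernel_param(SAMPLE_CMDLINE, "pcie_aspm")
print("pcie_aspm =", value)
```

On the affected machine, `kernel_param(read_cmdline(), "pcie_aspm")` should return "off" if the setting was applied at boot.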

I’m at my wits’ end with this. I doubt that Ubuntu 26.04 will fix the issue. Any ideas or tips are greatly appreciated.

nvidia-bug-report.gz (169.1 KB)

Potentially related issue, where an AGESA downgrade supposedly solves it: Bug Report & Fix: RTX 5090 — Xid 79 GSP Firmware Crash Under Sustained CUDA Load - #2 by consume