Single-GPU desktop machine, MSI RTX 2070 Super Ventus OC
Linux Mint 20.2 (Ubuntu 20.04 base)
Nvidia drivers 460 and 470
For the past three weeks, and only in a single game, I’ve been getting sudden screen freezes. The system appears unresponsive (though it is still accessible via SSH), and the following messages appear in my kernel log:
kernel: NVRM: GPU at PCI:0000:01:00: GPU-64d0d0f6-0601-b9dd-cdc2-042b524a850b
kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=1651, GPU has fallen off the bus.
kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
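In case it helps anyone compare logs, here is a minimal sketch of how Xid events like the one above could be pulled out of kernel log text. It only assumes the line format shown in my log; the names `parse_xid` and `xid_events` are my own and not part of any NVIDIA tool.

```python
import re
from typing import Iterable, Iterator, Optional

# Matches NVRM Xid lines in the form seen in the log above, e.g.
#   NVRM: Xid (PCI:0000:01:00): 79, pid=1651, GPU has fallen off the bus.
# Assumption: other driver versions may format the pid field differently,
# so the pid part is optional and kept as a string.
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),\s*"
    r"(?:pid=(?P<pid>[^,]+),\s*)?(?P<message>.*)"
)


def parse_xid(line: str) -> Optional[dict]:
    """Return the fields of one Xid log line, or None if it is not an Xid line."""
    match = XID_RE.search(line)
    if match is None:
        return None
    return {
        "pci": match.group("pci"),
        "xid": int(match.group("xid")),
        "pid": match.group("pid"),
        "message": match.group("message").strip(),
    }


def xid_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Xid event found in an iterable of log lines."""
    for line in lines:
        event = parse_xid(line)
        if event is not None:
            yield event
```

The lines could come from `journalctl -k` for the current boot, or `journalctl -k -b -1` for the previous boot if the journal is persistent, which is the more useful one after a hard reset.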
Things I’ve tried:
- A full re-install of the game and clearing of the cache (including shader cache)
- Change of driver to a 460 variant
- Change of kernel
- nvidia-drm.modeset=0
- pcie_aspm=off
- Changing the card’s power management via NVIDIA X Server Settings (nvidia-settings)
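To rule out the two boot parameters above silently not being applied, a rough sketch of a check against the running kernel's command line follows. It assumes both were passed on the kernel command line (not via modprobe.d), and the helper names `missing_params` and `read_cmdline` are hypothetical.

```python
from typing import Iterable, List


def _normalize(token: str) -> str:
    # The kernel treats '-' and '_' as equivalent in parameter names,
    # so normalize only the name part and leave the value untouched.
    name, sep, value = token.partition("=")
    return name.replace("-", "_") + sep + value


def missing_params(cmdline: str, wanted: Iterable[str]) -> List[str]:
    """Return the wanted parameters that do not appear in the given command line."""
    active = {_normalize(tok) for tok in cmdline.split()}
    return [param for param in wanted if _normalize(param) not in active]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="utf-8") as fh:
        return fh.read().strip()
```

Calling `missing_params(read_cmdline(), ["nvidia-drm.modeset=0", "pcie_aspm=off"])` should return an empty list if both parameters were actually picked up at boot.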
The issue does not seem to occur under a GPU stress test.
Any suggestions, possible causes, or further diagnostic steps I could take would be welcome.