Monitor loses signal and turns off when playing games

I have started experiencing black screens (the monitor loses signal and turns off) while playing games. While this is happening I can still SSH in and collect logs, but I can’t kill lightdm or reboot the machine remotely; I have to use the SysRq keys to reboot.

This can occur in many games, but recently it has been happening while playing Mordhau. The crashes began around the end of December, and I suspect the Manjaro update of Dec 30th may have something to do with it. I have kept an eye on temperatures, so I don’t believe it’s a thermal problem.

I am on Manjaro XFCE, and I have tried a combination of:

Kernels:

  • 5.4
  • 5.10

Graphics Driver Versions:

  • 435
  • 440
  • 450
  • 460

This issue persists with all combinations, with the same error in journalctl:

Jan 26 14:45:14 GamerPC kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=863, GPU has fallen off the bus.
Jan 26 14:45:14 GamerPC kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Jan 26 14:45:14 GamerPC kernel: NVRM: A GPU crash dump has been created. If possible, please run nvidia-bug-report.sh as root to collect this data before the NVIDIA kernel module is unloaded.
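
In case it helps with comparing runs across kernel and driver combinations, here is a minimal Python sketch that pulls the PCI address and Xid code out of lines like the ones above. The regular expression is only an assumption based on the format of this excerpt, and the function name parse_xid_lines is made up for illustration; other driver versions may word the message differently.

```python
import re

# Assumed line format, taken from the journal excerpt in this thread:
#   ... kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=863, GPU has fallen off the bus.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+),\s*(?P<detail>.*)"
)


def parse_xid_lines(lines):
    """Return a list of (pci_address, xid_code, detail) for every Xid line found."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("code")), match.group("detail").strip())
            )
    return events


# The three journal lines quoted above; only the first one is an Xid line.
SAMPLE = [
    "Jan 26 14:45:14 GamerPC kernel: NVRM: Xid (PCI:0000:01:00): 79, pid=863, GPU has fallen off the bus.",
    "Jan 26 14:45:14 GamerPC kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.",
    "Jan 26 14:45:14 GamerPC kernel: NVRM: A GPU crash dump has been created.",
]

for pci, code, detail in parse_xid_lines(SAMPLE):
    print(f"Xid {code} on PCI {pci}: {detail}")
```

The same function accepts any iterable of lines, so the text saved from journalctl -k (kernel messages only) can be passed in through an open file object.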

I have included two bug reports, one for the 435 driver and one for the 450 driver:

nvidia-bug-report-1.log.gz (561.2 KB)
nvidia-bug-report-2.log.gz (551.6 KB)

Xid 79 on desktops is most often caused by either overheating or insufficient power (e.g. a failing PSU). A failing GPU or mainboard is possible but less likely.
You can log temperatures every 2 seconds to a file with: nvidia-smi -q -d TEMPERATURE -l 2 -f temp.log
If the temperatures turn out to be fine, check your PSU.
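
Since a log written every 2 seconds gets long quickly, a small Python sketch like the following can summarize it. It assumes each sample in the log contains a line of the form "GPU Current Temp : 80 C", which is how the TEMPERATURE section of nvidia-smi -q is usually printed; the exact label may vary between driver versions. The function name summarize_temps and the sample text are made up for illustration and are not taken from any attached log.

```python
import re

# Assumed label for the current-temperature line in `nvidia-smi -q -d TEMPERATURE` output.
# Threshold lines such as "GPU Shutdown Temp" are deliberately not matched.
CURRENT_TEMP_RE = re.compile(r"GPU Current Temp\s*:\s*(\d+)\s*C")


def summarize_temps(log_text):
    """Return (sample_count, min_temp, max_temp) in degrees C, or None if nothing matched."""
    temps = [int(value) for value in CURRENT_TEMP_RE.findall(log_text)]
    if not temps:
        return None
    return len(temps), min(temps), max(temps)


# Hypothetical three-sample excerpt, for illustration only.
SAMPLE_LOG = """
    Temperature
        GPU Current Temp                  : 79 C
        GPU Shutdown Temp                 : 96 C
    Temperature
        GPU Current Temp                  : 81 C
        GPU Shutdown Temp                 : 96 C
    Temperature
        GPU Current Temp                  : 80 C
        GPU Shutdown Temp                 : 96 C
"""

print(summarize_temps(SAMPLE_LOG))
```

For a real log, read the file first, for example with open("temp.log").read(), and pass the text to summarize_temps.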

I was able to trigger another black screen. The temperature never went above 81C and mostly hovered around 80C.

temp.log (644.6 KB)

Temperatures look good. Please check your PSU. Sometimes it helps to unplug and replug the power connectors on the graphics card to improve the contact.