Screens go blank during gameplay with a GTX 970 that is less than a year old

I’m running Linux Mint 18.3 (Sarah), Cinnamon
Kernel 4.4.0-59
I’ve got an EVGA GTX970 which is less than a year old.
I have one Samsung C24F390 on HDMI and two Philips PHL 223V5, one on DVI and one on a DisplayPort-to-DVI converter. All screens run at 1920x1080.

I get the same result with the 361, 370, 375 and 378 drivers.

My system will work fine for days, even weeks, if I don’t do anything heavy with it. But with graphics-intensive games such as KSP, X-Plane, Deus Ex, Rocket League, etc., all three screens suddenly go blank and turn off. The system is still running: games keep going, audio works, and if I’m using voice comms on TS2 I can still chat to friends.

The system still accepts input: I can press Ctrl+Alt+T and type sudo reboot to restart it.

I tailed the syslog from my laptop, and when the screens turned off I got this:

Feb 1 22:30:16 Morpheus kernel: [ 1406.722233] NVRM: GPU at PCI:0000:01:00: GPU-ea04eb6a-2191-7f15-0278-ea7d9eda741a
Feb 1 22:30:16 Morpheus kernel: [ 1406.722237] NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
Feb 1 22:30:16 Morpheus kernel: [ 1406.722237]
Feb 1 22:30:16 Morpheus kernel: [ 1406.722239] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Feb 1 22:30:16 Morpheus kernel: [ 1406.722516] NVRM: A GPU crash dump has been created. If possible, please run
Feb 1 22:30:16 Morpheus kernel: [ 1406.722516] NVRM: nvidia-bug-report.sh as root to collect this data before
Feb 1 22:30:16 Morpheus kernel: [ 1406.722516] NVRM: the NVIDIA kernel module is unloaded.
Feb 1 22:30:16 Morpheus kernel: [ 1406.722523] NVRM: Xid (PCI:0000:01:00): 58, EDC 0
nvidia-bug-report.log.gz (228 KB)
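
For anyone comparing logs across crashes, here is a minimal Python sketch that pulls the Xid events out of syslog lines like the ones above. The regular expression and the function name `parse_xid_events` are my own and only assume the line layout shown in this post (kernel uptime in square brackets, followed by `NVRM: Xid (PCI:...)`); other syslog formats may need a different pattern.

```python
import re

# Assumed layout, taken from the lines quoted in this post:
#   "... kernel: [ 1406.722237] NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus."
XID_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+NVRM: Xid "
    r"\((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+), (?P<detail>.*)"
)


def parse_xid_events(lines):
    """Return (uptime_seconds, pci_address, xid_code, detail) for each Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (
                    float(match["uptime"]),
                    match["pci"],
                    int(match["code"]),
                    match["detail"].strip(),
                )
            )
    return events


# Lines copied from the syslog excerpt in this post.
SAMPLE = [
    "Feb 1 22:30:16 Morpheus kernel: [ 1406.722233] NVRM: GPU at PCI:0000:01:00: GPU-ea04eb6a-2191-7f15-0278-ea7d9eda741a",
    "Feb 1 22:30:16 Morpheus kernel: [ 1406.722237] NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.",
    "Feb 1 22:30:16 Morpheus kernel: [ 1406.722239] NVRM: GPU at 0000:01:00.0 has fallen off the bus.",
    "Feb 1 22:30:16 Morpheus kernel: [ 1406.722523] NVRM: Xid (PCI:0000:01:00): 58, EDC 0",
]

if __name__ == "__main__":
    for uptime, pci, code, detail in parse_xid_events(SAMPLE):
        print(f"{uptime:.6f}s  {pci}  Xid {code}: {detail}")
```

Fed the lines above, it reports two events on the same GPU: Xid 79 first, then Xid 58.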

  1. Check temperatures
  2. Remove overclocking (including for CPU/System RAM/FSB/etc.)
  3. Reseat the GPU in the motherboard or place it in another slot
  4. Check your PSU - try using another one.

If nothing above helps, then I guess your videocard is faulty.
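
For the temperature check, one option is to log what `nvidia-smi` reports while a game is running, so the last readings before a blackout are on record. The following is only a sketch: it assumes `nvidia-smi` is on the PATH and that the card reports the `temperature.gpu`, `power.draw` and `clocks.sm` query fields. The helper names `parse_smi_line` and `read_gpu_stats` are made up for this example.

```python
import subprocess

# Fields requested from nvidia-smi; some cards report "[Not Supported]" for a field.
QUERY_FIELDS = ["temperature.gpu", "power.draw", "clocks.sm"]


def parse_smi_line(line):
    """Parse one line of `--format=csv,noheader,nounits` output into a dict.

    Values that are not numeric (e.g. "[Not Supported]") become None.
    """
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    stats = {}
    for field, value in zip(QUERY_FIELDS, values):
        try:
            stats[field] = float(value)
        except ValueError:
            stats[field] = None
    return stats


def read_gpu_stats():
    """Run nvidia-smi once and return one dict per GPU (needs the NVIDIA driver)."""
    output = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_smi_line(line) for line in output.splitlines() if line.strip()]
```

Calling `read_gpu_stats()` in a loop with a short sleep and appending the result to a file gives a simple record to look at after the next crash. The numbers by themselves won't prove a PSU or slot problem, but they can help rule overheating in or out.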

I agree with birdie. This sounds like a physical problem with the GPU or its connection to the rest of the system.

It also can’t hurt to run memtest86+ on the system. I don’t expect it to find anything relevant here, but it’s always a good idea to run it from time to time just in case.

I’m experiencing the same error running Ubuntu 16.10 with an EVGA GeForce GTX 1080.

But the problem only happens with some games. I’ve been playing Deus Ex: Human Revolution and Mad Max at maximum quality for hours without a problem, but some games, like Victor Vran and XCOM 2, hang with the same error described above.

I managed to fix this issue. It was a hardware problem, but not the card.

I have another computer with Mint 18 on it, so I put the card in there and installed Steam and a couple of games. It ran fine for the hour that I tested it.

So I decided to go for the low-hanging fruit and replaced my PSU. Since then it’s been stable and I haven’t had the blank screen issue again.

Thanks for the info. I moved the video card to another slot and Victor Vran didn’t crash (I need to test it a little more to be sure), but XCOM 2 recently crashed again, and this time it gave more info:

feb 28 23:00:45 robocudo3 kernel: NVRM: GPU at PCI:0000:02:00: GPU-b68402ed-8a1d-bd2b-a33c-7178c2ee5516
feb 28 23:00:45 robocudo3 kernel: NVRM: GPU Board Serial Number: 
feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x405848=0x80000000
feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: Shader Program Header 9 Error
feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x405840=0x82000200
feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ChID 0026, Class 0000c197, Offset 00002390, Data 00000000
feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 79, GPU has fallen off the bus.
feb 28 23:00:45 robocudo3 kernel: NVRM: GPU at 0000:02:00.0 has fallen off the bus.
feb 28 23:00:45 robocudo3 kernel: NVRM: GPU is on Board .
feb 28 23:00:45 robocudo3 kernel: NVRM: A GPU crash dump has been created. If possible, please run
                                  NVRM: nvidia-bug-report.sh as root to collect this data before
                                  NVRM: the NVIDIA kernel module is unloaded.

I have the NVIDIA 378.13 driver installed.
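
Since this log shows other Xid events before the card drops off the bus, it may be useful to list what preceded Xid 79 on each GPU. Below is a rough Python sketch for that. The function name `xids_before_bus_drop` is made up for this example, and the pattern only assumes the `NVRM: Xid (PCI:...): <code>, <text>` layout seen in the logs in this thread.

```python
import re
from collections import defaultdict

# Assumed layout: "NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ..."
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+), (?P<detail>.*)"
)


def xids_before_bus_drop(lines, drop_code=79):
    """Map each GPU that logged drop_code to the Xid codes it logged before it."""
    seen = defaultdict(list)  # every Xid code seen so far, per PCI address
    preceding = {}            # snapshot taken at the first drop_code per GPU
    for line in lines:
        match = XID_RE.search(line)
        if not match:
            continue
        pci, code = match["pci"], int(match["code"])
        if code == drop_code and pci not in preceding:
            preceding[pci] = list(seen[pci])
        seen[pci].append(code)
    return preceding


# Lines copied from the log excerpt in this post.
SAMPLE = [
    "feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x405848=0x80000000",
    "feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: Shader Program Header 9 Error",
    "feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ESR 0x405840=0x82000200",
    "feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 13, Graphics Exception: ChID 0026, Class 0000c197, Offset 00002390, Data 00000000",
    "feb 28 23:00:45 robocudo3 kernel: NVRM: Xid (PCI:0000:02:00): 79, GPU has fallen off the bus.",
]

if __name__ == "__main__":
    for pci, codes in xids_before_bus_drop(SAMPLE).items():
        print(pci, "Xids before bus drop:", codes)
```

On the excerpt above it shows four Xid 13 entries ahead of the Xid 79. Whether those are a cause or just a symptom of the same failure is not something the script can tell.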