GT 720 driver on Fedora hangs frequently in GNOME, EQ overflow

About one boot in four, when I start the machine and log into GNOME, all I get is a dark grey screen with a mouse cursor on it. When login does succeed, the card works pretty well for a while (anywhere from half an hour to many days) and then it suddenly hangs.

In the X log, there’s this:

(EE) [mi] EQ overflowing. Additional events will be discarded until existing events are processed.

then this:

(EE) [mi] mieq is NOT the cause. It is a victim.
(EE) [mi] EQ overflow continuing. 100 events have been dropped.

then this:

[ 39801.202] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x00001d2c, 0x00001d34)
[ 39808.202] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x00001d2c, 0x00001d34)

and this:

[ 39808.204] [mi] Increasing EQ size to 1024 to prevent dropped events.
[ 39808.205] [mi] EQ processing has resumed after 2177 dropped events.
[ 39808.205] [mi] This may be caused my a misbehaving driver monopolizing the server’s resources.

and finally more of this:

[ 39848.336] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x00001d2c, 0x00001d3c)
[ 39855.336] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x00001d2c, 0x00001d3c)
[ 40252.395] (WW) NVIDIA(0): WAIT (2-S, 17, 0x07e2, 0x00001d2c, 0x00001de0)
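
To see how the messages above line up in time, a small script can pull just the relevant lines out of the X log. This is only a rough sketch: the category names and the function name `find_hang_markers` are made up for illustration, and the substrings it looks for are simply copied from the excerpts above.

```python
import re

# Matches the optional "[ 39801.202]" prefix that the X server puts on log lines.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*")

# Substrings copied from the log excerpts; the category names are hypothetical.
MARKERS = {
    "NVIDIA(0): WAIT": "gpu_wait",
    "EQ overflow": "eq_overflow",
    "EQ processing has resumed": "eq_resumed",
}


def find_hang_markers(lines):
    """Return (timestamp_or_None, category, text) for each matching log line."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        match = TIMESTAMP.match(line)
        stamp = float(match.group(1)) if match else None
        text = line[match.end():] if match else line
        for needle, category in MARKERS.items():
            if needle in text:
                hits.append((stamp, category, text))
                break  # one category per line is enough
    return hits
```

It could be run as `find_hang_markers(open(path))`, where `path` is wherever the X log lives on the system. That is often `/var/log/Xorg.0.log`, but the location is an assumption and may differ depending on how X is started.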

The graphics system is essentially dead until the machine is reset. Occasionally I can kill the X server and restart it, but usually I need to shut the machine down completely, and sometimes it won’t even shut down and has to be hard-reset with the power button.

This is really frustrating, as the crash has happened three times so far tonight. :-(
nvidia-bug-report.log.gz (85.8 KB)

The errors here indicate that the GPU simply stopped processing new commands, but didn’t otherwise report an error. Please make sure that the GPU is firmly seated in the PCIe socket, that the heatsink is free of dust and debris, and that there is adequate airflow over the heatsink. You could also try running nvidia-settings and watching the thermal info page to see if the hangs are correlated with high GPU temperatures.
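
If watching the thermal page by hand is impractical, the temperature could also be logged to a file so the last readings before a hang survive a reset. The following is a minimal sketch, assuming the `GPUCoreTemp` attribute can be queried through `nvidia-settings -q ... -t` for this card and driver; the function names, the log file name, and the 5-second interval are arbitrary choices for illustration.

```python
import os
import subprocess
import time


def parse_temp(output):
    """Parse terse query output; return degrees C as an int, or None."""
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():
            return int(line)
    return None


def read_gpu_temp():
    """Query the core temperature of gpu:0, or return None on failure."""
    try:
        result = subprocess.run(
            ["nvidia-settings", "-q", "[gpu:0]/GPUCoreTemp", "-t"],
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.TimeoutExpired, OSError):
        # The query may block or fail once the GPU has hung.
        return None
    return parse_temp(result.stdout)


def log_temperatures(path="gpu-temp.log", interval=5.0):
    """Append one timestamped reading per interval until interrupted."""
    with open(path, "a") as log:
        while True:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {read_gpu_temp()}\n")
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before a possible hard reset
            time.sleep(interval)
```

Calling `log_temperatures()` from a terminal inside the X session and leaving it running should give a record to compare against the time of the next hang. A reading of `None` means the query failed or timed out, which may itself mark the moment the GPU stopped responding.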