Performance level throttling down to 0 triggers Xid 79 "GPU has fallen off the bus"

Hi,

I see this issue has been reported several times, and the answers are always the same:

  • your case is not properly cooled
  • your GPU is not well seated
  • your PSU is faulty

None of these apply to my machine.
Gaming on Windows 10 on the very same machine (same programs) runs flawlessly.

Xid 79 occurs after putting heavy load on the GPU (GTX 1080). Its temperature never exceeds 40 °C, as it is water-cooled.
PowerMizer shows that after the program is shut down, the performance level throttles down from 3 to 0 over a period of 20-30 seconds.
Once level 0 is reached, about 5 more seconds pass and the GPU falls off the bus.
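
The ramp-down described above could be logged next to the kernel log to see how it lines up with the Xid 79 message. Below is a minimal sketch, assuming nvidia-smi is on the PATH and only one GPU is installed. The log file name and the helper names are made up for illustration. Note that the P-state reported by nvidia-smi (P0 is the highest performance state) uses a different numbering than the PowerMizer levels shown in nvidia-settings.

```python
import os
import subprocess
import time

# Fields requested from nvidia-smi; all are standard --query-gpu properties.
QUERY = "timestamp,pstate,clocks.gr,temperature.gpu"


def parse_sample(line):
    """Split one CSV line from nvidia-smi into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(QUERY.split(","), values))


def read_sample():
    """Run nvidia-smi once and return the sample for the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def main(logfile="gpu_pstate.log", interval=1.0):
    """Append one sample per interval until nvidia-smi stops answering."""
    with open(logfile, "a") as log:
        while True:
            try:
                sample = read_sample()
            except (subprocess.CalledProcessError, FileNotFoundError, IndexError) as exc:
                log.write(f"nvidia-smi failed: {exc}\n")
                break
            log.write(",".join(sample.values()) + "\n")
            # Flush and sync so the last sample survives a hard hang.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval)


if __name__ == "__main__":
    main()
```

The last lines of the log can then be compared with the timestamp of the Xid 79 entry in dmesg or the journal.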

With each driver update (465.24) or kernel update (5.11), the crashes seem to become more frequent.
The usual pcie_aspm=off kernel parameter has no effect.

Please let me know if you need additional details.
My machine is a pre-built CS-9000015-EU.

Cheers,

nvidia-bug-report.log.gz (1.1 MB)

Please try setting the kernel parameter
intel_idle.max_cstate=1
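
On GRUB-based distributions this usually means adding the parameter to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, regenerating the GRUB configuration, and rebooting. Other boot loaders differ, so treat that as an assumption about your setup. After the reboot it is worth checking that the parameter actually reached the kernel. A minimal sketch of such a check follows; the helper name kernel_param is made up for illustration.

```python
from pathlib import Path


def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Returns None if the parameter is absent and "" if it is present
    without a value. If it appears more than once, the last one is returned.
    """
    found = None
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == name:
            found = value
    return found


if __name__ == "__main__":
    # Command line the running kernel was booted with.
    cmdline = Path("/proc/cmdline").read_text()
    for name in ("intel_idle.max_cstate", "pcie_aspm"):
        print(name, "=", kernel_param(cmdline, name))

    # Value the intel_idle driver is using, if the driver is present.
    sysfs = Path("/sys/module/intel_idle/parameters/max_cstate")
    if sysfs.exists():
        print("intel_idle max_cstate in effect:", sysfs.read_text().strip())
```

If the first line prints None, the parameter was not applied at boot, and the result of the test says nothing about C-states yet.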