GPU Titan Xp Fan Error


Our Titan Xp has run into fan errors many times. After restarting the server, it works fine again. Does anyone have any insight into this?

Error message is here:

Some system configuration details:

Ubuntu 16.04.3 LTS
CUDA version 9.0.176

nvidia-bug-report.log.gz (307 KB)

Can you check with a more current driver? If it still fails, please run as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.

Hi generix,

I have uploaded the nvidia-bug-report.log.gz file; could you please take a look? Other lab members are still using the current driver. If the problem turns out to be caused by the driver, we can update it then.

Looks like it starts when the driver hits an XID 62 and a display engine timeout. This might be because the Titan is also used to run X while under heavy CUDA load. Try disabling the X server after a reboot and check whether the issue still appears.
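
To check whether the XID 62 comes back after a change, a small script can scan the kernel log for Xid reports. This is only a rough sketch: it assumes the driver logs lines of the form "NVRM: Xid (<device>): <code>, ...", and the sample text below is made up for illustration, not taken from the attached report. The names XID_PATTERN, find_xid_events and SAMPLE_LOG are hypothetical.

```python
import re

# Assumed log format: "NVRM: Xid (PCI:0000:01:00): 62, ..."
XID_PATTERN = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+)")


def find_xid_events(log_text):
    """Return a list of (device, xid_code) tuples found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append((match.group(1), int(match.group(2))))
    return events


# Illustrative sample only; real input would be the output of dmesg
# or the contents of the kernel log file.
SAMPLE_LOG = (
    "[ 100.000000] NVRM: Xid (PCI:0000:01:00): 62, 0c8f(2a2c) 00000000 00000000\n"
    "[ 101.000000] usb 1-1: new high-speed USB device\n"
)

for device, code in find_xid_events(SAMPLE_LOG):
    print(f"{device}: Xid {code}")
```

In practice the text passed to find_xid_events would come from dmesg output saved to a file. Comparing the results with and without X running would show whether the Xid is tied to the display load.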