XID 79 error (GPU has fallen off the bus) occurs with TensorRT running on an RTX A4000 and X11 mainboard on Ubuntu 18.04
nvidia bug file:
nvidia-bug-report.log (2.0 MB)
System log:
BTW, I checked the GPU temperature and it is always normal, under 60C.
You’re getting an XID 79, which means the GPU has fallen off the bus. The most common reasons are overheating or lack of power. Monitor temperatures, reseat the power connectors and the card in its slot, and check or replace the PSU.
To check for power issues, you can lock the GPU clocks with nvidia-smi -lgc (run as root) so the card does not boost and cause power spikes, e.g.
nvidia-smi -lgc 300,1500
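While the clocks are locked, it may help to log temperature and power draw and to watch the kernel log for new Xid events. The following is only a rough sketch: the function names log_gpu_telemetry and extract_xid_codes are made up for illustration, and the sed pattern assumes the kernel log lines have the usual "NVRM: Xid (PCI:...): <code>, ..." shape, which may differ between driver versions.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print one CSV row per second with timestamp,
# GPU temperature, power draw and SM clock. Stop it with Ctrl-C.
log_gpu_telemetry() {
    nvidia-smi \
        --query-gpu=timestamp,temperature.gpu,power.draw,clocks.sm \
        --format=csv -l 1
}

# Hypothetical helper: read kernel log lines on stdin and print the
# numeric Xid code of every NVRM Xid line (assumed line format:
# "NVRM: Xid (PCI:...): <code>, ..."). Other lines are ignored.
extract_xid_codes() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\).*/\1/p'
}
```

Possible usage: run log_gpu_telemetry > gpu.csv in one terminal while the TensorRT workload runs, then check sudo dmesg | extract_xid_codes afterwards. If no Xid 79 shows up with locked clocks but it returns after sudo nvidia-smi -rgc (which resets the clocks), that would point towards a power problem rather than heat.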
Thank you for your reply. We are going to replace the PSU.