Unable to determine the device handle for GPU 0000:0A:00.0: GPU is lost. Reboot the system to recover this GPU

Running some deep learning experiments with PyTorch 1.6 crashed my TITAN V. Is anyone else having similar problems? Any suggestions for solving it?
The NVIDIA bug report is attached.
nvidia-bug-report.log.gz (237.0 KB)