1 out of 4 GPUs is suddenly gone

I was using 4 GPUs for deep learning.

One day, the message “Unable to determine the device handle for GPU 0000:0A:00.0: GPU is lost. Reboot the system to recover this GPU” popped up, and after rebooting I couldn’t find my 4th GPU…

I’m attaching the system log generated by nvidia-bug-report.sh. Please help…

nvidia-bug-report.log.gz (717.8 KB)