DGX-1 GPUs not accessible anymore

We are facing a critical issue with the GPUs on our DGX-1. None of the GPUs show up in the output of lspci | grep nvidia, so the OS cannot access them.
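
A note for anyone reproducing this check: lspci usually prints the vendor name as "NVIDIA Corporation", so a case-sensitive grep for lowercase "nvidia" may return nothing even when the devices do enumerate. Below is a minimal sketch that matches on NVIDIA's PCI vendor ID (10de) instead. The function name count_nvidia_devices is only illustrative and is not part of any tool.

```shell
# Count PCI devices whose vendor ID is 10de (NVIDIA).
# Reads `lspci -nn` style output on stdin, so it also works on a saved log.
count_nvidia_devices() {
    # grep -c prints the number of matching lines; `|| true` keeps the
    # exit status at 0 when the count is 0.
    grep -ci '\[10de:[0-9a-f]\{4\}\]' || true
}

# Usage on the live system (requires pciutils):
#   lspci -nn | count_nvidia_devices
# lspci can also filter by vendor ID directly:
#   lspci -d 10de:
```

If the count is still 0, the GPUs are not enumerating on the PCIe bus at all, which points below the driver level.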

OS: Ubuntu 16.04.3 LTS

The issue occurred after a GPU_Overtemp error caused the machine to shut down. When we restarted the machine, the GPUs were no longer accessible. Can someone suggest a fix? These are the entries from the SEL log:

GPU_Overtemp     | Temperature                 | State Asserted
PCIE Error       | Critical Interrupt          | Bus Fatal Error ; OEM Event Data2 code = 10h ; OEM Event Data3 code = 80h
N/A              | N/A                         | OEM defined = 86h 80h 04h 6Fh 21h 24h
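
To pull these entries again after a reboot, or to see whether the overtemp event repeats, something like the following could help. It assumes the log is read with ipmitool sel elist (the post does not say which tool produced the excerpt above), and the function name filter_gpu_sel_events is illustrative.

```shell
# Keep only SEL lines that mention the sensors named in the excerpt above.
# Reads SEL text on stdin, e.g. from `ipmitool sel elist` (assumed tool).
filter_gpu_sel_events() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -E 'GPU_Overtemp|PCIE Error' || true
}

# Usage (typically needs root and a working BMC interface):
#   sudo ipmitool sel elist | filter_gpu_sel_events
```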

How did you fix this? We are facing the same issue right now.