Unable to determine the device handle for GPU 0000:18:00.0: GPU is lost. Reboot the system to recover this GPU

Hi,
I am facing an issue with an NVIDIA RTX 2080 Ti. I have attached the error output below; please check and advise.

[root@hpc openmpi]# nvidia-smi
Unable to determine the device handle for GPU 0000:18:00.0: GPU is lost. Reboot the system to recover this GPU

[root@hpc openmpi]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

[root@hpc openmpi]# nvidia-smi -i 0
Unable to determine the device handle for GPU 0000:18:00.0: GPU is lost. Reboot the system to recover this GPU

[root@hpc openmpi]# nvidia-smi -i 3
Wed Jun 23 11:30:34 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   3  GeForce RTX 208...  Off  | 00000000:AF:00.0 Off |                  N/A |
|  0%   44C    P8    21W / 250W |      4MiB / 11019MiB | …

[root@hpc openmpi]# nvidia-smi -i 1
Wed Jun 23 11:30:27 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   1  GeForce RTX 208...  Off  | 00000000:3B:00.0  On |                  N/A |
| 36%   39C    P8    16W / 250W |    255MiB / 11019MiB | …

[root@hpc openmpi]# nvidia-smi -i 2
Wed Jun 23 11:30:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   2  GeForce RTX 208...  Off  | 00000000:86:00.0 Off |                  N/A |
| 36%   38C    P8    21W / 250W |      4MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
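
The per-index checks above could also be scripted, so that one lost GPU does not hide the status of the others. This is only a rough sketch: the helper names (`lost_gpu_bus_id`, `probe_gpu`, `probe_all`) are made up for illustration, the error wording is copied from the output in this post, and it assumes `nvidia-smi` is on the PATH. It only reports which indices respond; it does not diagnose why the GPU was lost.

```python
import re
import subprocess

# Error wording taken from the nvidia-smi output in this post (an assumption
# that other driver versions phrase it the same way).
LOST_RE = re.compile(r"device handle for GPU ([0-9A-Fa-f:.]+): GPU is lost")


def lost_gpu_bus_id(output):
    """Return the PCI bus ID named in a 'GPU is lost' message, or None."""
    match = LOST_RE.search(output)
    return match.group(1) if match else None


def probe_gpu(index, run=subprocess.run):
    """Run `nvidia-smi -i <index>` and return (responds, lost_bus_id).

    `run` is injectable so the function can be exercised without a GPU.
    """
    result = run(["nvidia-smi", "-i", str(index)], capture_output=True, text=True)
    text = (result.stdout or "") + (result.stderr or "")
    lost = lost_gpu_bus_id(text)
    responds = result.returncode == 0 and lost is None
    return responds, lost


def probe_all(indices, run=subprocess.run):
    """Probe each GPU index separately and collect the results in a dict."""
    return {i: probe_gpu(i, run=run) for i in indices}
```

On this machine, `probe_all(range(4))` would run the same four `nvidia-smi -i N` commands shown above, one per GPU index.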