Unable to determine the device handle for GPU 0000:04:00.0: Unknown Error

While I was running my model, it failed with an error saying the GPU is lost. When I then ran nvidia-smi in my terminal, it reported: Unable to determine the device handle for GPU 0000:04:00.0: Unknown Error.
Here is the full bug report:
nvidia-bug-report.log.gz (146.9 KB)

Here is a capture of nvidia-smi -l taken when my running model froze:
Screenshot from 2022-08-31 11-50-34

This is the output of dmesg | grep GPU:

[    6.093487] [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
[  746.393736] NVRM: GPU at PCI:0000:04:00: GPU-1a3d7ae0-0472-8841-7f36-3868e951e6ee
[  746.393747] NVRM: Xid (PCI:0000:04:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
[  746.393750] NVRM: GPU 0000:04:00.0: GPU has fallen off the bus.
[  746.393865] NVRM: GPU 0000:04:00.0: GPU serial number is \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff.
[  746.393880] NVRM: A GPU crash dump has been created. If possible, please run
[ 1162.350339] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1162.350412] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1162.350535] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1162.350602] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1162.350723] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1624.148957] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1624.148999] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1624.149067] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1624.149104] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
[ 1624.149174] nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
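
In case it helps anyone reading a log like the one above, here is a rough Python sketch that pulls the NVRM Xid lines out of dmesg text, so the Xid code (79 here) and its timestamp are easy to spot among the repeated nvidia-modeset errors. The function name parse_xid_events and the regular expression are my own, not part of any NVIDIA tool, and the pattern is only assumed to fit lines shaped like the one in this log.

```python
import re

# Assumed line shape, taken from the log in this post:
# [  746.393747] NVRM: Xid (PCI:0000:04:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
XID_PATTERN = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+NVRM: Xid "
    r"\((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+),\s*(?P<detail>.*)"
)


def parse_xid_events(dmesg_text):
    """Return (seconds_since_boot, pci_address, xid_code, detail) for each Xid line.

    Hypothetical helper for illustration; lines that do not match are skipped.
    """
    events = []
    for line in dmesg_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append((
                float(match.group("time")),
                match.group("pci"),
                int(match.group("code")),
                match.group("detail").strip(),
            ))
    return events


# Two lines copied from the log in this post; only the first is an Xid line.
sample_log = (
    "[  746.393747] NVRM: Xid (PCI:0000:04:00): 79, pid='<unknown>', "
    "name=<unknown>, GPU has fallen off the bus.\n"
    "[ 1162.350339] nvidia-modeset: ERROR: GPU:0: Failed detecting "
    "connected display devices\n"
)

for seconds, pci, code, detail in parse_xid_events(sample_log):
    print(f"{seconds:.6f}s  {pci}  Xid {code}: {detail}")
```

To run it against a live system, you could pass in the captured text of dmesg instead of sample_log; reading the kernel log may need root on some distributions.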

https://forums.developer.nvidia.com/t/unable-to-determine-the-device-handle-for-gpu-000000-0-gpu-is-lost-reboot-the-system-to-recover-this-gpu/176891/2?u=generix