This problem has been happening almost every day recently and is seriously affecting my work. A restart restores normal operation only briefly before the problem returns.
Kernel log:
May 25 23:40:57 xmu-NMR kernel: [ 1161.724285] NVRM: GPU at PCI:0000:89:00: GPU-4723a2af-e63f-10f1-256e-7b1bc2aa6692
May 25 23:40:57 xmu-NMR kernel: [ 1161.724288] NVRM: GPU Board Serial Number:
May 25 23:40:57 xmu-NMR kernel: [ 1161.724291] NVRM: Xid (PCI:0000:89:00): 79, pid=7147, GPU has fallen off the bus.
May 25 23:40:57 xmu-NMR kernel: [ 1161.724294] NVRM: GPU 0000:89:00.0: GPU has fallen off the bus.
May 25 23:40:57 xmu-NMR kernel: [ 1161.724295] NVRM: GPU 0000:89:00.0: GPU is on Board .
May 25 23:40:57 xmu-NMR kernel: [ 1161.724324] NVRM: A GPU crash dump has been created. If possible, please run
May 25 23:40:57 xmu-NMR kernel: [ 1161.724324] NVRM: nvidia-bug-report.sh as root to collect this data before
May 25 23:40:57 xmu-NMR kernel: [ 1161.724324] NVRM: the NVIDIA kernel module is unloaded.
nvidia-bug-report.log.gz (3.6 MB)