Random complete freezes with artifacts and Xid 62 errors

Sometimes it takes a week to happen, but sometimes it happens all day long.

The computer freezes completely, with pink artifacts all over the screens. When I reboot, it sometimes shows the same artifacts even on the UEFI boot screen.

Here is the output of a grep for NVRM in today's log.

fev 08 07:45:51 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 07:45:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, 0000(0000) 00000000 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000001
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=454, Ch 00000002
-----------------------------
fev 08 07:46:19 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:00:37 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:00:37 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=911, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:01:06 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:01:34 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:01:34 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=931, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:02:22 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:03:07 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:03:07 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=941, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:03:38 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:04:30 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:05:22 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:05:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=950, 0000(0000) 00000000 00000000
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 1 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 2 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 9 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: ESR 0x405840=0x82000206
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: ESR 0x405848=0x80000000
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=2092, Graphics Exception: ChID 0018, Class 0000c797, Offset 00000100, Data 00000000
fev 08 08:05:25 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 31, pid=2092, Ch 00000018, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x1_0400a000. Fault is of type FAULT_UNSUPPORTED_APERTURE ACCESS_TYPE_VIRT_READ
-----------------------------
fev 08 08:05:49 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:08:34 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:08:34 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=912, 0000(0000) 00000000 00000000
fev 08 08:08:42 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 8, pid=476, Channel 00000002
fev 08 08:08:44 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 31, pid=854, Ch 00000001, intr 00000000. MMU Fault: ENGINE HOST0 HUBCLIENT_ESC faulted @ 0x1_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
-----------------------------
fev 08 08:09:26 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:09:31 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x25:0x65:1451)
fev 08 08:09:31 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x25:0xffff:1451)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
-----------------------------
fev 08 08:10:32 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:14:31 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:14:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 44, pid=1799, Ch 00000010, intr 00000000
fev 08 08:14:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=1799, 0000(0000) 00000000 00000000
fev 08 08:15:00 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
fev 08 08:15:52 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022
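For anyone triaging a similar log, a small script can summarise which Xid codes show up in each boot. This is only a minimal sketch, not an NVIDIA tool: it assumes the log format shown above, treats every "loading NVIDIA UNIX" line as the start of a new boot, and the name `summarize_xids` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, ...
XID_RE = re.compile(r"NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): (\d+),")


def summarize_xids(lines):
    """Return one Counter of Xid codes per boot segment that logged any Xid."""
    segments = [Counter()]
    for line in lines:
        if "NVRM: loading NVIDIA UNIX" in line:
            # Assumption: the driver load banner marks a fresh boot.
            segments.append(Counter())
            continue
        match = XID_RE.search(line)
        if match:
            segments[-1][int(match.group(1))] += 1
    # Boots that logged no Xid at all are dropped from the summary.
    return [seg for seg in segments if seg]


# A few lines copied from the log above.
sample = [
    "fev 08 07:45:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, 0000(0000) 00000000 00000000",
    "fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000000",
    "fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000001",
    "fev 08 07:46:19 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  510.47.03  Mon Jan 24 22:58:54 UTC 2022",
    "fev 08 08:00:37 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=911, 0000(0000) 00000000 00000000",
]

for i, counts in enumerate(summarize_xids(sample), start=1):
    print(f"segment {i}: {dict(counts)}")
# segment 1: {62: 1, 45: 2}
# segment 2: {62: 1}
```

Feeding it the full grep output instead of `sample` gives a quick view of whether every crash includes Xid 62 or whether other codes dominate.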

Sometimes, after the Xid, I also get an “RT sched throttling” message or “nvidia-modeset: ERROR: GPU:0: Failed to idle DMA.”

I already ran memtest86 and cuda_memtest on this machine with no problem.

Here is the nvidia-bug-report, but it was run after a reboot.

nvidia-bug-report.log.gz (259.5 KB)

It rather looks like the GPU might be broken. Please try reseating it in its PCIe slot, then run gpu_burn to test it.

Thanks for the answer! I already tried reseating it a few times.

I tried changing my PXIE_16 mode from Auto to GEN3. It seemed to make things a little more stable (a few hours), but the crash happened again, this time with the errors in a different order.

This time it started with an Xid 13, then proceeded to an Xid 62, then Xid 45. It usually starts with an Xid 62.

fev 08 11:17:20 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=920, Graphics Exception: Shader Program Header 1 Error
fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=920, Graphics Exception: ESR 0x405840=0xa0000002
fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=920, Graphics Exception: ESR 0x405848=0x80000000
fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=2022, Graphics Exception: ChID 0010, Class 0000c797, Offset 00000100, Data 00000000
fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=2022, 0000(0000) 00000000 00000000
fev 08 11:17:21 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000000
fev 08 11:17:21 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000001
fev 08 11:17:21 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=454, Ch 00000002
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000008
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000009
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000010
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000011
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000012
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000013
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000014
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000015
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 00000018
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 00000019
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 0000001a
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000020
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000021
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000022
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000023
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000024
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000025
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7628, Ch 00000028
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7628, Ch 00000029
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7628, Ch 0000002a
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7771, Ch 00000030
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7771, Ch 00000031
fev 08 11:17:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7771, Ch 00000032
fev 08 11:17:25 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 31, pid=875, Ch 00000001, intr 00000000. MMU Fault: ENGINE HOST0 HUBCLIENT_ESC faulted @ 0x1_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
fev 08 11:17:27 gipsy-danger kernel: nvidia-modeset: ERROR: GPU:0: Failed to idle DMA.
fev 08 11:17:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 32, pid=2679, Channel ID 00000018 intr 80024000
fev 08 11:17:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 32, pid=2679, Channel ID 00000018 intr 00024000
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000000
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000001
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000009
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000011
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000012
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000013
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000014
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000015
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 00000018
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 00000019
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2679, Ch 0000001a
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000020
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000021
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000022
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000023
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000024
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=4681, Ch 00000025
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7628, Ch 00000029
fev 08 11:17:32 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=7628, Ch 0000002a
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000000
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000001
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000009
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000011
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000012
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000013
fev 08 11:17:33 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=2022, Ch 00000014
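The order of errors described above can be read off a log excerpt mechanically. The following is a minimal sketch, assuming the same line format as the log; `first_seen_order` is a hypothetical helper name, not part of any NVIDIA tooling.

```python
import re

# Captures the Xid number from lines like "NVRM: Xid (PCI:0000:07:00): 13, pid=..."
XID_RE = re.compile(r"NVRM: Xid \([^)]*\): (\d+),")


def first_seen_order(lines):
    """Return the distinct Xid codes in the order they first appear."""
    seen = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            code = int(match.group(1))
            if code not in seen:
                seen.append(code)
    return seen


# A few lines copied from the excerpt above.
excerpt = [
    "fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=920, Graphics Exception: Shader Program Header 1 Error",
    "fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=2022, 0000(0000) 00000000 00000000",
    "fev 08 11:17:21 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000000",
    "fev 08 11:17:21 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=875, Ch 00000001",
]

print(first_seen_order(excerpt))  # [13, 62, 45]
```

Running it once per crash makes it easy to compare which error tends to come first across several freezes.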

As for gpu_burn, I tried installing and running it through Docker, but it could not find my GPU. I installed nvidia-container-utils, but still no luck.

Here is a picture of the screen. There are pink, trace-like artifacts (with dots) all over the screen. The screen freezes the moment the artifacts show up, and a dozen seconds later the monitors might turn off as well.

I once managed to take a screenshot in this state, and it did not show the artifacts.

I really think the GPU is broken, especially since you get the same artifacts on the UEFI screen.

Thank you, I’ll try to return it to the vendor.

Is it possible that this is hardware-related but not GPU-related, such as a memory, CPU, or motherboard issue?

Unlikely. In those cases you’d see Xid 32 errors, AER messages, Xid 79, and spontaneous shutdowns/reboots. Xid 62 is pretty much limited to just the GPU.
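As a rough way to apply this rule of thumb to a log, the sketch below collects the lines matching the indicators named in this reply (Xid 32, Xid 79, and AER messages). The function name and both regular expressions are assumptions for illustration. In particular, matching the bare word "AER" is only a guess at how the kernel's PCIe Advanced Error Reporting messages appear in a given log.

```python
import re

XID_RE = re.compile(r"NVRM: Xid \([^)]*\): (\d+),")
# Assumption: AER messages contain the standalone word "AER".
AER_RE = re.compile(r"\bAER\b")
# The Xid codes this reply associates with problems outside the GPU.
PLATFORM_XIDS = {32, 79}


def platform_indicators(lines):
    """Collect log lines that match the non-GPU indicators listed above."""
    hits = []
    for line in lines:
        match = XID_RE.search(line)
        if match and int(match.group(1)) in PLATFORM_XIDS:
            hits.append(line)
        elif AER_RE.search(line):
            hits.append(line)
    return hits


# Two lines copied from the logs earlier in this thread.
sample = [
    "fev 08 11:17:20 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=2022, 0000(0000) 00000000 00000000",
    "fev 08 11:17:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 32, pid=2679, Channel ID 00000018 intr 80024000",
]

for line in platform_indicators(sample):
    print(line)
```

An empty result only means none of these particular patterns matched. It does not prove the rest of the system is healthy.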

I thought the artifacts showed up in the UEFI, but they do not. They appear just after the “Press F2 to enter UEFI” screen.

One thing I noticed is that if I just reboot (without powering everything off and on), it boots already bugged, with the artifacts, and fails to detect the GPU. But if I do a hard shutdown, it does not start bugged; the artifacts only appear after using the computer for a while.
