Sometimes a week goes by before it happens, but sometimes it happens repeatedly throughout the day.
The computer freezes completely, with pink artifacts all over the screens. When I reboot, the same artifacts sometimes show up even on the UEFI boot screen.
Here is the output of a grep for NVRM in today's log:
fev 08 07:45:51 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 07:45:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, 0000(0000) 00000000 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000001
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=454, Ch 00000002
-----------------------------
fev 08 07:46:19 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:00:37 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:00:37 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=911, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:01:06 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:01:34 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:01:34 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=931, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:02:22 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:03:07 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:03:07 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=941, 0000(0000) 00000000 00000000
-----------------------------
fev 08 08:03:38 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:04:30 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:05:22 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:05:22 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=950, 0000(0000) 00000000 00000000
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 1 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 2 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: Shader Program Header 9 Error
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: ESR 0x405840=0x82000206
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=950, Graphics Exception: ESR 0x405848=0x80000000
fev 08 08:05:23 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 13, pid=2092, Graphics Exception: ChID 0018, Class 0000c797, Offset 00000100, Data 00000000
fev 08 08:05:25 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 31, pid=2092, Ch 00000018, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x1_0400a000. Fault is of type FAULT_UNSUPPORTED_APERTURE ACCESS_TYPE_VIRT_READ
-----------------------------
fev 08 08:05:49 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:08:34 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:08:34 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=912, 0000(0000) 00000000 00000000
fev 08 08:08:42 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 8, pid=476, Channel 00000002
fev 08 08:08:44 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 31, pid=854, Ch 00000001, intr 00000000. MMU Fault: ENGINE HOST0 HUBCLIENT_ESC faulted @ 0x1_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:47 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:49 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:50 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000000
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000001
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=476, Ch 00000002
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000008
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=854, Ch 00000009
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000018
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 00000019
fev 08 08:08:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=3129, Ch 0000001a
-----------------------------
fev 08 08:09:26 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:09:31 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x25:0x65:1451)
fev 08 08:09:31 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x25:0xffff:1451)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:32 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:33 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x23:0xffff:1401)
fev 08 08:09:34 gipsy-danger kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
-----------------------------
fev 08 08:10:32 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:14:31 gipsy-danger kernel: NVRM: GPU at PCI:0000:07:00: GPU-5c835339-45df-c2fe-a127-cafd24e37f17
fev 08 08:14:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 44, pid=1799, Ch 00000010, intr 00000000
fev 08 08:14:31 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=1799, 0000(0000) 00000000 00000000
fev 08 08:15:00 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
fev 08 08:15:52 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
Sometimes, after the Xid errors, I also get “RT sched throttling” or “nvidia-modeset: ERROR: GPU:0: Failed to idle DMA.”
I have already run memtest86 and cuda_memtest on this machine, and neither reported any problem.
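
To make a log like the one above easier to compare between days, the Xid codes can be tallied with a small script. This is only a minimal sketch in Python: the function name `count_xids` and the regular expression are assumptions based on the line format in the grep output, not part of any NVIDIA tool, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Assumed line format, taken from the grep output:
#   ... kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, ...
XID_RE = re.compile(r"NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): (\d+),")


def count_xids(lines):
    """Return a Counter mapping each Xid code (int) to its number of occurrences."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# A few lines copied from the log for illustration.
SAMPLE = """\
fev 08 07:45:51 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 62, pid=861, 0000(0000) 00000000 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000000
fev 08 07:45:55 gipsy-danger kernel: NVRM: Xid (PCI:0000:07:00): 45, pid=803, Ch 00000001
fev 08 08:10:32 gipsy-danger kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 510.47.03 Mon Jan 24 22:58:54 UTC 2022
""".splitlines()

if __name__ == "__main__":
    # Print one line per Xid code, lowest code first.
    for code, n in sorted(count_xids(SAMPLE).items()):
        print(f"Xid {code}: {n}")
```

The same function can be fed the full grep output saved to a file, for example `count_xids(open("nvrm.log"))`, where `nvrm.log` is a hypothetical file name.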