A couple of weeks ago I upgraded from a Ryzen 2700X to a Ryzen 3950X, and since then I have been seeing regular freezes on my Ubuntu setup. The system has two RTX 2080 Tis and 64GB of RAM running at 2100MHz on an X570 board.
Checking the logs reveals the following:
kernel: [82370.884942] NVRM: GPU at PCI:0000:09:00: GPU-5611c32e-db74-0e2b-6dfe-d46e8112e337
kernel: [82370.884947] NVRM: GPU Board Serial Number:
kernel: [82370.884955] NVRM: Xid (PCI:0000:09:00): 61, pid=1562, 0cec(3098) 00000000 00000000
kernel: [82395.164640] GpuWatchdog[6657]: segfault at 0 ip 000055e6d15f5ecd sp 00007fdd7cccf6d0 error 6 in chrome[55e6ccf49000+785a000]
kernel: [82395.164648] Code: 00 79 09 48 8b 7d b0 e8 b1 94 6c fe c7 45 b0 aa aa aa aa 0f ae f0 41 8b 84 24 e0 00 00 00 89 45 b0 48 8d 7d b0 e8 f3 59 ba fb 04 25 00 00 00 00 37 13 00 00 48 83 c4 38 5b 41 5c 41 5d 41 5e
kernel: [82395.192542] GpuWatchdog[6672]: segfault at 0 ip 0000560b81c25479 sp 00007fe9110c8680 error 6 in slack[560b7e2e0000+5caf000]
Could this be a Ryzen bug? I ran kill-ryzen for a while but did not see any segfaults. Or could it be the RAM? It has been a bit flaky for me, as I am running two different 32GB kits at a lower clock speed.
The whole thing only started after I installed the Ryzen 3950X, though. It happens mostly in low-compute situations, but I have also observed segfaults while running CUDA Python code.
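To check whether the segfaults always follow an Xid event, the log could be scanned with something like the rough Python sketch below. The two regexes are assumptions modelled only on the lines quoted above, so other driver or kernel versions may format these messages differently. The function names and the 60-second window are made up for illustration.

```python
import re

# Both patterns are assumptions based on the log lines quoted in this post.
XID_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] NVRM: Xid \((?P<dev>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),"
)
SEGV_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"segfault at .* in (?P<binary>[^\[\s]+)\["
)


def parse_events(lines):
    """Return (kind, timestamp, name, number) tuples for Xid and segfault lines.

    For an Xid event, name is the PCI device and number is the Xid code.
    For a segfault, name is the binary and number is the pid.
    """
    events = []
    for line in lines:
        m = XID_RE.search(line)
        if m:
            events.append(("xid", float(m["ts"]), m["dev"], int(m["xid"])))
            continue
        m = SEGV_RE.search(line)
        if m:
            events.append(("segfault", float(m["ts"]), m["binary"], int(m["pid"])))
    return events


def segfaults_after_xid(events, window=60.0):
    """Pair each Xid event with the segfaults logged up to `window` seconds later."""
    xids = [e for e in events if e[0] == "xid"]
    segvs = [e for e in events if e[0] == "segfault"]
    return [
        (x, [s for s in segvs if 0.0 <= s[1] - x[1] <= window])
        for x in xids
    ]
```

Calling `segfaults_after_xid(parse_events(open(path)))` on a saved copy of the kernel log, where `path` is wherever that copy lives, returns one entry per Xid event together with the segfaults that followed it. If segfaults also show up with no Xid before them, that would point away from the GPU driver and more towards the RAM or CPU.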