Xorg crash after GDM login

Hello.
My Xorg server crashes when I try to log in through GDM. If I retry, login eventually works, but it takes at least 3-4 attempts. It also crashes when waking up from sleep.

okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) Backtrace:
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 0: /usr/lib/Xorg (dri3_send_open_reply+0xdd) [0x562a220c74bd]
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 1: /usr/lib/libc.so.6 (__sigaction+0x50) [0x7f31c2e45710]
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) unw_get_proc_name failed: no unwind info found [-10]
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 2: /usr/lib/nvidia/xorg/libglxserver_nvidia.so (?+0x0) [0x7f31bb4c81e5]
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE)
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) Segmentation fault at address 0x98
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE)
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: Fatal server error:
okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) Caught signal 11 (Segmentation fault). Server aborting


[🡕] Process 2793 (Xorg) of user 1000 dumped core.
Stack trace of thread 2793:
 #0  0x00007f31c2e9583c n/a (libc.so.6 + 0x8e83c)
 #1  0x00007f31c2e45668 raise (libc.so.6 + 0x3e668)
 #2  0x00007f31c2e2d4b8 abort (libc.so.6 + 0x264b8)
 #3  0x0000562a220bf5a0 OsAbort (Xorg + 0x1535a0)
 #4  0x0000562a220c0a0b FatalError (Xorg + 0x154a0b)
 #5  0x0000562a220c7516 n/a (Xorg + 0x15b516)
 #6  0x00007f31c2e45710 n/a (libc.so.6 + 0x3e710)
 #7  0x00007f31bb4c81e5 n/a (libglxserver_nvidia.so + 0x8c81e5)
 ELF object binary architecture: AMD x86-64

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti     Off | 00000000:10:00.0  On |                  N/A |
|  0%   48C    P8              27W / 200W |   1154MiB /  8192MiB |     26%      Default |
|                                         |                      |                  N/A |

Linux desktop 6.5.6-arch2-1 #1 SMP PREEMPT_DYNAMIC Sat, 07 Oct 2023 08:14:55 +0000 x86_64 GNU/Linux
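
For anyone comparing several of these crashes, the `(EE)` backtrace lines from the journal can be reduced to a short list of frames. The following is a minimal sketch, assuming the log lines have the same shape as the ones pasted above; `FRAME_RE` and `parse_backtrace` are hypothetical names chosen for illustration, not part of any Xorg or NVIDIA tool.

```python
import re

# Matches lines such as:
#   (EE) 0: /usr/lib/Xorg (dri3_send_open_reply+0xdd) [0x562a220c74bd]
FRAME_RE = re.compile(
    r"\(EE\)\s+(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<symbol>[^+)]+)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)

def parse_backtrace(lines):
    """Return (index, module, symbol, offset) for every backtrace frame found."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:  # non-frame lines (e.g. unwinder warnings) are skipped
            frames.append(
                (int(match["index"]), match["module"], match["symbol"], match["offset"])
            )
    return frames

# Sample input copied from the log in this post.
SAMPLE = [
    "okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) Backtrace:",
    "okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 0: /usr/lib/Xorg (dri3_send_open_reply+0xdd) [0x562a220c74bd]",
    "okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 1: /usr/lib/libc.so.6 (__sigaction+0x50) [0x7f31c2e45710]",
    "okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) unw_get_proc_name failed: no unwind info found [-10]",
    "okt 09 08:00:06 desktop /usr/lib/gdm-x-session[2793]: (EE) 2: /usr/lib/nvidia/xorg/libglxserver_nvidia.so (?+0x0) [0x7f31bb4c81e5]",
]

for frame in parse_backtrace(SAMPLE):
    print(frame)
```

Grouping crashes by the module of the last frame is one way to see whether they all end in the same library or, as noted later in the thread, in a variety of places.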

@IvanV
Could you please share an nvidia-bug-report.log.gz captured while the system is in the repro state, for triage purposes?

nvidia-bug-report.log.gz (396.4 KB)
Here it is.

Hey @amrits, any update? Is anything else needed from my side? Thanks.

Can you please try running memtest86+ on this system? Looking at your bug report log I see a variety of crashes in different places.

Hey @aplattner. I did 2 full runs of Memtest86 without any errors. But I may have found the root cause: my CPU, a Ryzen 5500, supports only PCIe Gen3, but my BIOS was set to Gen4. I changed it to Gen3 and so far there are no issues. I’ll report back in a few days.
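
To confirm what the link actually negotiated after a BIOS change like this, the PCI sysfs attributes `current_link_speed` and `max_link_speed` can be read for the GPU. This is a minimal sketch, assuming a Linux system with sysfs mounted at `/sys`; the helper names `smi_bus_id_to_sysfs` and `pcie_link_speeds` are hypothetical, and the bus ID is the one shown in the nvidia-smi output above.

```python
from pathlib import Path

def smi_bus_id_to_sysfs(bus_id: str) -> str:
    """Convert an nvidia-smi bus ID (8-digit domain) to the sysfs form (4-digit domain)."""
    domain, rest = bus_id.split(":", 1)
    return f"{domain[-4:]}:{rest}".lower()

def pcie_link_speeds(bus_id: str, sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Read the current and maximum PCIe link speed reported for a device."""
    device_dir = Path(sysfs_root) / smi_bus_id_to_sysfs(bus_id)
    return {
        attr: (device_dir / attr).read_text().strip()
        for attr in ("current_link_speed", "max_link_speed")
    }

# Usage on the affected machine (not executed here):
#   print(pcie_link_speeds("00000000:10:00.0"))
```

For reference, PCIe Gen3 runs at 8.0 GT/s per lane and Gen4 at 16.0 GT/s. The current speed may read lower than the maximum while the GPU is idle, so it is probably best checked under load.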

It’s still crashing, sometimes even when starting Discord, opening a browser, or during a game. The only thing that helps is switching to Wayland. I even tried turning off XMP.

I ran extensive benchmarks of the CPU, RAM and GPU with no crash. It only occurs when I’m trying to launch something: sometimes it’s Slack, Discord, Steam, or a VM. I also had a few crashes when clicking on a video on YouTube.

I tried swapping the CPU (5500 to 3100) and using different RAM. Still the same issue. I also created a fresh user with a clean GNOME setup: same issue.

I haven’t found a solution, but I suspect it is a GNOME-related bug. When running GNOME on Wayland, everything works perfectly. I tried KDE on Xorg and had no crashes. Even GNOME Classic seems fine, so I’m not sure where exactly I should report this, or whether it’s an Xorg, GNOME or NVIDIA problem.

Hey @amrits @aplattner. I know you are probably not checking this anymore, but I managed to fix the Xorg crashes by removing nvidia-drm.modeset=1 from the GRUB kernel command line, in case that is helpful. For reference, I had the NVIDIA modules in my initramfs: MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm).
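
To check whether the parameter is really gone after regenerating the GRUB config and rebooting, the running kernel's command line can be inspected. This is a minimal sketch, assuming a Linux system where `/proc/cmdline` is readable; `modeset_setting` and `read_cmdline` are hypothetical helper names.

```python
def modeset_setting(cmdline: str):
    """Return the last value given for nvidia-drm.modeset, or None if it is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent,
        # so nvidia_drm.modeset=1 counts as the same setting.
        if sep and key.replace("_", "-") == "nvidia-drm.modeset":
            value = val
    return value

def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()

# Usage on the affected machine (not executed here):
#   print(modeset_setting(read_cmdline()))
```

A result of `None` only means the option is not on the kernel command line. It could still be set elsewhere, for example through a modprobe configuration file, so this check covers just the change described above.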