Random Xid 61 and Xorg lock-up

I can also provide ssh access to my machine… please DM me to arrange this.

Has anyone solved this issue yet?

This issue happens a lot less with Version 440.59 - I left some information about installing it on https://unix.stackexchange.com/a/566788/78904 - and today Software Updater told me about a new version 440.65 being ready to update. So keeping my fingers crossed.

Running 440.59. On a whim I decided to upgrade the one Dell monitor that was giving me screen issues: on reboot it would fail to show the login screen, and after a screen lock it would fail to wake up. After the upgrade these issues went away. I’ve had 1 screen freeze since then, over ~19 days. So maybe these issues are also device related?

I was able to reproduce Xid 61 with only Opera running, and Trimps being the only tab open. It took about 11 hours after starting Opera to reproduce. However system uptime at that point was just under three days and I had run other programs earlier, so this is not a pure minimal repro. Driver version is 440.59 and I had a DVI monitor connected this time.

OK, so I think progress was made in one of the recent releases, since I was able to recover (or this was just a fluke). I’m currently running kernel 5.5.8 and nvidia 440.64, and my machine froze up again. This time, though, I was able to close Chrome and recover. I checked the logs and I now get multiple Xid 61 errors, along with stack traces involving libGLX.so and libnvidia-glcore. So I think this bug is definitely related to hardware acceleration, as we suspected. I could reopen Chrome afterwards, but it’s a bit slow at first, as I think a couple of my tabs try to use some GL feature of the card and crash.

Mar 16 08:30:28  kernel: audit: type=1131 audit(1584372628.597:1951): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-4131663-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:28  audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-4131663-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:28  systemd[1]: systemd-coredump@5-4131663-0.service: Succeeded.
Mar 16 08:30:28  systemd-coredump[4131664]: Process 4131579 (chrome) of user 1000 dumped core.
                                                      
                                                      Stack trace of thread 4131580:
                                                      #0  0x0000564718111132 n/a (chrome + 0x5a39132)
                                                      #1  0x0000564716b9bb2a n/a (chrome + 0x44c3b2a)
                                                      #2  0x0000564716bacf01 n/a (chrome + 0x44d4f01)
                                                      #3  0x0000564716bacd07 n/a (chrome + 0x44d4d07)
                                                      #4  0x0000564716b66d78 n/a (chrome + 0x448ed78)
                                                      #5  0x0000564716bad938 n/a (chrome + 0x44d5938)
                                                      #6  0x0000564716b85ca5 n/a (chrome + 0x44adca5)
                                                      #7  0x0000564716bc1bae n/a (chrome + 0x44e9bae)
                                                      #8  0x0000564716bfd1b8 n/a (chrome + 0x45251b8)
                                                      #9  0x00007f78c82b946f start_thread (libpthread.so.0 + 0x946f)
                                                      #10 0x00007f78c6d873d3 __clone (libc.so.6 + 0xff3d3)
                                                      
                                                      Stack trace of thread 4131579:
                                                      #0  0x00007f78c6d7e2eb ioctl (libc.so.6 + 0xf62eb)
                                                      #1  0x00007f78c30f87bc n/a (libnvidia-glcore.so.440.64 + 0x12d77bc)
                                                      #2  0x00007f78c30f97a7 n/a (libnvidia-glcore.so.440.64 + 0x12d87a7)
                                                      #3  0x00007f78c30fab90 n/a (libnvidia-glcore.so.440.64 + 0x12d9b90)
                                                      #4  0x00007f78c2cf0130 n/a (libnvidia-glcore.so.440.64 + 0xecf130)
                                                      #5  0x00007f78c2cb7fb7 n/a (libnvidia-glcore.so.440.64 + 0xe96fb7)
                                                      #6  0x00007f78c2cb8651 n/a (libnvidia-glcore.so.440.64 + 0xe97651)
                                                      #7  0x00007f78c2ca4d52 n/a (libnvidia-glcore.so.440.64 + 0xe83d52)
                                                      #8  0x00007f78c2c5ae17 n/a (libnvidia-glcore.so.440.64 + 0xe39e17)
                                                      #9  0x00007f78c3de57a0 n/a (libGLX_nvidia.so.0 + 0x4d7a0)
                                                      #10 0x00007f78c3e13633 n/a (libGLX_nvidia.so.0 + 0x7b633)
                                                      #11 0x00007f78c40bec52 n/a (libGLX.so.0 + 0x14c52)
                                                      #12 0x00007f78c40c51ef n/a (libGLX.so.0 + 0x1b1ef)
                                                      #13 0x00007f78c40c5abd n/a (libGLX.so.0 + 0x1babd)
                                                      #14 0x0000564717a37aa0 n/a (chrome + 0x535faa0)
                                                      #15 0x0000564714a66c31 n/a (chrome + 0x238ec31)
                                                      #16 0x000056471810f350 n/a (chrome + 0x5a37350)
                                                      #17 0x000056471810ed8c n/a (chrome + 0x5a36d8c)
                                                      #18 0x000056471ab2f32c n/a (chrome + 0x845732c)
                                                      #19 0x0000564716741126 n/a (chrome + 0x4069126)
                                                      #20 0x000056471679212c n/a (chrome + 0x40ba12c)
                                                      #21 0x0000564716740371 n/a (chrome + 0x4068371)
                                                      #22 0x00005647144c261f ChromeMain (chrome + 0x1dea61f)
                                                      #23 0x00007f78c6caf023 __libc_start_main (libc.so.6 + 0x27023)
                                                      #24 0x00005647141ca9ea _start (chrome + 0x1af29ea)
Mar 16 08:30:28  kernel: audit: type=1130 audit(1584372628.327:1950): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-4131663-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:28  audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-4131663-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:28  systemd[1]: Started Process Core Dump (PID 4131663/UID 0).
Mar 16 08:30:26  kernel: audit: type=1701 audit(1584372626.223:1949): auid=1000 uid=1000 gid=1000 ses=2 pid=4131579 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:30:26  kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 5f 4b fb be 01 00 00 00 ba 03 00 00 00 e8 be 17 a6 fe <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 4c 75 b9 03 01 80 7d 8f 00
Mar 16 08:30:26  kernel: GpuWatchdog[4131580]: segfault at 0 ip 0000564718111132 sp 00007f78c4a51600 error 6 in chrome[5647141ca000+7287000]
Mar 16 08:30:26  audit[4131579]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 pid=4131579 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:30:16  kernel: audit: type=1131 audit(1584372616.233:1948): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-4131576-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:16  audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-4131576-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:16  systemd[1]: systemd-coredump@4-4131576-0.service: Succeeded.
Mar 16 08:30:16  systemd-coredump[4131577]: Process 4131494 (chrome) of user 1000 dumped core.
                                                      
                                                      Stack trace of thread 4131495:
                                                      #0  0x000055f01e8d9132 n/a (chrome + 0x5a39132)
                                                      #1  0x000055f01d363b2a n/a (chrome + 0x44c3b2a)
                                                      #2  0x000055f01d374f01 n/a (chrome + 0x44d4f01)
                                                      #3  0x000055f01d374d07 n/a (chrome + 0x44d4d07)
                                                      #4  0x000055f01d32ed78 n/a (chrome + 0x448ed78)
                                                      #5  0x000055f01d375938 n/a (chrome + 0x44d5938)
                                                      #6  0x000055f01d34dca5 n/a (chrome + 0x44adca5)
                                                      #7  0x000055f01d389bae n/a (chrome + 0x44e9bae)
                                                      #8  0x000055f01d3c51b8 n/a (chrome + 0x45251b8)
                                                      #9  0x00007fdb5a7fd46f start_thread (libpthread.so.0 + 0x946f)
                                                      #10 0x00007fdb592cb3d3 __clone (libc.so.6 + 0xff3d3)
                                                      
                                                      Stack trace of thread 4131494:
                                                      #0  0x00007fdb592c22eb ioctl (libc.so.6 + 0xf62eb)
                                                      #1  0x00007fdb5563c7bc n/a (libnvidia-glcore.so.440.64 + 0x12d77bc)
                                                      #2  0x00007fdb5563d7a7 n/a (libnvidia-glcore.so.440.64 + 0x12d87a7)
                                                      #3  0x00007fdb5563eb90 n/a (libnvidia-glcore.so.440.64 + 0x12d9b90)
                                                      #4  0x00007fdb55234130 n/a (libnvidia-glcore.so.440.64 + 0xecf130)
                                                      #5  0x00007fdb551fbfb7 n/a (libnvidia-glcore.so.440.64 + 0xe96fb7)
                                                      #6  0x00007fdb551fc651 n/a (libnvidia-glcore.so.440.64 + 0xe97651)
                                                      #7  0x00007fdb551e8d52 n/a (libnvidia-glcore.so.440.64 + 0xe83d52)
                                                      #8  0x00007fdb5519ee17 n/a (libnvidia-glcore.so.440.64 + 0xe39e17)
                                                      #9  0x00007fdb563297a0 n/a (libGLX_nvidia.so.0 + 0x4d7a0)
                                                      #10 0x00007fdb56357633 n/a (libGLX_nvidia.so.0 + 0x7b633)
                                                      #11 0x00007fdb56602c52 n/a (libGLX.so.0 + 0x14c52)
                                                      #12 0x00007fdb566091ef n/a (libGLX.so.0 + 0x1b1ef)
                                                      #13 0x00007fdb56609abd n/a (libGLX.so.0 + 0x1babd)
                                                      #14 0x000055f01e1ffaa0 n/a (chrome + 0x535faa0)
                                                      #15 0x000055f01b22ec31 n/a (chrome + 0x238ec31)
                                                      #16 0x000055f01e8d7350 n/a (chrome + 0x5a37350)
                                                      #17 0x000055f01e8d6d8c n/a (chrome + 0x5a36d8c)
                                                      #18 0x000055f0212f732c n/a (chrome + 0x845732c)
                                                      #19 0x000055f01cf09126 n/a (chrome + 0x4069126)
                                                      #20 0x000055f01cf5a12c n/a (chrome + 0x40ba12c)
                                                      #21 0x000055f01cf08371 n/a (chrome + 0x4068371)
                                                      #22 0x000055f01ac8a61f ChromeMain (chrome + 0x1dea61f)
                                                      #23 0x00007fdb591f3023 __libc_start_main (libc.so.6 + 0x27023)
                                                      #24 0x000055f01a9929ea _start (chrome + 0x1af29ea)
Mar 16 08:30:15  kernel: audit: type=1130 audit(1584372615.963:1947): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-4131576-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:15  audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-4131576-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:15  systemd[1]: Started Process Core Dump (PID 4131576/UID 0).
Mar 16 08:30:13  audit[4131494]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 pid=4131494 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:30:13  kernel: audit: type=1701 audit(1584372613.897:1946): auid=1000 uid=1000 gid=1000 ses=2 pid=4131494 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:30:13  kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 5f 4b fb be 01 00 00 00 ba 03 00 00 00 e8 be 17 a6 fe <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 4c 75 b9 03 01 80 7d 8f 00
Mar 16 08:30:13  kernel: GpuWatchdog[4131495]: segfault at 0 ip 000055f01e8d9132 sp 00007fdb56f95600 error 6 in chrome[55f01a992000+7287000]
Mar 16 08:30:03  kernel: audit: type=1131 audit(1584372603.906:1945): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@3-4131479-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:03  audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@3-4131479-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:03  systemd[1]: systemd-coredump@3-4131479-0.service: Succeeded.
Mar 16 08:30:03  systemd-coredump[4131480]: Process 4131018 (chrome) of user 1000 dumped core.
                                                      
                                                      Stack trace of thread 4131035:
                                                      #0  0x0000563c7e69c132 n/a (chrome + 0x5a39132)
                                                      #1  0x0000563c7d126b2a n/a (chrome + 0x44c3b2a)
                                                      #2  0x0000563c7d137f01 n/a (chrome + 0x44d4f01)
                                                      #3  0x0000563c7d137d07 n/a (chrome + 0x44d4d07)
                                                      #4  0x0000563c7d0f1d78 n/a (chrome + 0x448ed78)
                                                      #5  0x0000563c7d138938 n/a (chrome + 0x44d5938)
                                                      #6  0x0000563c7d110ca5 n/a (chrome + 0x44adca5)
                                                      #7  0x0000563c7d14cbae n/a (chrome + 0x44e9bae)
                                                      #8  0x0000563c7d1881b8 n/a (chrome + 0x45251b8)
                                                      #9  0x00007f614823e46f start_thread (libpthread.so.0 + 0x946f)
                                                      #10 0x00007f6146d0c3d3 __clone (libc.so.6 + 0xff3d3)
                                                      
                                                      Stack trace of thread 4131018:
                                                      #0  0x00007f6146d032eb ioctl (libc.so.6 + 0xf62eb)
                                                      #1  0x00007f614307d7bc n/a (libnvidia-glcore.so.440.64 + 0x12d77bc)
                                                      #2  0x00007f614307e7a7 n/a (libnvidia-glcore.so.440.64 + 0x12d87a7)
                                                      #3  0x00007f614307fb90 n/a (libnvidia-glcore.so.440.64 + 0x12d9b90)
                                                      #4  0x00007f6142c75130 n/a (libnvidia-glcore.so.440.64 + 0xecf130)
                                                      #5  0x00007f6142c3cfb7 n/a (libnvidia-glcore.so.440.64 + 0xe96fb7)
                                                      #6  0x00007f6142c3d651 n/a (libnvidia-glcore.so.440.64 + 0xe97651)
                                                      #7  0x00007f6142c29d52 n/a (libnvidia-glcore.so.440.64 + 0xe83d52)
                                                      #8  0x00007f6142bdfe17 n/a (libnvidia-glcore.so.440.64 + 0xe39e17)
                                                      #9  0x00007f6143d6a7a0 n/a (libGLX_nvidia.so.0 + 0x4d7a0)
                                                      #10 0x00007f6143d98633 n/a (libGLX_nvidia.so.0 + 0x7b633)
                                                      #11 0x00007f6144043c52 n/a (libGLX.so.0 + 0x14c52)
                                                      #12 0x00007f614404a1ef n/a (libGLX.so.0 + 0x1b1ef)
                                                      #13 0x00007f614404aabd n/a (libGLX.so.0 + 0x1babd)
                                                      #14 0x0000563c7dfc2aa0 n/a (chrome + 0x535faa0)
                                                      #15 0x0000563c7aff1c31 n/a (chrome + 0x238ec31)
                                                      #16 0x0000563c7e69a350 n/a (chrome + 0x5a37350)
                                                      #17 0x0000563c7e699d8c n/a (chrome + 0x5a36d8c)
                                                      #18 0x0000563c810ba32c n/a (chrome + 0x845732c)
                                                      #19 0x0000563c7cccc126 n/a (chrome + 0x4069126)
                                                      #20 0x0000563c7cd1d12c n/a (chrome + 0x40ba12c)
                                                      #21 0x0000563c7cccb371 n/a (chrome + 0x4068371)
                                                      #22 0x0000563c7aa4d61f ChromeMain (chrome + 0x1dea61f)
                                                      #23 0x00007f6146c34023 __libc_start_main (libc.so.6 + 0x27023)
                                                      #24 0x0000563c7a7559ea _start (chrome + 0x1af29ea)
Mar 16 08:30:03  kernel: audit: type=1130 audit(1584372603.636:1944): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@3-4131479-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:03  audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@3-4131479-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 08:30:03  systemd[1]: Started Process Core Dump (PID 4131479/UID 0).
Mar 16 08:30:01  kernel: audit: type=1701 audit(1584372601.573:1943): auid=1000 uid=1000 gid=1000 ses=2 pid=4131018 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:30:01  kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 5f 4b fb be 01 00 00 00 ba 03 00 00 00 e8 be 17 a6 fe <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 4c 75 b9 03 01 80 7d 8f 00
Mar 16 08:30:01  kernel: GpuWatchdog[4131035]: segfault at 0 ip 0000563c7e69c132 sp 00007f61449d6600 error 6 in chrome[563c7a755000+7287000]
Mar 16 08:30:01  audit[4131018]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 pid=4131018 comm="GpuWatchdog" exe="/opt/google/chrome/chrome" sig=11 res=1
Mar 16 08:27:17  kernel: NVRM: Xid (PCI:0000:0b:00): 61, pid=365, 0cec(3098) 00000000 00000000
Mar 16 08:27:17  kernel: NVRM: GPU Board Serial Number:
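
For anyone who wants to pull the Xid events out of a long journal like the one above, here is a minimal Python sketch. It assumes the line format seen in this thread (`NVRM: Xid (PCI:...): <code>, pid=<pid>, ...`); other driver versions may print the fields differently. The names `XID_RE`, `XidEvent` and `parse_xid` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches the NVRM Xid line format quoted in this thread (an assumption;
# other driver versions may format these lines differently).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), pid=(?P<pid>\d+)"
)


class XidEvent(NamedTuple):
    pci: str  # PCI address of the GPU, e.g. "0000:0b:00"
    xid: int  # Xid error code, e.g. 61
    pid: int  # process ID reported by the driver


def parse_xid(line: str) -> Optional[XidEvent]:
    """Return the Xid event found in a log line, or None if there is none."""
    m = XID_RE.search(line)
    if m is None:
        return None
    return XidEvent(m.group("pci"), int(m.group("xid")), int(m.group("pid")))


if __name__ == "__main__":
    line = ("Mar 16 08:27:17  kernel: NVRM: Xid (PCI:0000:0b:00): 61, "
            "pid=365, 0cec(3098) 00000000 00000000")
    print(parse_xid(line))
```

Feeding it each line of `journalctl -k` output and keeping the non-None results gives a list of events that is easier to compare between machines.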

We have 8 workstations at our office with the following specs:
ASUS PRIME X470-PRO
BIOS Version 5406
AMD Ryzen 7 3700X
2x NVIDIA RTX 2070 Super
Driver Version 440.64.00
Ubuntu 18.04.04 LTS

All 8 are running similar workloads. Out of the 8, only 2 have been affected by this Random Xid 61 and Xorg lock-up.
On the two affected systems, the bug is triggered randomly (every 10-15 days) during execution of CUDA-accelerated deep-learning applications (tensorflow-gpu and darknet-yolo).

We have no explanation as to why only 2 out of 8 identical systems running identical workloads suffer from this issue.

BTW, if it helps, I am able to provide SSH access to my affected machine next time the issue strikes.

Ran into the Xid 61 error today. Didn’t happen for a while.
I was able to run nvidia-bug-report.sh so I will include that.

Mar 23 15:42:48 galahad kernel: [18410.445092] NVRM: GPU at PCI:0000:09:00: GPU-8bc30071-cb5b-c7cf-af07-11706f852ea8
Mar 23 15:42:48 galahad kernel: [18410.445096] NVRM: GPU Board Serial Number:
Mar 23 15:42:48 galahad kernel: [18410.445102] NVRM: Xid (PCI:0000:09:00): 61, pid=2028, 0cec(3098) 00000000 00000000
nvidia-bug-report.log.gz (530.7 KB)
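
Since several posts here report how long the machine had been up before the Xid appeared, a small Python sketch for reading that off a kernel log line may help. It assumes the bracketed number (for example `[18410.445092]`) is the kernel's seconds-since-boot timestamp, which may not include time spent suspended. `KTIME_RE` and `uptime_at` are hypothetical names.

```python
import re
from datetime import timedelta
from typing import Optional

# Kernel log timestamp such as "[18410.445092]" (assumed to be seconds since boot).
KTIME_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def uptime_at(line: str) -> Optional[timedelta]:
    """Return the kernel timestamp of a log line as a timedelta, or None."""
    m = KTIME_RE.search(line)
    if m is None:
        return None
    return timedelta(seconds=float(m.group(1)))


if __name__ == "__main__":
    line = ("Mar 23 15:42:48 galahad kernel: [18410.445092] NVRM: Xid "
            "(PCI:0000:09:00): 61, pid=2028, 0cec(3098) 00000000 00000000")
    print(uptime_at(line))
```

For the line above, 18410 seconds works out to a little under 5 hours 7 minutes after boot.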

Sorry, message came back as a reply, this was unintended.

Same here, Xid 61 with version 440.64; system kept being slow, nvidia-smi wasn’t able to read the fan speed any more.

nvidia-bug-report.log.gz (3.2 MB)

And another one, 440.64, almost exactly 24 hours after the first.
nvidia-bug-report.log.gz (1.8 MB)

I used to have this issue back in August-September 2019 on Manjaro Linux (testing branch at that time). I didn’t pay it much attention since I wasn’t using the system much. I was also dual-booting Windows, and the same issue happened on Windows too. After I disabled Link State Power Management in the Windows Power Management settings, the issue was gone, and I haven’t encountered it since then. Anyway, I did a clean install 10 days ago and installed Manjaro Testing again. I was playing games in the Dolphin emulator, with Steam, Discord and Firefox open in the background, and a few moments after closing the game everything froze; only a hard reboot solved the issue.
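
The Windows "Link State Power Management" setting controls PCIe ASPM, so on Linux it may be worth checking which ASPM policy is active. This is only a sketch, and whether ASPM has anything to do with the Xid 61 is unconfirmed. It assumes the kernel exposes `/sys/module/pcie_aspm/parameters/policy` with the active policy in square brackets; `active_aspm_policy` is a made-up helper name.

```python
import re
from typing import Optional

# Assumed location of the ASPM policy on kernels built with PCIe ASPM support.
ASPM_POLICY_PATH = "/sys/module/pcie_aspm/parameters/policy"


def active_aspm_policy(policy_text: str) -> Optional[str]:
    """Return the bracketed (active) policy name from the sysfs file contents."""
    m = re.search(r"\[(\w+)\]", policy_text)
    return m.group(1) if m else None


if __name__ == "__main__":
    try:
        with open(ASPM_POLICY_PATH) as f:
            print("active ASPM policy:", active_aspm_policy(f.read()))
    except OSError:
        print("ASPM policy file not available on this system")
```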

Just a note, I haven’t had this issue until today since I’ve installed Manjaro. Yesterday there was a kernel update and I have updated it but there were no problems until 2 hours ago either.

Mar 26 07:23:02 pepega kernel: NVRM: GPU at PCI:0000:09:00: GPU-16962672-8efa-3041-c86b-9443c9033077
Mar 26 07:23:02 pepega kernel: NVRM: GPU Board Serial Number: 
Mar 26 07:23:02 pepega kernel: NVRM: Xid (PCI:0000:09:00): 61, pid=1141, 0cec(3098) 00000000 00000000
Mar 26 07:23:07 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:08 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:08 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:09 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:11 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:11 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: WARNING: log rate limit exceeded (5 msgs per 30000ms). Discarding future messages.
Mar 26 07:23:37 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:37 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:43 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:43 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-433ms), your system is too slow
Mar 26 07:23:43 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-446ms), your system is too slow
Mar 26 07:23:44 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:46 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-192ms), your system is too slow
Mar 26 07:23:46 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-91ms), your system is too slow
Mar 26 07:23:46 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-104ms), your system is too slow
Mar 26 07:23:49 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-420ms), your system is too slow
Mar 26 07:23:49 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-252ms), your system is too slow
Mar 26 07:23:49 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-265ms), your system is too slow
Mar 26 07:23:51 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: SYN_DROPPED event - some input events have been lost.
Mar 26 07:23:51 pepega /usr/lib/gdm-x-session[1563]: (II) event2  - SteelSeries SteelSeries Rival 100 Gaming Mouse: WARNING: log rate limit exceeded (5 msgs per 30000ms). Discarding future messages.
Mar 26 07:23:55 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-516ms), your system is too slow
Mar 26 07:23:55 pepega /usr/lib/gdm-x-session[1563]: (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-529ms), your system is too slow

Specs:

  • ASUS TUF B450-PRO GAMING [Latest BIOS (2007)]
  • AMD Ryzen 5 3600 3.95 GHz (OC)
  • NVIDIA RTX 2070 (Driver: 440.66.04)
  • 32 GB GSKILL 3600 MHz
  • Manjaro Linux 5.4.27-1

Hello everybody,

this thread came to my attention since I have similar hardware and it is a relatively current thread. My system regularly almost freezes, with CPU 0 reaching a load of 100% from a process called “irq/148-nvidia”. The mouse can be moved freely but programs take very long to start, so the system is mostly unusable; I can still perform some basic tasks, though, mainly to identify the error.
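
A thread named “irq/148-nvidia” is the kernel's handler thread for IRQ 148, so watching that IRQ's counters in `/proc/interrupts` can show whether interrupts are flooding in while the system is sluggish. Below is a minimal Python sketch; the sample text is invented for illustration (not taken from the affected machine), and `nvidia_irq_counts` is a hypothetical helper.

```python
from typing import Dict, List


def nvidia_irq_counts(interrupts_text: str) -> Dict[str, List[int]]:
    """Map IRQ number -> per-CPU interrupt counts for lines naming 'nvidia'."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    result: Dict[str, List[int]] = {}
    for line in lines[1:]:
        fields = line.split()
        # Fields after the per-CPU counters describe the controller and device.
        if not any("nvidia" in f for f in fields[1 + ncpu:]):
            continue
        irq = fields[0].rstrip(":")
        result[irq] = [int(x) for x in fields[1:1 + ncpu]]
    return result


# Invented sample in the usual /proc/interrupts layout (two CPUs).
SAMPLE = """\
           CPU0       CPU1
  1:          9          0   IO-APIC    1-edge      i8042
148:     987654         12   PCI-MSI 4718592-edge      nvidia
"""

if __name__ == "__main__":
    print(nvidia_irq_counts(SAMPLE))
    # On a live system: nvidia_irq_counts(open("/proc/interrupts").read())
```

Taking two readings a few seconds apart and subtracting them gives a rough interrupt rate per CPU.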

This error occurs most frequently after I put my system into standby mode and wake it up again. However, it also sometimes occurs without standby having been used and after only a short time of use, e.g. 5 minutes of uptime or so. Typically, I have a chromium-browser window open.

Today, I managed to perform a warm restart that shut down the operating system. Before, it hung with a black screen and a cursor still blinking. This time I got as far as seeing the Aorus logo after reboot, but the system hung there, too. So the error seems to persist somehow across a warm restart. A cold restart, however, worked.

The error seems to be X related. At my company we are doing deep learning with a similar system, but headless. There, the error has not occurred so far, afaik.

This error is a showstopper for me and renders the system basically unusable. I hope there will be a fix soon.

Specs:

  • Gigabyte X570 I AORUS PRO WIFI (bios version F10)
  • AMD Ryzen 3700X
  • NVIDIA RTX 2070 SUPER (Gigabyte Windforce 8G, Driver: 440.64.00)
  • 64GB Corsair Vengeance XMP DDR4 PC 3200 CL16
  • Ubuntu 18.04.4 LTS

Update (2020-04-06): I did a BIOS update yesterday and have not yet seen this problem with the new firmware version F12e.

I have run into this error several times, each time after 24-30 hours of the machine running.
Apr 01 14:09:45 serzh kernel: NVRM: Xid (PCI:0000:08:00): 61, pid=1577, 0cec(3098) 00000000 00000000
Apr 06 15:19:17 serzh kernel: NVRM: Xid (PCI:0000:08:00): 61, pid=1531, 0cec(3098) 00000000 00000000

Ubuntu 19.10
Ryzen 3900X
G.Skill Trident 32GB 3600 MHz
ASUS X570-I BIOS 1405
ASUS 2080 Super DUAL OC V2 (Driver 440.64)

This is really annoying!

Possible repro steps:

  1. Log in from the console
  2. Start a graphical session with startx
  3. Use Sawfish as window manager, with no compositor
  4. Launch opera-beta (version 67.0.3575.13 for me, but the exact version is likely not important)
  5. Open Trimps (https://trimps.github.io)
  6. Play the game (you can use the save data from https://pastebin.com/sn5Lucge to get to a stage where things are happening more quickly and require less babysitting)

With these steps I was able to reproduce the Xid in 1 day, 21 hours and 45 minutes of system uptime. It’s likely other Chromium-based browsers and other web pages work too, as long as they cause constant updates.

Same issue, after a few hours of using Chrome, VS Code, and listening to music through an Electron app.

  • Ubuntu 18.04, default window manager
  • Nvidia driver 435.21
  • Ryzen 7 3800X
  • RTX 2080 Ti
  • G.SKILL F4 DDR4 3600 C19 2x16GB
  • Gigabyte GA-X570 I AORUS PRO WIFI

NVRM: Xid (PCI:0000:0a:00): 61, pid=2226, 0cde(308c) 00000000 00000000
(pid corresponds to xorg)
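
To check which process a pid from the Xid line belongs to while it is still running, one option is to read its name from procfs. A minimal Python sketch, assuming a Linux `/proc` layout; `process_name` and its `proc_root` parameter are made up here (the parameter only exists so the function can be tried against a fake directory).

```python
from pathlib import Path
from typing import Optional


def process_name(pid: int, proc_root: str = "/proc") -> Optional[str]:
    """Return the command name for a pid from <proc_root>/<pid>/comm, or None."""
    try:
        return (Path(proc_root) / str(pid) / "comm").read_text().strip()
    except OSError:
        # The process has exited, or procfs is not available here.
        return None


if __name__ == "__main__":
    # 2226 is the pid from the Xid line above; it will differ on other systems.
    print(process_name(2226))
```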

I have also been having this exact same issue.

  • Ubuntu 19.10
  • nvidia driver 440.64
  • Ryzen 3900
  • Dual RTX 2080 Ti
  • G.SKILL F4 DDR4 3600 C19 4x16GB
  • ASUS x570-PRO

NVRM: Xid (PCI:0000:0a:00): 61, pid=1701, 0cde(308c) 00000000 00000000

Seems like there are a lot of people with this exact same issue. It grinds my workflow to a complete halt and requires a restart. Really interested in helping find a solution!

@amrits, is anyone at NVIDIA currently looking into this issue? It seems to be affecting a wide range of Ryzen + RTX combinations and is significantly disruptive to many of our workflows. If reproducibility on your end is the bottleneck, some of us have already offered remote access to allow further investigation.
Would greatly appreciate some help from NVIDIA.

If it helps at all: for the past week, this has been a daily issue, requiring reboot every day.

This was happening to me on Ubuntu 19.10. As a workaround, I added the following kernel boot parameter via grub-customizer:

processor.max_cstate=1

so the complete line becomes:

linux /boot/vmlinuz root=/dev/mapper/ssd-ub1910 ro quiet splash $vt_handoff processor.max_cstate=1
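
After rebooting, it can be worth confirming that the parameter actually reached the kernel by looking at `/proc/cmdline`. A minimal Python sketch follows; `kernel_params` is a hypothetical helper, and the sample string is an assumption about roughly what the running command line would look like, not output copied from the machine above.

```python
from typing import Dict, Optional


def kernel_params(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Assumed example of what /proc/cmdline might contain after the change.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz root=/dev/mapper/ssd-ub1910 ro quiet splash "
    "processor.max_cstate=1"
)

if __name__ == "__main__":
    print(kernel_params(SAMPLE_CMDLINE).get("processor.max_cstate"))
    # On a live system: kernel_params(open("/proc/cmdline").read())
```

If the lookup returns None on the live system, the edit was probably not written to the GRUB configuration that is actually booted.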

I hope this works for you, let me know.

Best Regards.

Basdeth