GTX 1650 SUPER hangs with 440.44 drivers

I’m getting GPU hangs / stutters on my Ubuntu 19.10 system with a GTX 1650 SUPER and 440.44 drivers. Sometimes the display freezes completely. Other times it freezes for a few seconds, seemingly recovers, then freezes again, and so on. It happens at random during normal use in GNOME 3.

My Xorg logs contain lots of:

(II) event9  - USB-HID Keyboard Mouse: SYN_DROPPED event - some input events have been lost.
(WW) NVIDIA: Wait for channel idle timed out.
(WW) NVIDIA: Wait for channel idle timed out.
(II) event9  - USB-HID Keyboard Mouse: SYN_DROPPED event - some input events have been lost.
(WW) NVIDIA: Wait for channel idle timed out.
(WW) NVIDIA(0): WAIT (2-S, 17, 0x3fb2, 0x00004d78, 0x00004dac)
(WW) NVIDIA(0): WAIT (1-S, 17, 0x3fb2, 0x00004d78, 0x00004dac)
(EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x00000238, 0x000002a8)
(EE) NVIDIA(GPU-0): WAIT (1, 8, 0x8000, 0x00000238, 0x000002a8)
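To see how these messages are distributed over a whole log rather than a short excerpt, they can be tallied with a few lines of Python. This is only a rough sketch: the function names and category labels are made up for illustration, and the substrings it matches are taken from the excerpt above, not from any documented NVIDIA message format. The log location is also an assumption (commonly /var/log/Xorg.0.log, or ~/.local/share/xorg/Xorg.0.log when Xorg runs without root).

```python
import re
from collections import Counter

# "(II) ...", "(WW) ...", "(EE) ..." with an optional "[  1234.567]" timestamp prefix.
LINE_RE = re.compile(r"^(?:\[\s*[\d.]+\]\s*)?\((?P<level>II|WW|EE)\)\s+(?P<msg>.*)$")

def tally(lines):
    """Count the message kinds that appear in the excerpt above."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        msg = m.group("msg")
        if "SYN_DROPPED" in msg:
            counts["syn_dropped"] += 1
        elif "Wait for channel idle timed out" in msg:
            counts["channel_idle_timeout"] += 1
        elif msg.startswith("NVIDIA") and "WAIT (" in msg:
            # Keep warnings and errors apart: wait_WW vs. wait_EE.
            counts["wait_" + m.group("level")] += 1
    return counts

def tally_file(path):
    """Convenience wrapper; the path is whatever your system uses."""
    with open(path, errors="replace") as f:
        return tally(f)
```

Calling `tally_file("/var/log/Xorg.0.log").most_common()` prints the categories in descending order of frequency, which makes it easier to tell whether the SYN_DROPPED lines and the NVIDIA timeouts rise together.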

I haven’t noticed anything interesting in dmesg. When the display freezes completely, a hard reboot is required to recover.

The symptoms are similar to this post, except that I don’t use any special kernel module options (the poster there sets RMUseSwI2c=1 and claims it is the culprit).

Please help

Still happening with 440.59 drivers.

Your log is full of these messages:

These messages are related to USB, so I wonder if they are connected to the ‘event9’ message you quoted that appears right before the NVIDIA timeouts. It’s possible that some USB device is flooding your system with interrupts and preventing the NVIDIA driver from servicing interrupts from the GPU.

Could you please try unplugging USB devices one at a time to see if there’s a particular one that is triggering this problem?
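If it helps narrow things down, the interrupt-flood theory can also be checked directly by sampling /proc/interrupts twice and ranking the sources by rate. The Python below is a minimal sketch, not an official diagnostic: the function names are hypothetical, and it assumes the usual Linux layout of /proc/interrupts (a header row of CPU names, then one row per interrupt source with a count per CPU).

```python
import time

def parse_interrupts(text):
    """Return {label: (total_count, description)} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for tok in fields[:ncpu]:
            if not tok.isdigit():
                break  # rows such as ERR/MIS carry fewer count columns
            counts.append(int(tok))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table

def busiest(before, after, seconds, top=5):
    """Rank interrupt sources by interrupts per second between two snapshots."""
    rates = []
    for label, (count, desc) in after.items():
        prev = before.get(label, (0, ""))[0]
        rates.append(((count - prev) / seconds, label, desc))
    rates.sort(reverse=True)
    return rates[:top]

def sample(seconds=5.0, path="/proc/interrupts"):
    """Take two snapshots `seconds` apart and return the busiest sources."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return busiest(before, after, seconds)
```

Running `sample()` while the stutter is happening and again while the system is idle gives two rankings to compare. A USB host controller row far above everything else during the stutter would support the theory; what counts as "far above" depends on the system, so treat the numbers as relative.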

Thank you for looking into this.

I have identified and disconnected the USB device causing the “retire_capture_urb: 2491 callbacks suppressed” messages. Those messages are now gone from dmesg, but the problem persists.

Attached is a new nvidia-bug-report. The symptoms are identical, but the logs are a bit different. There are no “WAIT” messages in Xorg.log, but Xorg is at 100% CPU and appears to hang. The only things updating on my display are the mouse pointer and the system clock.

gdb shows that Xorg seems to be busy in nvidia_drv.so:

$ sudo gdb --pid 2480 -ex bt
GNU gdb (Ubuntu 8.3-0ubuntu1) 8.3
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 2480
[New LWP 2487]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f4fda76a690 in ?? () from /usr/lib/xorg/modules/drivers/nvidia_drv.so
#0  0x00007f4fda76a690 in  () at /usr/lib/xorg/modules/drivers/nvidia_drv.so
#1  0x00007f4fda76a79f in  () at /usr/lib/xorg/modules/drivers/nvidia_drv.so
#2  0x00007f4fda7598e4 in  () at /usr/lib/xorg/modules/drivers/nvidia_drv.so
#3  0x00007f4fdaba2ad0 in  () at /usr/lib/xorg/modules/drivers/nvidia_drv.so
#4  0x0000563e000008fc in  ()
#5  0x00007ffe6823c330 in  ()
#6  0x0000563ef219ada0 in  ()
#7  0x0000000000000001 in  ()
#8  0x0000563ef1550bb0 in  ()
#9  0x0000563ef1551bb0 in  ()
#10 0x0000563ef1300260 in  ()
#11 0x00007f4fdaba02b7 in  () at /usr/lib/xorg/modules/drivers/nvidia_drv.so
#12 0x0000563ef134baf0 in  ()
#13 0x000002400469cd45 in  ()
#14 0x00000900800260ba in  ()
#15 0x00000001f134baf0 in  ()
#16 0x0000000000000000 in  ()
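A single backtrace only shows where Xorg was at one instant. To check whether it really stays inside nvidia_drv.so, a few backtraces can be taken in a row and summarized. The sketch below is illustrative only: the helper names are made up, the regular expression is written against the symbol-less frame format shown above (frames with full symbols and source locations would need a different pattern), and `sample_backtraces` assumes gdb is installed and the script runs with enough privileges to attach.

```python
import re
import subprocess
import time
from collections import Counter

# Matches frames like "#0  0x00007f... in  () at /path/nvidia_drv.so"
# and "#4  0x0000563e... in  ()" (no module).
FRAME_RE = re.compile(
    r"^#(\d+)\s+0x[0-9a-fA-F]+ in\s+(\S*?)\s*\([^)]*\)(?:\s+(?:at|from)\s+(\S+))?"
)

def module_counts(bt_text):
    """Count how many frames gdb attributes to each module."""
    counts = Counter()
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            counts[m.group(3) or "(unknown)"] += 1
    return counts

def innermost_module(bt_text):
    """Return the module of frame #0, or None if gdb did not report one."""
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group(1) == "0":
            return m.group(3)
    return None

def sample_backtraces(pid, n=5, delay=1.0):
    """Attach gdb in batch mode n times; return the innermost module each time."""
    seen = []
    for _ in range(n):
        out = subprocess.run(
            ["gdb", "--batch", "-p", str(pid), "-ex", "bt"],
            capture_output=True, text=True,
        ).stdout
        seen.append(innermost_module(out))
        time.sleep(delay)
    return seen
```

If every sample reports nvidia_drv.so as the innermost module, Xorg is most likely spinning inside the driver and not just passing through it. Without debug symbols the deeper frames are probably not trustworthy, so only frame #0 is used for that check.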