Random kernel panics with GeForce GTX 1080 + driver 470.42.01 (Ubuntu 20.04)

This happens randomly, usually while playing video (e.g. YouTube, VLC, Twitch). It is more likely with the 5.8 kernel than with the 5.4 kernel. Here’s how it ends:

Jul 15 20:38:47 pn kernel: [26324.775193] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 15 20:38:47 pn kernel: [26324.775202] rcu:    10-...0: (7 ticks this GP) idle=482/1/0x4000000000000000 softirq=1244121/1244122 fqs=7353 
Jul 15 20:38:47 pn kernel: [26324.775207]         (detected by 20, t=15002 jiffies, g=3923709, q=40232)
Jul 15 20:38:47 pn kernel: [26324.775210] Sending NMI from CPU 20 to CPUs 10:
Jul 15 20:38:47 pn kernel: [26324.776208] NMI watchdog: Watchdog detected hard LOCKUP on cpu 10
Jul 15 20:38:47 pn kernel: [26324.776208] Modules linked in: nvidia_uvm(OE) udf crc_itu_t nf_conntrack_netlink xfrm_user xfrm_algo xt_CHECKSUM xt_addrtype xt_MASQUERADE br_netfilter xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter bridge stp llc md4 cmac nls_utf8 cifs libarc4 fscache libdes aufs overlay snd_hda_codec_hdmi binfmt_misc nls_iso8859_1 snd_hda_codec_realtek nvidia_drm(POE) nvidia_modeset(POE) snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi nvidia(POE) snd_seq snd_seq_device snd_timer edac_mce_amd drm_kms_helper snd eeepc_wmi joydev kvm_amd asus_wmi input_leds sparse_keymap fb_sys_fops syscopyarea video wmi_bmof sysfillrect kvm sysimgblt soundcore ccp k10temp mac_hid sch_fq_codel nct6775 hwmon_vid parport_pc ppdev lp
Jul 15 20:38:47 pn kernel: [26324.776230]  drm parport ip_tables x_tables autofs4 dm_crypt hid_logitech_hidpp uas usb_storage hid_logitech_dj hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper nvme ahci i2c_piix4 nvme_core r8125(OE) libahci wmi gpio_amdpt gpio_generic
Jul 15 20:38:47 pn kernel: [26324.776238] CPU: 10 PID: 2930 Comm: Xorg Tainted: P           OE     5.4.0-77-generic #86-Ubuntu
Jul 15 20:38:47 pn kernel: [26324.776238] Hardware name: ASUS System Product Name/TUF GAMING B550-PLUS, BIOS 1804 02/02/2021
Jul 15 20:38:47 pn kernel: [26324.776239] RIP: 0010:native_queued_spin_lock_slowpath+0x60/0x1d0
Jul 15 20:38:47 pn kernel: [26324.776240] Code: 6e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 48 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 81 fe 00 01
Jul 15 20:38:47 pn kernel: [26324.776241] RSP: 0018:ffffad14c42dbda0 EFLAGS: 00000002
Jul 15 20:38:47 pn kernel: [26324.776242] RAX: 0000000000000101 RBX: 0000000000000052 RCX: 8000000000000007
Jul 15 20:38:47 pn kernel: [26324.776242] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f1f9017d498
Jul 15 20:38:47 pn kernel: [26324.776243] RBP: ffffad14c42dbda0 R08: 0000000000000010 R09: 000000000000001e
Jul 15 20:38:47 pn kernel: [26324.776244] R10: ffff9f1f3c941340 R11: 0000000000000000 R12: 0000000000000292
Jul 15 20:38:47 pn kernel: [26324.776244] R13: ffff9f1f9017d400 R14: ffff9f1fa79c5000 R15: ffff9f1f98665d00
Jul 15 20:38:47 pn kernel: [26324.776245] FS:  00007fb08ea48a40(0000) GS:ffff9f1faea80000(0000) knlGS:0000000000000000
Jul 15 20:38:47 pn kernel: [26324.776245] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 15 20:38:47 pn kernel: [26324.776246] CR2: 000025d1e05e9510 CR3: 0000000fd78a8000 CR4: 0000000000340ee0
Jul 15 20:38:47 pn kernel: [26324.776246] Call Trace:
Jul 15 20:38:47 pn kernel: [26324.776247]  _raw_spin_lock_irqsave+0x37/0x40
Jul 15 20:38:47 pn kernel: [26324.776247]  down+0x17/0x60
Jul 15 20:38:47 pn kernel: [26324.776247]  nvidia_ioctl+0xc0/0x880 [nvidia]
Jul 15 20:38:47 pn kernel: [26324.776248]  nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
Jul 15 20:38:47 pn kernel: [26324.776248]  do_vfs_ioctl+0x407/0x670
Jul 15 20:38:47 pn kernel: [26324.776249]  ? wait_woken+0x80/0x80
Jul 15 20:38:47 pn kernel: [26324.776249]  ksys_ioctl+0x67/0x90
Jul 15 20:38:47 pn kernel: [26324.776249]  __x64_sys_ioctl+0x1a/0x20
Jul 15 20:38:47 pn kernel: [26324.776250]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 15 20:38:47 pn kernel: [26324.776251] RIP: 0033:0x7fb08eda850b
Jul 15 20:38:47 pn kernel: [26324.776252] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
Jul 15 20:38:47 pn kernel: [26324.776252] RSP: 002b:00007ffd28aac258 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Jul 15 20:38:47 pn kernel: [26324.776253] RAX: ffffffffffffffda RBX: 00007ffd28aac2e0 RCX: 00007fb08eda850b
Jul 15 20:38:47 pn kernel: [26324.776254] RDX: 00007ffd28aac2e0 RSI: 00000000c0104652 RDI: 000000000000001e
Jul 15 20:38:47 pn kernel: [26324.776255] RBP: 00000000c0104652 R08: 00007ffd28aac2e0 R09: 00007ffd28aac2ec
Jul 15 20:38:47 pn kernel: [26324.776255] R10: 0000000000000114 R11: 0000000000003246 R12: 0000000000000010
Jul 15 20:38:47 pn kernel: [26324.776256] R13: 0000000000000052 R14: 000000000000001e R15: 00007ffd28aac2ec
Jul 15 20:38:47 pn kernel: [26324.776256] NMI backtrace for cpu 10
Jul 15 20:38:47 pn kernel: [26324.776257] CPU: 10 PID: 2930 Comm: Xorg Tainted: P           OE     5.4.0-77-generic #86-Ubuntu
Jul 15 20:38:47 pn kernel: [26324.776258] Hardware name: ASUS System Product Name/TUF GAMING B550-PLUS, BIOS 1804 02/02/2021
Jul 15 20:38:47 pn kernel: [26324.776258] RIP: 0010:native_queued_spin_lock_slowpath+0x60/0x1d0
Jul 15 20:38:47 pn kernel: [26324.776260] Code: 6e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 48 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 81 fe 00 01
Jul 15 20:38:47 pn kernel: [26324.776260] RSP: 0018:ffffad14c42dbda0 EFLAGS: 00000002
Jul 15 20:38:47 pn kernel: [26324.776261] RAX: 0000000000000101 RBX: 0000000000000052 RCX: 8000000000000007
Jul 15 20:38:47 pn kernel: [26324.776261] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f1f9017d498
Jul 15 20:38:47 pn kernel: [26324.776262] RBP: ffffad14c42dbda0 R08: 0000000000000010 R09: 000000000000001e
Jul 15 20:38:47 pn kernel: [26324.776263] R10: ffff9f1f3c941340 R11: 0000000000000000 R12: 0000000000000292
Jul 15 20:38:47 pn kernel: [26324.776263] R13: ffff9f1f9017d400 R14: ffff9f1fa79c5000 R15: ffff9f1f98665d00
Jul 15 20:38:47 pn kernel: [26324.776264] FS:  00007fb08ea48a40(0000) GS:ffff9f1faea80000(0000) knlGS:0000000000000000
Jul 15 20:38:47 pn kernel: [26324.776264] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 15 20:38:47 pn kernel: [26324.776265] CR2: 000025d1e05e9510 CR3: 0000000fd78a8000 CR4: 0000000000340ee0
Jul 15 20:38:47 pn kernel: [26324.776265] Call Trace:
Jul 15 20:38:47 pn kernel: [26324.776266]  _raw_spin_lock_irqsave+0x37/0x40
Jul 15 20:38:47 pn kernel: [26324.776266]  down+0x17/0x60
Jul 15 20:38:47 pn kernel: [26324.776266]  nvidia_ioctl+0xc0/0x880 [nvidia]
Jul 15 20:38:47 pn kernel: [26324.776267]  nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
Jul 15 20:38:47 pn kernel: [26324.776267]  do_vfs_ioctl+0x407/0x670
Jul 15 20:38:47 pn kernel: [26324.776267]  ? wait_woken+0x80/0x80
Jul 15 20:38:47 pn kernel: [26324.776268]  ksys_ioctl+0x67/0x90
Jul 15 20:38:47 pn kernel: [26324.776268]  __x64_sys_ioctl+0x1a/0x20
Jul 15 20:38:47 pn kernel: [26324.776269]  do_syscall_64+0x57/0x190
Jul 15 20:38:47 pn kernel: [26324.776269]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 15 20:38:47 pn kernel: [26324.776269] RIP: 0033:0x7fb08eda850b
Jul 15 20:38:47 pn kernel: [26324.776271] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
Jul 15 20:38:47 pn kernel: [26324.776271] RSP: 002b:00007ffd28aac258 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Jul 15 20:38:47 pn kernel: [26324.776272] RAX: ffffffffffffffda RBX: 00007ffd28aac2e0 RCX: 00007fb08eda850b
Jul 15 20:38:47 pn kernel: [26324.776273] RDX: 00007ffd28aac2e0 RSI: 00000000c0104652 RDI: 000000000000001e
Jul 15 20:38:47 pn kernel: [26324.776273] RBP: 00000000c0104652 R08: 00007ffd28aac2e0 R09: 00007ffd28aac2ec
Jul 15 20:38:47 pn kernel: [26324.776274] R10: 0000000000000114 R11: 0000000000003246 R12: 0000000000000010
Jul 15 20:38:47 pn kernel: [26324.776274] R13: 0000000000000052 R14: 000000000000001e R15: 00007ffd28aac2ec
Jul 15 20:38:47 pn kernel: [26324.776250]  do_syscall_64+0x57/0x190

Edit: I just tried kernel 5.12.15 and it still panics randomly.

nvidia-bug-report.log.gz (290.3 KB)