Bug report: 455.23.04 - Kernel Panic due to NULL pointer dereference

Same kernel 5.4.80, driver 455.45.01:

[428463.273808] BUG: kernel NULL pointer dereference, address: 0000000000000020
[428463.273814] #PF: supervisor read access in kernel mode
[428463.273815] #PF: error_code(0x0000) - not-present page
[428463.273816] PGD 22063d067 P4D 22063d067 PUD 0 
[428463.273819] Oops: 0000 [#1] PREEMPT SMP NOPTI
[428463.273822] CPU: 0 PID: 4792 Comm: irq/69-nvidia Tainted: P           O      5.4.80-gentoo-r1-x86_64 #1
[428463.273823] Hardware name: System manufacturer System Product Name/M4A89TD PRO USB3, BIOS 3029    09/07/2012
[428463.274136] RIP: 0010:_nv027527rm+0x9/0x90 [nvidia]
[428463.274139] Code: 90 ff e8 ea b0 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
[428463.274141] RSP: 0018:ffffa5b6805f3be0 EFLAGS: 00010202
[428463.274142] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
[428463.274143] RDX: ffff9626960e4b48 RSI: ffffffffffffffff RDI: 0000000000000020
[428463.274144] RBP: ffff962727d8b090 R08: ffffffffc1bdec70 R09: ffff962727d8b070
[428463.274145] R10: ffffffffc0828c20 R11: ffff962750e98808 R12: 0000000000000020
[428463.274146] R13: 0000000000000000 R14: ffff962727d8b1f8 R15: ffff962727d8b338
[428463.274147] FS:  0000000000000000(0000) GS:ffff962757a00000(0000) knlGS:0000000000000000
[428463.274148] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[428463.274149] CR2: 0000000000000020 CR3: 000000010f804000 CR4: 00000000000006f0
[428463.274150] Call Trace:
[428463.274377]  ? _nv029950rm+0x1b/0x90 [nvidia]
[428463.274524]  ? _nv025474rm+0x18/0x60 [nvidia]
[428463.274670]  ? _nv011691rm+0x13d/0x1c0 [nvidia]
[428463.274832]  ? _nv000083rm+0x12f/0x1a0 [nvidia]
[428463.275023]  ? _nv030071rm+0xb9/0x330 [nvidia]
[428463.275213]  ? _nv030070rm+0x61/0x80 [nvidia]
[428463.275404]  ? _nv030070rm+0x37/0x80 [nvidia]
[428463.275590]  ? _nv011623rm+0x428/0x460 [nvidia]
[428463.275753]  ? _nv024757rm+0x251/0x3e0 [nvidia]
[428463.275938]  ? _nv024705rm+0x1f/0xf0 [nvidia]
[428463.276122]  ? _nv015452rm+0xcb/0x370 [nvidia]
[428463.276263]  ? _nv026076rm+0x10/0x10 [nvidia]
[428463.276450]  ? _nv027734rm+0x273/0xdc0 [nvidia]
[428463.276637]  ? _nv007566rm+0x155/0x270 [nvidia]
[428463.276824]  ? _nv027742rm+0x8d/0x180 [nvidia]
[428463.276960]  ? _nv000712rm+0xa9/0x200 [nvidia]
[428463.276964]  ? irq_forced_thread_fn+0x70/0x70
[428463.277101]  ? rm_isr_bh+0x1c/0x60 [nvidia]
[428463.277233]  ? nvidia_isr_kthread_bh+0x16/0x4d0 [nvidia]
[428463.277235]  ? irq_thread_fn+0x1b/0x60
[428463.277237]  ? irq_thread+0xd7/0x160
[428463.277238]  ? wake_threads_waitq+0x30/0x30
[428463.277239]  ? irq_thread_dtor+0x80/0x80
[428463.277242]  ? kthread+0x125/0x150
[428463.277244]  ? kthread_create_worker_on_cpu+0x60/0x60
[428463.277247]  ? ret_from_fork+0x22/0x40
[428463.277248] Modules linked in: nvidia_uvm(O) cfg80211 nfnetlink_queue nfnetlink_log nfnetlink fuse rfcomm cmac algif_hash algif_skcipher af_alg bnep ipv6 btusb btrtl btbcm btintel uvcvideo bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 ecdh_generic ch341 videobuf2_common rfkill ax88179_178a usbnet ecc videodev usbserial hid_logitech_hidpp nvidia_drm(PO) dm_mod hid_logitech_dj joydev hid_plantronics snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device mc usbhid nvidia_modeset(PO) snd_hda_codec_hdmi wmi_bmof amd64_edac_mod kvm_amd ccp kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio pcspkr k10temp nvidia(PO) i2c_piix4 snd_hda_intel snd_intel_nhlt snd_hda_codec ohci_pci ohci_hcd snd_hda_core snd_hwdep snd_pcm snd_timer firewire_ohci snd soundcore firewire_core r8168(O) ata_generic pata_acpi asus_atk0110 hwmon wmi acpi_cpufreq button xhci_pci nvme ehci_pci xhci_hcd ehci_hcd pata_jmicron ahci libahci nvme_core usbcore libata [last unloaded: cfg80211]
[428463.277279] CR2: 0000000000000020
[428463.277281] ---[ end trace f055a3d9ebf3233b ]---
[428463.277432] RIP: 0010:_nv027527rm+0x9/0x90 [nvidia]
[428463.277434] Code: 90 ff e8 ea b0 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
[428463.277435] RSP: 0018:ffffa5b6805f3be0 EFLAGS: 00010202
[428463.277436] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
[428463.277437] RDX: ffff9626960e4b48 RSI: ffffffffffffffff RDI: 0000000000000020
[428463.277438] RBP: ffff962727d8b090 R08: ffffffffc1bdec70 R09: ffff962727d8b070
[428463.277439] R10: ffffffffc0828c20 R11: ffff962750e98808 R12: 0000000000000020
[428463.277440] R13: 0000000000000000 R14: ffff962727d8b1f8 R15: ffff962727d8b338
[428463.277441] FS:  0000000000000000(0000) GS:ffff962757a00000(0000) knlGS:0000000000000000
[428463.277442] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[428463.277443] CR2: 0000000000000020 CR3: 000000010f804000 CR4: 00000000000006f0
[428463.277474] BUG: kernel NULL pointer dereference, address: 0000000000000000
[428463.277476] #PF: supervisor instruction fetch in kernel mode
[428463.277477] #PF: error_code(0x0010) - not-present page
[428463.277477] PGD 22063d067 P4D 22063d067 PUD 0 
[428463.277480] Oops: 0010 [#2] PREEMPT SMP NOPTI
[428463.277481] CPU: 0 PID: 4792 Comm: irq/69-nvidia Tainted: P      D    O      5.4.80-gentoo-r1-x86_64 #1
[428463.277482] Hardware name: System manufacturer System Product Name/M4A89TD PRO USB3, BIOS 3029    09/07/2012
[428463.277484] RIP: 0010:0x0
[428463.277486] Code: Bad RIP value.
[428463.277487] RSP: 0018:ffffa5b6805f3e98 EFLAGS: 00010282
[428463.277488] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[428463.277489] RDX: ffffa5b6805f3ec8 RSI: 0000000000000000 RDI: ffffa5b6805f3ec8
[428463.277490] RBP: ffff962751391690 R08: ffff9627512cb4b0 R09: 0000000000000000
[428463.277491] R10: 0000000000000046 R11: ffffa5b6805f393e R12: ffff962751391000
[428463.277492] R13: ffffffff927698b0 R14: 0000000000000000 R15: ffff9627513916cc
[428463.277493] FS:  0000000000000000(0000) GS:ffff962757a00000(0000) knlGS:0000000000000000
[428463.277494] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[428463.277495] CR2: ffffffffffffffd6 CR3: 000000010f804000 CR4: 00000000000006f0
[428463.277496] Call Trace:
[428463.277498]  task_work_run+0x8e/0xb0
[428463.277501]  do_exit+0x34a/0xac0
[428463.277503]  ? irq_thread_dtor+0x80/0x80
[428463.277504]  ? kthread+0x125/0x150
[428463.277507]  rewind_stack_do_exit+0x17/0x20
[428463.277508] RIP: 0000:0x0
[428463.277510] Code: Bad RIP value.
[428463.277510] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
[428463.277512] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[428463.277512] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[428463.277513] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[428463.277514] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[428463.277515] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[428463.277516] Modules linked in: nvidia_uvm(O) cfg80211 nfnetlink_queue nfnetlink_log nfnetlink fuse rfcomm cmac algif_hash algif_skcipher af_alg bnep ipv6 btusb btrtl btbcm btintel uvcvideo bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 ecdh_generic ch341 videobuf2_common rfkill ax88179_178a usbnet ecc videodev usbserial hid_logitech_hidpp nvidia_drm(PO) dm_mod hid_logitech_dj joydev hid_plantronics snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device mc usbhid nvidia_modeset(PO) snd_hda_codec_hdmi wmi_bmof amd64_edac_mod kvm_amd ccp kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio pcspkr k10temp nvidia(PO) i2c_piix4 snd_hda_intel snd_intel_nhlt snd_hda_codec ohci_pci ohci_hcd snd_hda_core snd_hwdep snd_pcm snd_timer firewire_ohci snd soundcore firewire_core r8168(O) ata_generic pata_acpi asus_atk0110 hwmon wmi acpi_cpufreq button xhci_pci nvme ehci_pci xhci_hcd ehci_hcd pata_jmicron ahci libahci nvme_core usbcore libata [last unloaded: cfg80211]
[428463.277536] CR2: 0000000000000000
[428463.277537] ---[ end trace f055a3d9ebf3233c ]---
[428463.277689] RIP: 0010:_nv027527rm+0x9/0x90 [nvidia]
[428463.277691] Code: 90 ff e8 ea b0 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
[428463.277692] RSP: 0018:ffffa5b6805f3be0 EFLAGS: 00010202
[428463.277693] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
[428463.277694] RDX: ffff9626960e4b48 RSI: ffffffffffffffff RDI: 0000000000000020
[428463.277695] RBP: ffff962727d8b090 R08: ffffffffc1bdec70 R09: ffff962727d8b070
[428463.277696] R10: ffffffffc0828c20 R11: ffff962750e98808 R12: 0000000000000020
[428463.277697] R13: 0000000000000000 R14: ffff962727d8b1f8 R15: ffff962727d8b338
[428463.277698] FS:  0000000000000000(0000) GS:ffff962757a00000(0000) knlGS:0000000000000000
[428463.277699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[428463.277700] CR2: ffffffffffffffd6 CR3: 000000010f804000 CR4: 00000000000006f0
[428463.277701] Fixing recursive fault but reboot is needed!
[428580.436453] GpuWatchdog[12745]: segfault at 0 ip 000055c4e8323e87 sp 00007fb6c4f1a5b0 error 6 in skypeforlinux[55c4e5224000+53cb000]
[428580.436464] Code: 7d b7 00 79 09 48 8b 7d a0 e8 75 68 e1 fe 8b 83 00 01 00 00 85 c0 0f 84 91 00 00 00 48 8b 03 48 89 df be 01 00 00 00 ff 50 68 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 e7 dd 78 02 01 80 7d 87 00
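In case it helps anyone reading the trace: the Code: line of the first oops can be decoded by hand. Below is a minimal Python sketch that splits the Code: bytes at the faulting instruction (the byte in angle brackets) and uses the RIP offset (+0x9) to find where the function starts. The KNOWN table is my own hand decoding of four x86-64 encodings, not disassembler output, and the helper names are made up for illustration.

```python
import re

# Copied from the first oops in this post.
CODE_LINE = ("Code: 90 ff e8 ea b0 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e "
             "0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 "
             "<48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2")
RIP_LINE = "RIP: 0010:_nv027527rm+0x9/0x90 [nvidia]"
RDI = 0x20  # from the register dump
CR2 = 0x20  # faulting address

def split_code(line):
    """Return (bytes before the fault, bytes from the fault onward)."""
    tokens = line.split()[1:]  # drop the leading "Code:"
    idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = [int(t.strip("<>"), 16) for t in tokens]
    return bytes(raw[:idx]), bytes(raw[idx:])

def rip_offset(line):
    """Return (offset into function, function size) from an RIP line."""
    m = re.search(r"\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    return int(m.group(1), 16), int(m.group(2), 16)

# Hand-decoded encodings (assumption: my reading of the opcode bytes).
KNOWN = {
    bytes.fromhex("4883ec08"): "sub rsp, 8",
    bytes.fromhex("4885ff"): "test rdi, rdi",
    bytes.fromhex("7457"): "je +0x57",
    bytes.fromhex("488b17"): "mov rdx, [rdi]",
}

def decode(blob):
    """Greedy match against KNOWN; stops at the first unknown byte."""
    out, i = [], 0
    while i < len(blob):
        for enc, text in KNOWN.items():
            if blob.startswith(enc, i):
                out.append(text)
                i += len(enc)
                break
        else:
            break
    return out

before, after = split_code(CODE_LINE)
offset, size = rip_offset(RIP_LINE)
func_prefix = before[-offset:]  # function entry up to the faulting instruction
decoded = decode(func_prefix + after[:3])

for text in decoded:
    print(text)
print(f"RDI={RDI:#x} CR2={CR2:#x}")
```

If the decoding is right, the function does test RDI for zero, but RDI is 0x20 rather than 0, so the check passes and the load from [rdi] faults at CR2 = 0x20. A small non-zero address like this often comes from adding a field offset to a NULL base pointer in the caller, but that is only a guess from the trace.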

Hi all, we have not been able to reproduce the issue locally, but after analyzing the logs and the reported incidents we have probably found the root cause and are working on a fix. Please allow us some more time to debug it further; we will get back to you with more updates.

Same issue here, twice so far.

5.10.2-2-MANJARO #1 SMP PREEMPT Tue Dec 22 08:14:42 UTC 2020 x86_64 GNU/Linux
NVidia driver 455.45.01
Asrock X570 Taichi
NVidia 1080TI

It seems to be triggered when using Chromium; both times it happened while browsing Google Drive.

Got hit by this bug again yesterday on the 5.4 LTS kernel.

It’s really frustrating that nvidia-bug-report.sh hangs even with --safe-mode, which renders it useless.

Since you guys seem to be working on a fix already, please find out why the script hangs.

Hello.
Same issue here.

On the 455 series drivers this same error occurred.
The error occurs only in Chromium-based browsers.
Slackware current

NVIDIA-Linux-x86_64-460.32.03.run

Jan 13 01:45:47 slack-pc kernel: [21988.136139] BUG: kernel NULL pointer dereference, address: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.136146] #PF: supervisor read access in kernel mode
Jan 13 01:45:47 slack-pc kernel: [21988.136148] #PF: error_code(0x0000) - not-present page
Jan 13 01:45:47 slack-pc kernel: [21988.136155] Oops: 0000 [#1] PREEMPT SMP PTI
Jan 13 01:45:47 slack-pc kernel: [21988.136158] CPU: 0 PID: 1450 Comm: irq/32-nvidia Tainted: P           O      5.10.7 #1
Jan 13 01:45:47 slack-pc kernel: [21988.136160] Hardware name: POSITIVO POS-EIH61CE/POS-EIH61CE, BIOS 4.6.5 10/18/2012
Jan 13 01:45:47 slack-pc kernel: [21988.136427] RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.136431] Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 13 01:45:47 slack-pc kernel: [21988.136433] RSP: 0018:ffffab2440fa3bf0 EFLAGS: 00010202
Jan 13 01:45:47 slack-pc kernel: [21988.136436] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 13 01:45:47 slack-pc kernel: [21988.136438] RDX: ffff90868cb87c48 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.136440] RBP: ffff9086655d2960 R08: ffffffffc2660b60 R09: ffff9086655d2940
Jan 13 01:45:47 slack-pc kernel: [21988.136441] R10: ffff9086655a4008 R11: ffff9086655a5098 R12: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.136443] R13: 0000000000000000 R14: ffff9086655d2ac8 R15: ffff9086655d2bd0
Jan 13 01:45:47 slack-pc kernel: [21988.136446] FS:  0000000000000000(0000) GS:ffff90894ec00000(0000) knlGS:0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.136448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 01:45:47 slack-pc kernel: [21988.136449] CR2: 0000000000000020 CR3: 000000012e330002 CR4: 00000000001706f0
Jan 13 01:45:47 slack-pc kernel: [21988.136450] Call Trace:
Jan 13 01:45:47 slack-pc kernel: [21988.136678]  ? _nv030766rm+0x1b/0x90 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.136868]  ? _nv026432rm+0x18/0x60 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.137031]  ? _nv012979rm+0x13d/0x1c0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.137179]  ? _nv000081rm+0x12f/0x1a0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.137384]  ? _nv012910rm+0xff/0x180 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.137599]  ? _nv019531rm+0x1af/0x210 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.137789]  ? _nv019482rm+0xdf3/0xef0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.138002]  ? _nv019483rm+0xf3/0x290 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.138209]  ? _nv019449rm+0x78/0xd0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.138404]  ? _nv019463rm+0xcf/0x2f0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.138595]  ? _nv019497rm+0xbe/0xe0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.138808]  ? _nv028705rm+0x97b/0xdc0 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139032]  ? _nv028713rm+0x15d/0x400 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139187]  ? _nv000709rm+0xa9/0x240 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139192]  ? disable_irq_nosync+0x10/0x10
Jan 13 01:45:47 slack-pc kernel: [21988.139330]  ? rm_isr_bh+0x1c/0x60 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139420]  ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139422]  ? irq_thread_fn+0x20/0x60
Jan 13 01:45:47 slack-pc kernel: [21988.139423]  ? irq_thread+0xe3/0x190
Jan 13 01:45:47 slack-pc kernel: [21988.139425]  ? irq_finalize_oneshot.part.0+0xd0/0xd0
Jan 13 01:45:47 slack-pc kernel: [21988.139427]  ? irq_thread_check_affinity+0xa0/0xa0
Jan 13 01:45:47 slack-pc kernel: [21988.139429]  ? kthread+0x142/0x160
Jan 13 01:45:47 slack-pc kernel: [21988.139430]  ? __kthread_bind_mask+0x60/0x60
Jan 13 01:45:47 slack-pc kernel: [21988.139432]  ? ret_from_fork+0x22/0x30
Jan 13 01:45:47 slack-pc kernel: [21988.139434] Modules linked in: nvidia_uvm(PO) fuse lz4 zram nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables efivarfs it87 hwmon_vid nvidia_drm(PO) nvidia_modeset(PO) hid_generic usbhid hid nvidia(PO) snd_hda_codec_via snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation drm_kms_helper at24 intel_rapl_msr snd_soc_core regmap_i2c drm mei_hdcp snd_compress intel_rapl_common snd_pcm_dmaengine x86_pkg_temp_thermal intel_powerclamp agpgart fb_sys_fops coretemp syscopyarea soundwire_cadence gpio_ich snd_hda_codec snd_hda_core sysfillrect snd_hwdep snd_pcm kvm_intel sysimgblt snd_timer kvm snd i2c_i801 soundcore irqbypass evdev mei_me i2c_smbus ac97_bus crct10dif_pclmul crc32_pclmul mei
Jan 13 01:45:47 slack-pc kernel: [21988.139475]  ghash_clmulni_intel serio_raw i2c_core rapl bfq intel_cstate ehci_pci atl1c lpc_ich ehci_hcd thermal video fan button wmi loop
Jan 13 01:45:47 slack-pc kernel: [21988.139485] CR2: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.139488] ---[ end trace 0fbb305080b82bf1 ]---
Jan 13 01:45:47 slack-pc kernel: [21988.139638] RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.139640] Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 13 01:45:47 slack-pc kernel: [21988.139642] RSP: 0018:ffffab2440fa3bf0 EFLAGS: 00010202
Jan 13 01:45:47 slack-pc kernel: [21988.139643] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 13 01:45:47 slack-pc kernel: [21988.139644] RDX: ffff90868cb87c48 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.139645] RBP: ffff9086655d2960 R08: ffffffffc2660b60 R09: ffff9086655d2940
Jan 13 01:45:47 slack-pc kernel: [21988.139646] R10: ffff9086655a4008 R11: ffff9086655a5098 R12: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.139647] R13: 0000000000000000 R14: ffff9086655d2ac8 R15: ffff9086655d2bd0
Jan 13 01:45:47 slack-pc kernel: [21988.139648] FS:  0000000000000000(0000) GS:ffff90894ec00000(0000) knlGS:0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139650] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 01:45:47 slack-pc kernel: [21988.139651] CR2: 0000000000000020 CR3: 000000012e330002 CR4: 00000000001706f0
Jan 13 01:45:47 slack-pc kernel: [21988.139675] BUG: kernel NULL pointer dereference, address: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139678] #PF: supervisor instruction fetch in kernel mode
Jan 13 01:45:47 slack-pc kernel: [21988.139679] #PF: error_code(0x0010) - not-present page
Jan 13 01:45:47 slack-pc kernel: [21988.139684] Oops: 0010 [#2] PREEMPT SMP PTI
Jan 13 01:45:47 slack-pc kernel: [21988.139687] CPU: 0 PID: 1450 Comm: irq/32-nvidia Tainted: P      D    O      5.10.7 #1
Jan 13 01:45:47 slack-pc kernel: [21988.139688] Hardware name: POSITIVO POS-EIH61CE/POS-EIH61CE, BIOS 4.6.5 10/18/2012
Jan 13 01:45:47 slack-pc kernel: [21988.139690] RIP: 0010:0x0
Jan 13 01:45:47 slack-pc kernel: [21988.139693] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
Jan 13 01:45:47 slack-pc kernel: [21988.139694] RSP: 0018:ffffab2440fa3ec0 EFLAGS: 00010286
Jan 13 01:45:47 slack-pc kernel: [21988.139720] RAX: 0000000000000000 RBX: ffff90864993ba00 RCX: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139721] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffab2440fa3ec8
Jan 13 01:45:47 slack-pc kernel: [21988.139722] RBP: ffff90864993ba00 R08: 0000000000000046 R09: ffffab2440fa38b0
Jan 13 01:45:47 slack-pc kernel: [21988.139724] R10: ffffab2440fa38a8 R11: ffffffffa1d37668 R12: ffff90864993c11c
Jan 13 01:45:47 slack-pc kernel: [21988.139725] R13: 0000000000000020 R14: 0000000000000001 R15: ffff90864993ba00
Jan 13 01:45:47 slack-pc kernel: [21988.139748] FS:  0000000000000000(0000) GS:ffff90894ec00000(0000) knlGS:0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139753] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 01:45:47 slack-pc kernel: [21988.139757] CR2: ffffffffffffffd6 CR3: 000000012e330002 CR4: 00000000001706f0
Jan 13 01:45:47 slack-pc kernel: [21988.139760] Call Trace:
Jan 13 01:45:47 slack-pc kernel: [21988.139767]  task_work_run+0x5c/0x90
Jan 13 01:45:47 slack-pc kernel: [21988.139774]  do_exit+0x333/0xa30
Jan 13 01:45:47 slack-pc kernel: [21988.139779]  ? irq_thread_check_affinity+0xa0/0xa0
Jan 13 01:45:47 slack-pc kernel: [21988.139780]  ? kthread+0x142/0x160
Jan 13 01:45:47 slack-pc kernel: [21988.139782]  rewind_stack_do_exit+0x17/0x17
Jan 13 01:45:47 slack-pc kernel: [21988.139784] RIP: 0000:0x0
Jan 13 01:45:47 slack-pc kernel: [21988.139785] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
Jan 13 01:45:47 slack-pc kernel: [21988.139786] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139788] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139789] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139790] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139791] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139791] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139793] Modules linked in: nvidia_uvm(PO) fuse lz4 zram nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables efivarfs it87 hwmon_vid nvidia_drm(PO) nvidia_modeset(PO) hid_generic usbhid hid nvidia(PO) snd_hda_codec_via snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation drm_kms_helper at24 intel_rapl_msr snd_soc_core regmap_i2c drm mei_hdcp snd_compress intel_rapl_common snd_pcm_dmaengine x86_pkg_temp_thermal intel_powerclamp agpgart fb_sys_fops coretemp syscopyarea soundwire_cadence gpio_ich snd_hda_codec snd_hda_core sysfillrect snd_hwdep snd_pcm kvm_intel sysimgblt snd_timer kvm snd i2c_i801 soundcore irqbypass evdev mei_me i2c_smbus ac97_bus crct10dif_pclmul crc32_pclmul mei
Jan 13 01:45:47 slack-pc kernel: [21988.139824]  ghash_clmulni_intel serio_raw i2c_core rapl bfq intel_cstate ehci_pci atl1c lpc_ich ehci_hcd thermal video fan button wmi loop
Jan 13 01:45:47 slack-pc kernel: [21988.139832] CR2: 0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.139834] ---[ end trace 0fbb305080b82bf2 ]---
Jan 13 01:45:47 slack-pc kernel: [21988.140007] RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 13 01:45:47 slack-pc kernel: [21988.140033] Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 13 01:45:47 slack-pc kernel: [21988.140036] RSP: 0018:ffffab2440fa3bf0 EFLAGS: 00010202
Jan 13 01:45:47 slack-pc kernel: [21988.140042] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 13 01:45:47 slack-pc kernel: [21988.140046] RDX: ffff90868cb87c48 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.140049] RBP: ffff9086655d2960 R08: ffffffffc2660b60 R09: ffff9086655d2940
Jan 13 01:45:47 slack-pc kernel: [21988.140051] R10: ffff9086655a4008 R11: ffff9086655a5098 R12: 0000000000000020
Jan 13 01:45:47 slack-pc kernel: [21988.140052] R13: 0000000000000000 R14: ffff9086655d2ac8 R15: ffff9086655d2bd0
Jan 13 01:45:47 slack-pc kernel: [21988.140053] FS:  0000000000000000(0000) GS:ffff90894ec00000(0000) knlGS:0000000000000000
Jan 13 01:45:47 slack-pc kernel: [21988.140054] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 01:45:47 slack-pc kernel: [21988.140055] CR2: ffffffffffffffd6 CR3: 000000012e330002 CR4: 00000000001706f0
Jan 13 01:45:47 slack-pc kernel: [21988.140056] Fixing recursive fault but reboot is needed!

Thanks for listening.

@zezaocapoeira, did the bug occur on the 455.xx or the 460.32.03 driver version?

Hello.

@al.piotrowicz

This error occurred in all of the versions I have used:

-NVIDIA-Linux-x86_64-455.23.04.run
-NVIDIA-Linux-x86_64-455.28.run
-NVIDIA-Linux-x86_64-455.38.run
-NVIDIA-Linux-x86_64-455.45.01.run
-NVIDIA-Linux-x86_64-460.27.04.run
-NVIDIA-Linux-x86_64-460.32.03.run (currently installed version)

In my case, this error usually occurs at high uptimes (2+ days), when using Chromium-based browsers.
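A rough way to check the uptime at crash time: the bracketed number in a kernel log line is approximately the number of seconds since boot (time spent suspended may not be counted). A minimal Python sketch, with a made-up helper name `uptime_seconds`, using the first line of two traces from this thread:

```python
import re

def uptime_seconds(line):
    """Return the printk timestamp (seconds since boot) or None if absent."""
    m = re.search(r"\[\s*(\d+\.\d+)\]", line)
    return float(m.group(1)) if m else None

# First oops line from two of the reports in this thread.
samples = {
    "Gentoo 5.4.80": "[428463.273808] BUG: kernel NULL pointer dereference, address: 0000000000000020",
    "Slackware 5.10.7": "Jan 13 01:45:47 slack-pc kernel: [21988.136139] BUG: kernel NULL pointer dereference, address: 0000000000000020",
}

for name, line in samples.items():
    secs = uptime_seconds(line)
    print(f"{name}: {secs / 3600:.1f} h ({secs / 86400:.2f} days)")
```

By this measure the Gentoo trace is from roughly five days of uptime and the Slackware one from about six hours, so long uptime does not look like a strict requirement.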

Thanks for listening.

Thanks for your fast reply, @zezaocapoeira. This is one of the entries in the 460.32.03 changelog:

Improved the memory allocation strategy in nvidia-modeset.ko to reduce the likelihood of out-of-memory errors, which typically manifest as “page allocation failure” messages in the kernel log.

I don't know whether it's related to the issue.

For me, too, the bug triggers only when using a Chrome-based browser.

Hello.

@al.piotrowicz

I checked the google-chrome-stable and vivaldi-3.5.2115.87.1 logs.

They contain errors regarding:

  • Skia shader compilation error

  • Program binary could not be loaded. Binary is not compatible with current driver/hardware combination

And both have the same error output:

...
Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.135951:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8270:8270:0113/152317.148268:ERROR:CONSOLE(0)] "Unchecked runtime.lastError: The message port closed before a response was received.", source: chrome-extension://mpognobbkildjkofajifpdfhcoklimli/browser.html (0)
[8305:8305:0113/152317.391085:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.401681:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.413164:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.424093:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.429104:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.434329:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------

Errors:
Program binary could not be loaded. Binary is not compatible with current driver/hardware combination. Driver build date Dec 27 2020. Please check build information of source that generated the binary.

[8305:8305:0113/152317.452881:ERROR:shared_context_state.cc(74)] Skia shader compilation error
------------------------
...
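To see how often each error shows up, the log can be tallied per process. This is a minimal Python sketch; it assumes the bracketed prefix is laid out as pid:tid:MMDD/HHMMSS.microseconds:LEVEL:source, which is my reading of the lines above rather than something taken from the Chromium documentation, and `count_errors` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed prefix layout: [pid:tid:MMDD/HHMMSS.usec:LEVEL:source] message
LINE_RE = re.compile(
    r"^\[(?P<pid>\d+):(?P<tid>\d+):(?P<date>\d{4})/(?P<time>\d{6}\.\d+):"
    r"(?P<level>[A-Z]+):(?P<source>[^\]]+)\]\s*(?P<msg>.*)$"
)

# Three lines copied from the output above.
SAMPLE = """\
[8305:8305:0113/152317.135951:ERROR:shared_context_state.cc(74)] Skia shader compilation error
[8270:8270:0113/152317.148268:ERROR:CONSOLE(0)] "Unchecked runtime.lastError: The message port closed before a response was received.", source: chrome-extension://mpognobbkildjkofajifpdfhcoklimli/browser.html (0)
[8305:8305:0113/152317.391085:ERROR:shared_context_state.cc(74)] Skia shader compilation error
"""

def count_errors(text):
    """Count ERROR lines per (pid, source); lines without a prefix are skipped."""
    counts = Counter()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and m["level"] == "ERROR":
            counts[(int(m["pid"]), m["source"])] += 1
    return counts

counts = count_errors(SAMPLE)
for (pid, source), n in counts.most_common():
    print(pid, source, n)
```

Feeding it the full log instead of SAMPLE shows whether the Skia errors all come from the same process.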

Thanks for listening.

@zezaocapoeira, please try hardware acceleration in the browser by enabling the flags below and adding an exec switch to the launcher:

/usr/lib/chromium/chromium --use-gl=desktop (--use-gl=egl in the case of wayland)

FLAGS:

  • --enable-accelerated-video-decode
  • --enable-experimental-webassembly-features
  • --enable-gpu-rasterization
  • --enable-webgl-draft-extensions
  • --enable-webgl2-compute-context
  • --enable-zero-copy
  • --ignore-gpu-blocklist
  • --disable-smooth-scrolling

Also please try using the latest nvidia driver 460.32.03

@al.piotrowicz

With the flag --use-gl=desktop, the errors from the previous logs no longer appear:

$ vivaldi --use-gl=desktop --enable-accelerated-video-decode --enable-experimental-webassembly-features --enable-gpu-rasterization --enable-webgl-draft-extensions --enable-webgl2-compute-context --enable-zero-copy --ignore-gpu-blocklist --disable-smooth-scrolling

$ google-chrome --use-gl=desktop --enable-accelerated-video-decode --enable-experimental-webassembly-features --enable-gpu-rasterization --enable-webgl-draft-extensions --enable-webgl2-compute-context --enable-zero-copy --ignore-gpu-blocklist --disable-smooth-scrolling

Thanks for listening.

This still happens for me on 460.32.03-2 on kernel 5.10.6

Jan 13 13:10:49 scout kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Jan 13 13:10:49 scout kernel: #PF: supervisor read access in kernel mode
Jan 13 13:10:49 scout kernel: #PF: error_code(0x0000) - not-present page
Jan 13 13:10:49 scout kernel: PGD 80000001330fb067 P4D 80000001330fb067 PUD 0 
Jan 13 13:10:49 scout kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jan 13 13:10:49 scout kernel: CPU: 4 PID: 605 Comm: irq/51-nvidia Tainted: P           OE     5.10.6-arch1-1 #1
Jan 13 13:10:49 scout kernel: Hardware name: System manufacturer System Product Name/MAXIMUS V GENE, BIOS 0701 03/29/2012
Jan 13 13:10:49 scout kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 13 13:10:49 scout kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 13 13:10:49 scout kernel: RSP: 0018:ffffb9e540a4bc20 EFLAGS: 00010202
Jan 13 13:10:49 scout kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 13 13:10:49 scout kernel: RDX: ffff9b00f9bcde08 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 13 13:10:49 scout kernel: RBP: ffff9b00dfa929f0 R08: ffffffffc2ac4b60 R09: ffff9b00dfa929d0
Jan 13 13:10:49 scout kernel: R10: ffff9b00cbd30008 R11: ffff9b00cbd31098 R12: 0000000000000020
Jan 13 13:10:49 scout kernel: R13: 0000000000000000 R14: ffff9b00dfa92b58 R15: ffff9b00dfa92c98
Jan 13 13:10:49 scout kernel: FS:  0000000000000000(0000) GS:ffff9b03ced00000(0000) knlGS:0000000000000000
Jan 13 13:10:49 scout kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 13:10:49 scout kernel: CR2: 0000000000000020 CR3: 00000001297f4001 CR4: 00000000001706e0
Jan 13 13:10:49 scout kernel: Call Trace:
Jan 13 13:10:49 scout kernel:  ? _nv030766rm+0x1b/0x90 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv026432rm+0x18/0x60 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv012979rm+0x13d/0x1c0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv000081rm+0x12f/0x1a0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv037801rm+0xc3/0x350 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv037800rm+0x63/0x80 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv012906rm+0x78/0xd0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv012906rm+0x1a/0xd0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv025575rm+0x251/0x3e0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv025524rm+0x1f/0xf0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv016719rm+0xd3/0x3c0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv028705rm+0xb23/0xdc0 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv028713rm+0x15d/0x400 [nvidia]
Jan 13 13:10:49 scout kernel:  ? _nv000709rm+0xa9/0x240 [nvidia]
Jan 13 13:10:49 scout kernel:  ? disable_irq_nosync+0x10/0x10
Jan 13 13:10:49 scout kernel:  ? rm_isr_bh+0x1c/0x60 [nvidia]
Jan 13 13:10:49 scout kernel:  ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Jan 13 13:10:49 scout kernel:  ? irq_thread_fn+0x20/0x60
Jan 13 13:10:49 scout kernel:  ? irq_thread+0xf5/0x1a0
Jan 13 13:10:49 scout kernel:  ? irq_finalize_oneshot.part.0+0xe0/0xe0
Jan 13 13:10:49 scout kernel:  ? irq_thread_check_affinity+0xd0/0xd0
Jan 13 13:10:49 scout kernel:  ? kthread+0x133/0x150
Jan 13 13:10:49 scout kernel:  ? __kthread_bind_mask+0x60/0x60
Jan 13 13:10:49 scout kernel:  ? ret_from_fork+0x22/0x30
Jan 13 13:10:49 scout kernel: Modules linked in: nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) nct6775 hwmon_vid snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr intel_rapl_common ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation x86_pkg_t>
Jan 13 13:10:49 scout kernel:  mousedev e1000e syscopyarea snd mei_me ecc sysfillrect lpc_ich soundcore mei sysimgblt fb_sys_fops wmi mac_hid video drm crypto_user fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel serio_raw usbhid xhci_pci xhci_pci_renesas
Jan 13 13:10:49 scout kernel: CR2: 0000000000000020
Jan 13 13:10:49 scout kernel: ---[ end trace aa3b68788dfd2c47 ]---
Jan 13 13:10:49 scout kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 13 13:10:49 scout kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 13 13:10:49 scout kernel: RSP: 0018:ffffb9e540a4bc20 EFLAGS: 00010202
Jan 13 13:10:49 scout kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 13 13:10:49 scout kernel: RDX: ffff9b00f9bcde08 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 13 13:10:49 scout kernel: RBP: ffff9b00dfa929f0 R08: ffffffffc2ac4b60 R09: ffff9b00dfa929d0
Jan 13 13:10:49 scout kernel: R10: ffff9b00cbd30008 R11: ffff9b00cbd31098 R12: 0000000000000020
Jan 13 13:10:49 scout kernel: R13: 0000000000000000 R14: ffff9b00dfa92b58 R15: ffff9b00dfa92c98
Jan 13 13:10:49 scout kernel: FS:  0000000000000000(0000) GS:ffff9b03ced00000(0000) knlGS:0000000000000000
Jan 13 13:10:49 scout kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 13 13:10:49 scout kernel: CR2: 0000000000000020 CR3: 00000001297f4001 CR4: 00000000001706e0
Jan 13 13:10:49 scout kernel: sched: RT throttling activated
Jan 13 13:10:49 scout kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Jan 13 13:10:49 scout kernel: #PF: supervisor read access in kernel mode
Jan 13 13:10:49 scout kernel: #PF: error_code(0x0000) - not-present page
Jan 13 13:10:49 scout kernel: PGD 80000001330fb067 P4D 80000001330fb067 PUD 0 

I got some application call stacks (OpenGL driver interaction, I assume) this time. I'm not sure whether the apps crashed because of the kernel module fault, since I don't know the order of events.

This one shows a libnvidia-glcore stack:

Jan 13 09:38:28 scout systemd-coredump[1177]: Process 1135 (teams) of user 1000 dumped core.
                                              
                                              Stack trace of thread 1175:
                                              #0  0x000055949f4a0112 n/a (teams + 0x48c5112)
                                              #1  0x000055949f4a3fa6 n/a (teams + 0x48c8fa6)
                                              #2  0x00007f1b051f00f0 __restore_rt (libpthread.so.0 + 0x140f0)
                                              #3  0x00007f1b03bafc51 clock_nanosleep@@GLIBC_2.17 (libc.so.6 + 0xc7c51)
                                              #4  0x00007f1b03bb5137 __nanosleep (libc.so.6 + 0xcd137)
                                              #5  0x00007f1b03be0419 usleep (libc.so.6 + 0xf8419)
                                              #6  0x00007f1afae698be n/a (libnvidia-glcore.so.460.32.03 + 0xd708be)
                                              #7  0x00007f1afaea6f32 n/a (libnvidia-glcore.so.460.32.03 + 0xdadf32)
                                              #8  0x00007f1afaeaa37d n/a (libnvidia-glcore.so.460.32.03 + 0xdb137d)
                                              #9  0x00007f1b0135cf1f __glDispatchCheckMultithreaded (libGLdispatch.so.0 + 0x41f1f)
                                              #10 0x00007f1b0130416c glXGetFBConfigs (libGLX.so.0 + 0x1c16c)
                                              #11 0x000055949ef3a6c8 n/a (teams + 0x435f6c8)
                                              #12 0x000055949ef3b134 n/a (teams + 0x4360134)
                                              #13 0x000055949e4fc7f8 n/a (teams + 0x39217f8)
                                              #14 0x000055949e5178c0 n/a (teams + 0x393c8c0)
                                              #15 0x000055949e517e06 n/a (teams + 0x393ce06)
                                              #16 0x000055949e518003 n/a (teams + 0x393d003)
                                              #17 0x000055949e51891a n/a (teams + 0x393d91a)
                                              #18 0x000055949e533dd5 n/a (teams + 0x3958dd5)
                                              #19 0x000055949e56789f n/a (teams + 0x398c89f)
                                              #20 0x000055949e5a7537 n/a (teams + 0x39cc537)
                                              #21 0x00007f1b051e53e9 start_thread (libpthread.so.0 + 0x93e9)
                                              #22 0x00007f1b03be8293 __clone (libc.so.6 + 0x100293)
                                              
                                              Stack trace of thread 1146:
                                              #0  0x00007f1b03be85de epoll_wait (libc.so.6 + 0x1005de)
                                              #1  0x000055949e5b411a n/a (teams + 0x39d911a)
                                              #2  0x000055949e5b1ab3 n/a (teams + 0x39d6ab3)
                                              #3  0x000055949e5a5010 n/a (teams + 0x39ca010)
                                              #4  0x000055949e533dd5 n/a (teams + 0x3958dd5)
                                              #5  0x000055949e56789f n/a (teams + 0x398c89f)
                                              #6  0x000055949e5a7537 n/a (teams + 0x39cc537)
                                              #7  0x00007f1b051e53e9 start_thread (libpthread.so.0 + 0x93e9)
                                              #8  0x00007f1b03be8293 __clone (libc.so.6 + 0x100293)
                                              
                                              Stack trace of thread 1142:
                                              #0  0x00007f1b051eb9c8 pthread_cond_timedwait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf9c8)
                                              #1  0x000055949e59f082 n/a (teams + 0x39c4082)

And this other one:

Jul 12 19:50:53 scout systemd[10635]: pipewire.socket: Succeeded.
...skipping...
                                               #30 0x000055df44118af0 n/a (teams + 0x2f97af0)
                                               #31 0x000055df441188f8 _ZN2v88internal9Execution4CallEPNS0_7IsolateENS0_6HandleINS0_6ObjectEEES6_iPS6_ (teams + 0x2f978f8)
                                               #32 0x000055df444ee274 _ZN2v88Function4CallENS_5LocalINS_7ContextEEENS1_INS_5ValueEEEiPS5_ (teams + 0x336d274)
                                               #33 0x000055df478414e8 n/a (teams + 0x66c04e8)
                                               #34 0x000055df47841721 _ZN4node12MakeCallbackEPN2v87IsolateENS0_5LocalINS0_6ObjectEEENS3_INS0_8FunctionEEEiPNS3_INS0_5ValueEEENS_13async_contextE (teams + 0x66c0721)
                                               #35 0x000055df44a2577d n/a (teams + 0x38a477d)
                                               #36 0x000055df449a945b n/a (teams + 0x382845b)
                                               #37 0x000055df449c775c n/a (teams + 0x384675c)
                                               #38 0x000055df449bc1ba n/a (teams + 0x383b1ba)
                                               #39 0x000055df449b46e1 n/a (teams + 0x38336e1)
                                               #40 0x000055df449b458c n/a (teams + 0x383358c)
                                               #41 0x000055df449b42b5 n/a (teams + 0x38332b5)
                                               #42 0x000055df43b8e9d7 n/a (teams + 0x2a0d9d7)
                                               #43 0x000055df4393085d n/a (teams + 0x27af85d)
                                               #44 0x000055df4502e21b n/a (teams + 0x3ead21b)
                                               #45 0x000055df44aa27f8 n/a (teams + 0x39217f8)
                                               #46 0x000055df44abd8c0 n/a (teams + 0x393c8c0)
                                               #47 0x000055df44abde06 n/a (teams + 0x393ce06)
                                               #48 0x000055df44abe003 n/a (teams + 0x393d003)
                                               #49 0x000055df44ad332f n/a (teams + 0x395232f)
                                               #50 0x00007f3b3210ea84 g_main_context_dispatch (libglib-2.0.so.0 + 0x52a84)
                                               #51 0x00007f3b321629b1 n/a (libglib-2.0.so.0 + 0xa69b1)
                                               #52 0x00007f3b3210d2b1 g_main_context_iteration (libglib-2.0.so.0 + 0x512b1)
                                               #53 0x000055df44abec52 n/a (teams + 0x393dc52)
                                               #54 0x000055df44ad9dd5 n/a (teams + 0x3958dd5)
                                               #55 0x000055df43805d74 n/a (teams + 0x2684d74)
                                               #56 0x000055df43805b33 n/a (teams + 0x2684b33)
                                               #57 0x000055df43808312 n/a (teams + 0x2687312)
                                               #58 0x000055df43801cef n/a (teams + 0x2680cef)
                                               #59 0x000055df44943dec n/a (teams + 0x37c2dec)
                                               #60 0x000055df45c20135 n/a (teams + 0x4a9f135)
                                               #61 0x000055df44941eb1 n/a (teams + 0x37c0eb1)
                                               #62 0x000055df42f5f246 n/a (teams + 0x1dde246)
                                               #63 0x00007f3b30b87152 __libc_start_main (libc.so.6 + 0x28152)
Jan 13 11:06:08 scout systemd[1]: systemd-coredump@2-37850-0.service: Succeeded.

No nvidia-bug-report.sh output, because this causes the system to hard crash; only the hard reset button does anything at that point.
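On the order-of-events question: a minimal sketch, assuming the journal lines are available as plain text, that sorts them by their syslog timestamp. The helper names and the year are assumptions (syslog stamps carry no year). In the excerpts quoted in this post, the teams coredump lines are stamped 09:38:28 and 11:06:08 while the oops is stamped 13:10:49, which would suggest the teams crashes were logged before this particular oops, provided there was no earlier oops the same day.

```python
from datetime import datetime

def parse_stamp(line, year=2021):
    """Parse the leading 'Mon DD HH:MM:SS' syslog stamp of a journal line."""
    month, day, clock = line.split()[:3]
    # Syslog stamps carry no year, so one is assumed here (hypothetical default).
    return datetime.strptime(f"{year} {month.capitalize()} {day} {clock}",
                             "%Y %b %d %H:%M:%S")

def order_events(lines):
    """Return (timestamp, line) pairs sorted oldest first."""
    return sorted((parse_stamp(line), line) for line in lines)

events = order_events([
    "Jan 13 13:10:49 scout kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020",
    "Jan 13 09:38:28 scout systemd-coredump[1177]: Process 1135 (teams) of user 1000 dumped core.",
    "Jan 13 11:06:08 scout systemd[1]: systemd-coredump@2-37850-0.service: Succeeded.",
])
for stamp, line in events:
    # Print the time and the start of the message after the 'host unit:' prefix.
    print(stamp.time(), line.split(": ", 1)[1][:60])
```

Feeding the full journal for that boot through the same function would show whether any oops precedes the first coredump.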

@andreesteve my only current workaround is to stick to the LTS kernel and the 440.100 nvidia driver. Not sure what caused your ‘teams’ crash (segfault, signal 11, or abort, signal 6?). I’m not a code geek, but it looks like something related to the clock_nanosleep call, in conjunction with the recent driver implementation, is what most probably leads to the teams crash. The only way forward is to keep this thread running and wait for a fix.

Previously I was hit by page allocation failures, but those were resolved with the patch for 455 or in 460.
Now I’m getting this error.
Arch Linux, kernels 5.9 and 5.10.
Lenovo ThinkStation, a few years old.
GTX 1660 Super, drivers 455.45.01 and 460.32.03.

Jan 16 11:34:04 mm-station kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Jan 16 11:34:04 mm-station kernel: #PF: supervisor read access in kernel mode
Jan 16 11:34:04 mm-station kernel: #PF: error_code(0x0000) - not-present page
Jan 16 11:34:04 mm-station kernel: PGD 80000002042c5067 P4D 80000002042c5067 PUD 52325c067 PMD 20a8cd067 PTE 0
Jan 16 11:34:04 mm-station kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jan 16 11:34:04 mm-station kernel: CPU: 1 PID: 1164 Comm: irq/35-nvidia Tainted: P           OE     5.10.6-arch1-1 #1
Jan 16 11:34:04 mm-station kernel: Hardware name: LENOVO 30A0XXXXXX/SHARKBAY, BIOS FBKTDEAUS 06/16/2020
Jan 16 11:34:04 mm-station kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 16 11:34:04 mm-station kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 16 11:34:04 mm-station kernel: RSP: 0018:ffff9f1dc079bb60 EFLAGS: 00010202
Jan 16 11:34:04 mm-station kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 16 11:34:04 mm-station kernel: RDX: ffff8db2d52013c8 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 16 11:34:04 mm-station kernel: RBP: ffff8db10c2428c0 R08: ffff8db10c242b30 R09: ffff8db10c2428a0
Jan 16 11:34:04 mm-station kernel: R10: ffff8db10c25c008 R11: ffff8db10c25d098 R12: 0000000000000020
Jan 16 11:34:04 mm-station kernel: R13: 0000000000000000 R14: ffff8db10c242a28 R15: ffff8db10c242b30
Jan 16 11:34:04 mm-station kernel: FS:  0000000000000000(0000) GS:ffff8db7fdc40000(0000) knlGS:0000000000000000
Jan 16 11:34:04 mm-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000020 CR3: 000000027a26a001 CR4: 00000000001706e0
Jan 16 11:34:04 mm-station kernel: Call Trace:
Jan 16 11:34:04 mm-station kernel:  ? _nv030766rm+0x1b/0x90 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv026432rm+0x18/0x60 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv012979rm+0x13d/0x1c0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv000081rm+0x12f/0x1a0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv012910rm+0xff/0x180 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019531rm+0x1af/0x210 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019482rm+0xdf3/0xef0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019483rm+0xf3/0x290 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019484rm+0x12f/0x350 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019485rm+0x1f5/0x320 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019449rm+0x78/0xd0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019463rm+0xcf/0x2f0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019464rm+0x35/0x540 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv019497rm+0xbe/0xe0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv028705rm+0x97b/0xdc0 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv028713rm+0x15d/0x400 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? _nv000709rm+0xa9/0x240 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? disable_irq_nosync+0x10/0x10
Jan 16 11:34:04 mm-station kernel:  ? rm_isr_bh+0x1c/0x60 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Jan 16 11:34:04 mm-station kernel:  ? irq_thread_fn+0x20/0x60
Jan 16 11:34:04 mm-station kernel:  ? irq_thread+0xf5/0x1a0
Jan 16 11:34:04 mm-station kernel:  ? irq_finalize_oneshot.part.0+0xe0/0xe0
Jan 16 11:34:04 mm-station kernel:  ? irq_thread_check_affinity+0xd0/0xd0
Jan 16 11:34:04 mm-station kernel:  ? kthread+0x133/0x150
Jan 16 11:34:04 mm-station kernel:  ? __kthread_bind_mask+0x60/0x60
Jan 16 11:34:04 mm-station kernel:  ? ret_from_fork+0x22/0x30
Jan 16 11:34:04 mm-station kernel: Modules linked in: nvidia_uvm(POE) mei_hdcp mei_wdt 8021q garp mrp stp llc ccm rt2800usb rt2x00usb rt2800lib rt2x00lib mac80211 cfg80211 rfkill libarc4 mousedev nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) snd_hda_codec_realtek intel_rapl_msr intel_rapl_common snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_intel_dspcfg soundwire_intel coretemp xxhash_generic soundwire_generic_allocation kvm_intel soundwire_cadence snd_hda_codec ucsi_ccg iTCO_wdt snd_hda_core typec_ucsi intel_pmc_bxt typec kvm at24 wmi_bmof iTCO_vendor_support snd_hwdep soundwire_bus irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btrfs drm_kms_helper snd_soc_core aesni_intel crypto_simd snd_compress cryptd ac97_bus cec snd_pcm_dmaengine glue_helper snd_pcm rapl tpm_tis blake2b_generic intel_cstate xor syscopyarea tpm_tis_core mei_me raid6_pq snd_timer sysfillrect intel_uncore pcspkr snd i2c_i801 sysimgblt libcrc32c e1000e mei
Jan 16 11:34:04 mm-station kernel:  soundcore fb_sys_fops tpm i2c_nvidia_gpu i2c_smbus wmi lpc_ich rng_core mac_hid video vboxnetflt(OE) vboxnetadp(OE) drm vboxdrv(OE) fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid uas usb_storage crc32c_intel sr_mod cdrom xhci_pci xhci_pci_renesas
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000020
Jan 16 11:34:04 mm-station kernel: ---[ end trace 3438ebc2238aedc5 ]---
Jan 16 11:34:04 mm-station kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 16 11:34:04 mm-station kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 16 11:34:04 mm-station kernel: RSP: 0018:ffff9f1dc079bb60 EFLAGS: 00010202
Jan 16 11:34:04 mm-station kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 16 11:34:04 mm-station kernel: RDX: ffff8db2d52013c8 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 16 11:34:04 mm-station kernel: RBP: ffff8db10c2428c0 R08: ffff8db10c242b30 R09: ffff8db10c2428a0
Jan 16 11:34:04 mm-station kernel: R10: ffff8db10c25c008 R11: ffff8db10c25d098 R12: 0000000000000020
Jan 16 11:34:04 mm-station kernel: R13: 0000000000000000 R14: ffff8db10c242a28 R15: ffff8db10c242b30
Jan 16 11:34:04 mm-station kernel: FS:  0000000000000000(0000) GS:ffff8db7fdc40000(0000) knlGS:0000000000000000
Jan 16 11:34:04 mm-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000020 CR3: 000000027a26a001 CR4: 00000000001706e0
Jan 16 11:34:04 mm-station kernel: BUG: kernel NULL pointer dereference, address: 0000000000000959
Jan 16 11:34:04 mm-station kernel: #PF: supervisor write access in kernel mode
Jan 16 11:34:04 mm-station kernel: #PF: error_code(0x0002) - not-present page
Jan 16 11:34:04 mm-station kernel: PGD 80000002042c5067 P4D 80000002042c5067 PUD 52325c067 PMD 20a8cd067 PTE 0
Jan 16 11:34:04 mm-station kernel: Oops: 0002 [#2] PREEMPT SMP PTI
Jan 16 11:34:04 mm-station kernel: CPU: 1 PID: 1164 Comm: irq/35-nvidia Tainted: P      D    OE     5.10.6-arch1-1 #1
Jan 16 11:34:04 mm-station kernel: Hardware name: LENOVO 30A0XXXXXX/SHARKBAY, BIOS FBKTDEAUS 06/16/2020
Jan 16 11:34:04 mm-station kernel: RIP: 0010:mutex_lock+0x10/0x20
Jan 16 11:34:04 mm-station kernel: Code: 03 31 c0 c3 eb d4 0f 1f 40 00 0f 1f 44 00 00 be 02 00 00 00 e9 a1 fa ff ff 90 0f 1f 44 00 00 31 c0 65 48 8b 14 25 c0 7b 01 00 <f0> 48 0f b1 17 75 01 c3 eb d6 66 0f 1f 44 00 00 0f 1f 44 00 00 41
Jan 16 11:34:04 mm-station kernel: RSP: 0018:ffff9f1dc079be30 EFLAGS: 00010246
Jan 16 11:34:04 mm-station kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Jan 16 11:34:04 mm-station kernel: RDX: ffff8db1033bdc40 RSI: 0000000000001b41 RDI: 0000000000000959
Jan 16 11:34:04 mm-station kernel: RBP: 0000000000000959 R08: 0000000000000000 R09: ffff9f1dc079b7c0
Jan 16 11:34:04 mm-station kernel: R10: ffff9f1dc079b7b8 R11: ffffffff91ecb228 R12: ffff8db1033be434
Jan 16 11:34:04 mm-station kernel: R13: 0000000000000001 R14: 0000000000000001 R15: ffff8db1033bdc40
Jan 16 11:34:04 mm-station kernel: FS:  0000000000000000(0000) GS:ffff8db7fdc40000(0000) knlGS:0000000000000000
Jan 16 11:34:04 mm-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000959 CR3: 000000027a26a001 CR4: 00000000001706e0
Jan 16 11:34:04 mm-station kernel: Call Trace:
Jan 16 11:34:04 mm-station kernel:  perf_event_exit_task+0x30/0x440
Jan 16 11:34:04 mm-station kernel:  do_exit+0x355/0xa40
Jan 16 11:34:04 mm-station kernel:  ? task_work_run+0x5c/0x90
Jan 16 11:34:04 mm-station kernel:  ? do_exit+0x345/0xa40
Jan 16 11:34:04 mm-station kernel:  ? kthread+0x133/0x150
Jan 16 11:34:04 mm-station kernel:  ? rewind_stack_do_exit+0x17/0x17
Jan 16 11:34:04 mm-station kernel: Modules linked in: nvidia_uvm(POE) mei_hdcp mei_wdt 8021q garp mrp stp llc ccm rt2800usb rt2x00usb rt2800lib rt2x00lib mac80211 cfg80211 rfkill libarc4 mousedev nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) snd_hda_codec_realtek intel_rapl_msr intel_rapl_common snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_intel_dspcfg soundwire_intel coretemp xxhash_generic soundwire_generic_allocation kvm_intel soundwire_cadence snd_hda_codec ucsi_ccg iTCO_wdt snd_hda_core typec_ucsi intel_pmc_bxt typec kvm at24 wmi_bmof iTCO_vendor_support snd_hwdep soundwire_bus irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btrfs drm_kms_helper snd_soc_core aesni_intel crypto_simd snd_compress cryptd ac97_bus cec snd_pcm_dmaengine glue_helper snd_pcm rapl tpm_tis blake2b_generic intel_cstate xor syscopyarea tpm_tis_core mei_me raid6_pq snd_timer sysfillrect intel_uncore pcspkr snd i2c_i801 sysimgblt libcrc32c e1000e mei
Jan 16 11:34:04 mm-station kernel:  soundcore fb_sys_fops tpm i2c_nvidia_gpu i2c_smbus wmi lpc_ich rng_core mac_hid video vboxnetflt(OE) vboxnetadp(OE) drm vboxdrv(OE) fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid uas usb_storage crc32c_intel sr_mod cdrom xhci_pci xhci_pci_renesas
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000959
Jan 16 11:34:04 mm-station kernel: ---[ end trace 3438ebc2238aedc6 ]---
Jan 16 11:34:04 mm-station kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
Jan 16 11:34:04 mm-station kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2
Jan 16 11:34:04 mm-station kernel: RSP: 0018:ffff9f1dc079bb60 EFLAGS: 00010202
Jan 16 11:34:04 mm-station kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
Jan 16 11:34:04 mm-station kernel: RDX: ffff8db2d52013c8 RSI: ffffffffffffffff RDI: 0000000000000020
Jan 16 11:34:04 mm-station kernel: RBP: ffff8db10c2428c0 R08: ffff8db10c242b30 R09: ffff8db10c2428a0
Jan 16 11:34:04 mm-station kernel: R10: ffff8db10c25c008 R11: ffff8db10c25d098 R12: 0000000000000020
Jan 16 11:34:04 mm-station kernel: R13: 0000000000000000 R14: ffff8db10c242a28 R15: ffff8db10c242b30
Jan 16 11:34:04 mm-station kernel: FS:  0000000000000000(0000) GS:ffff8db7fdc40000(0000) knlGS:0000000000000000
Jan 16 11:34:04 mm-station kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 16 11:34:04 mm-station kernel: CR2: 0000000000000959 CR3: 000000027a26a001 CR4: 00000000001706e0
Jan 16 11:34:04 mm-station kernel: Fixing recursive fault but reboot is needed!
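
For anyone comparing traces, here is a rough sketch of how the `Code:` line can be lined up with the `RIP:` offset. The function and variable names are mine, not anything from the driver or the kernel. The byte marked with `<..>` is the one at RIP, and `+0x9` in `_nv028498rm+0x9/0x90` says the function entry is 9 bytes before it.

```python
import re

def split_code_line(line):
    """Split an oops 'Code:' line into the bytes before RIP and the bytes from RIP on."""
    tokens = line.split("Code:", 1)[1].split()
    # The kernel wraps the byte at RIP in angle brackets, e.g. '<48>'.
    marked = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = bytes(int(t.strip("<>"), 16) for t in tokens)
    return raw[:marked], raw[marked:]

def as_hex(data):
    return " ".join(f"{b:02x}" for b in data)

# Copied from the trace above.
CODE = ("Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f "
        "84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 "
        "31 c0 48 85 d2 75 0e eb 2b 0f 1f 00 48 8b 52 10 48 85 d2")
RIP = "RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]"

before, at_rip = split_code_line(CODE)
offset = int(re.search(r"\+0x([0-9a-f]+)/", RIP).group(1), 16)
prologue = before[-offset:]  # bytes from function entry up to the faulting instruction
print("entry..fault:", as_hex(prologue))
print("faulting:    ", as_hex(at_rip[:3]))
```

If I decode these by hand correctly, the entry bytes are `sub $0x8,%rsp; test %rdi,%rdi; je ...` and the faulting instruction is `mov (%rdi),%rdx` with RDI = 0x20. So the function seems to check its argument for NULL but is handed 0x20, which passes the check. My guess (only a guess) is that a caller computed the address of a field at offset 0x20 inside a NULL structure pointer.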

I was tired of having nvidia-bug-report.sh hang on me even when running with --safe-mode, so I ran the script under strace to find out why it hangs. Judging by the (incomplete) strace log file, the script hangs while trying to read /proc/driver/nvidia/./gpus/0000:01:00.0/power:

[pid  2028] openat(AT_FDCWD, "/proc/driver/nvidia/./gpus/0000:01:00.0/power", O_RDONLY) = 3
[pid  2028] fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid  2028] fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
[pid  2028] mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f57d0e64000
[pid  2028] read(3, 

Here’s the command I used to capture the strace log (captured via SSH, because everything freezes and I’m unable to even switch to a TTY):

$ sudo strace -ff nvidia-bug-report.sh --safe-mode --extra-system-data 2>&1 | tee -a strace.log

And here’s the strace log itself: strace.log (614.8 KB)
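
In case it helps someone go through a similar log: a small sketch, under the assumption that the log has the `[pid N] syscall(...) = result` layout shown above, that lists calls which never printed a result and maps their file descriptor back to the path from the matching `openat`. The names are made up for illustration, and it does not handle a descriptor being closed and reused.

```python
import re

OPEN_RE = re.compile(r'openat\([^,]+, "([^"]+)".*\) = (\d+)$')
CALL_RE = re.compile(r'^(?:\[pid\s+(\d+)\] )?(\w+)\((\d+)?')

def find_stuck_calls(lines):
    """Return (pid, syscall, path) for calls that never printed ' = result'."""
    fd_paths = {}  # (pid, fd) -> path, learned from successful openat calls
    stuck = []
    for line in lines:
        line = line.rstrip()
        m = CALL_RE.match(line)
        if not m:
            continue  # signals, exit markers, '<... resumed>' lines
        pid, name, first_arg = m.groups()
        opened = OPEN_RE.search(line)
        if opened:
            fd_paths[(pid, opened.group(2))] = opened.group(1)
        elif " = " not in line and "<unfinished" not in line:
            # The call started but the log ends before any return value.
            stuck.append((pid, name, fd_paths.get((pid, first_arg))))
    return stuck

sample = [
    '[pid  2028] openat(AT_FDCWD, "/proc/driver/nvidia/./gpus/0000:01:00.0/power", O_RDONLY) = 3',
    '[pid  2028] fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0',
    '[pid  2028] fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0',
    '[pid  2028] mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f57d0e64000',
    '[pid  2028] read(3, ',
]
print(find_stuck_calls(sample))
```

On the excerpt above this points at the `read` on the `power` file, which matches what I saw by eye.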

And again, the driver crash happened while I was using Chromium, more specifically, watching a random Facebook video. This seems like the most random bug too, because I had literally just rebooted my computer, then I opened Chromium, watched the video for a minute and it crashed. So I tried to manually reproduce the crash again, repeating step-by-step, but I wasn’t able to!!!

The crash:

jan 17 05:20:08 arch kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
jan 17 05:20:08 arch kernel: #PF: supervisor read access in kernel mode
jan 17 05:20:08 arch kernel: #PF: error_code(0x0000) - not-present page
jan 17 05:20:08 arch kernel: PGD 800000012c756067 P4D 800000012c756067 PUD 0 
jan 17 05:20:08 arch kernel: Oops: 0000 [#1] PREEMPT SMP PTI
jan 17 05:20:08 arch kernel: CPU: 2 PID: 215 Comm: irq/29-nvidia Tainted: P           OE     5.10.7-arch1-1 #1
jan 17 05:20:08 arch kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B75M-DGS R2.0, BIOS P1.50 03/14/2018
jan 17 05:20:08 arch kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
jan 17 05:20:08 arch kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75>
jan 17 05:20:08 arch kernel: RSP: 0018:ffff9fddc359bc20 EFLAGS: 00010202
jan 17 05:20:08 arch kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
jan 17 05:20:08 arch kernel: RDX: ffff89f868588908 RSI: ffffffffffffffff RDI: 0000000000000020
jan 17 05:20:08 arch kernel: RBP: ffff89f8129f5990 R08: ffffffffc2152b60 R09: ffff89f8129f5970
jan 17 05:20:08 arch kernel: R10: ffff89f812974008 R11: ffff89f812975098 R12: 0000000000000020
jan 17 05:20:08 arch kernel: R13: 0000000000000000 R14: ffff89f8129f5af8 R15: ffff89f8129f5c00
jan 17 05:20:08 arch kernel: FS:  0000000000000000(0000) GS:ffff89f915d00000(0000) knlGS:0000000000000000
jan 17 05:20:08 arch kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jan 17 05:20:08 arch kernel: CR2: 0000000000000020 CR3: 000000012b102004 CR4: 00000000001706e0
jan 17 05:20:08 arch kernel: Call Trace:
jan 17 05:20:08 arch kernel:  ? _nv030766rm+0x1b/0x90 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv026432rm+0x18/0x60 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv012979rm+0x13d/0x1c0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv000081rm+0x12f/0x1a0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv012910rm+0xff/0x180 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv019531rm+0x1af/0x210 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv019482rm+0xdf3/0xef0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv019449rm+0x78/0xd0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv019463rm+0xcf/0x2f0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv019497rm+0xbe/0xe0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv028705rm+0x97b/0xdc0 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv028713rm+0x15d/0x400 [nvidia]
jan 17 05:20:08 arch kernel:  ? _nv000709rm+0xa9/0x240 [nvidia]
jan 17 05:20:08 arch kernel:  ? disable_irq_nosync+0x10/0x10
jan 17 05:20:08 arch kernel:  ? rm_isr_bh+0x1c/0x60 [nvidia]
jan 17 05:20:08 arch kernel:  ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
jan 17 05:20:08 arch kernel:  ? irq_thread_fn+0x20/0x60
jan 17 05:20:08 arch kernel:  ? irq_thread+0xf5/0x1a0
jan 17 05:20:08 arch kernel:  ? irq_finalize_oneshot.part.0+0xe0/0xe0
jan 17 05:20:08 arch kernel:  ? irq_thread_check_affinity+0xd0/0xd0
jan 17 05:20:08 arch kernel:  ? kthread+0x133/0x150
jan 17 05:20:08 arch kernel:  ? __kthread_bind_mask+0x60/0x60
jan 17 05:20:08 arch kernel:  ? ret_from_fork+0x22/0x30
jan 17 05:20:08 arch kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common snd_hda_c>
jan 17 05:20:08 arch kernel:  xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6>
jan 17 05:20:08 arch kernel: CR2: 0000000000000020
jan 17 05:20:08 arch kernel: ---[ end trace 2771d77a04395ec1 ]---
jan 17 05:20:08 arch kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
jan 17 05:20:08 arch kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75>
jan 17 05:20:08 arch kernel: RSP: 0018:ffff9fddc359bc20 EFLAGS: 00010202
jan 17 05:20:08 arch kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
jan 17 05:20:08 arch kernel: RDX: ffff89f868588908 RSI: ffffffffffffffff RDI: 0000000000000020
jan 17 05:20:08 arch kernel: RBP: ffff89f8129f5990 R08: ffffffffc2152b60 R09: ffff89f8129f5970
jan 17 05:20:08 arch kernel: R10: ffff89f812974008 R11: ffff89f812975098 R12: 0000000000000020
jan 17 05:20:08 arch kernel: R13: 0000000000000000 R14: ffff89f8129f5af8 R15: ffff89f8129f5c00
jan 17 05:20:08 arch kernel: FS:  0000000000000000(0000) GS:ffff89f915d00000(0000) knlGS:0000000000000000
jan 17 05:20:08 arch kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jan 17 05:20:08 arch kernel: CR2: 0000000000000020 CR3: 000000012b102004 CR4: 00000000001706e0
jan 17 05:20:08 arch kernel: BUG: kernel NULL pointer dereference, address: 0000000000000959
jan 17 05:20:08 arch kernel: #PF: supervisor write access in kernel mode
jan 17 05:20:08 arch kernel: #PF: error_code(0x0002) - not-present page
jan 17 05:20:08 arch kernel: PGD 800000012c756067 P4D 800000012c756067 PUD 0 
jan 17 05:20:08 arch kernel: Oops: 0002 [#2] PREEMPT SMP PTI
jan 17 05:20:08 arch kernel: CPU: 2 PID: 215 Comm: irq/29-nvidia Tainted: P      D    OE     5.10.7-arch1-1 #1
jan 17 05:20:08 arch kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B75M-DGS R2.0, BIOS P1.50 03/14/2018
jan 17 05:20:08 arch kernel: RIP: 0010:mutex_lock+0x10/0x20
jan 17 05:20:08 arch kernel: Code: 03 31 c0 c3 eb d4 0f 1f 40 00 0f 1f 44 00 00 be 02 00 00 00 e9 a1 fa ff ff 90 0f 1f 44 00 00 31 c0 65 48 8b 14 25 c0 7b 01 00 <f0> 48 0f b1 17 75 01 c3 eb>
jan 17 05:20:08 arch kernel: RSP: 0018:ffff9fddc359be30 EFLAGS: 00010246
jan 17 05:20:08 arch kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
jan 17 05:20:08 arch kernel: RDX: ffff89f812e59ec0 RSI: 0000000000001b41 RDI: 0000000000000959
jan 17 05:20:08 arch kernel: RBP: 0000000000000959 R08: 0000000000000001 R09: 0000000000000000
jan 17 05:20:08 arch kernel: R10: ffff89f812a73c00 R11: 0000000000000000 R12: ffff89f812e5a6b4
jan 17 05:20:08 arch kernel: R13: 0000000000000001 R14: 0000000000000001 R15: ffff89f812e59ec0
jan 17 05:20:08 arch kernel: FS:  0000000000000000(0000) GS:ffff89f915d00000(0000) knlGS:0000000000000000
jan 17 05:20:08 arch kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jan 17 05:20:08 arch kernel: CR2: 0000000000000959 CR3: 000000012b102004 CR4: 00000000001706e0
jan 17 05:20:08 arch kernel: Call Trace:
jan 17 05:20:08 arch kernel:  perf_event_exit_task+0x30/0x440
jan 17 05:20:08 arch kernel:  ? kfree+0x40c/0x440
jan 17 05:20:08 arch kernel:  do_exit+0x355/0xa40
jan 17 05:20:08 arch kernel:  ? task_work_run+0x5c/0x90
jan 17 05:20:08 arch kernel:  ? do_exit+0x345/0xa40
jan 17 05:20:08 arch kernel:  ? kthread+0x133/0x150
jan 17 05:20:08 arch kernel:  ? rewind_stack_do_exit+0x17/0x17
jan 17 05:20:08 arch kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common snd_hda_c>
jan 17 05:20:08 arch kernel:  xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6>
jan 17 05:20:08 arch kernel: CR2: 0000000000000959
jan 17 05:20:08 arch kernel: ---[ end trace 2771d77a04395ec2 ]---
jan 17 05:20:08 arch kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]
jan 17 05:20:08 arch kernel: Code: 8e ff e8 8a af 00 00 31 c0 48 83 c4 08 c3 31 c0 eb bf 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 85 ff 74 57 <48> 8b 17 31 c0 48 85 d2 75>
jan 17 05:20:08 arch kernel: RSP: 0018:ffff9fddc359bc20 EFLAGS: 00010202
jan 17 05:20:08 arch kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000010
jan 17 05:20:08 arch kernel: RDX: ffff89f868588908 RSI: ffffffffffffffff RDI: 0000000000000020
jan 17 05:20:08 arch kernel: RBP: ffff89f8129f5990 R08: ffffffffc2152b60 R09: ffff89f8129f5970
jan 17 05:20:08 arch kernel: R10: ffff89f812974008 R11: ffff89f812975098 R12: 0000000000000020
jan 17 05:20:08 arch kernel: R13: 0000000000000000 R14: ffff89f8129f5af8 R15: ffff89f8129f5c00
jan 17 05:20:08 arch kernel: FS:  0000000000000000(0000) GS:ffff89f915d00000(0000) knlGS:0000000000000000
jan 17 05:20:08 arch kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jan 17 05:20:08 arch kernel: CR2: 0000000000000959 CR3: 000000012b102004 CR4: 00000000001706e0
jan 17 05:20:08 arch kernel: Fixing recursive fault but reboot is needed!
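
A quick way to see that the log above contains two separate oopses: the sketch below (function names are my own) pulls out the fault address, the page-fault error code and the first `RIP:` of each report. The error-code bits follow the standard x86 layout (bit 0 = page present, bit 1 = write, bit 2 = user mode), which agrees with the `#PF:` lines the kernel prints.

```python
import re

def decode_pf_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x1),  # False = not-present page
        "write": bool(code & 0x2),    # False = read access
        "user": bool(code & 0x4),     # False = supervisor (kernel) mode
    }

def summarize_oopses(lines):
    """Collect fault address, error code and first RIP for each oops report."""
    reports = []
    for line in lines:
        if m := re.search(r"BUG: kernel NULL pointer dereference, address: ([0-9a-f]+)", line):
            reports.append({"address": int(m.group(1), 16)})
        elif not reports:
            continue
        elif m := re.search(r"Oops: ([0-9a-f]{4}) \[#(\d+)\]", line):
            reports[-1]["index"] = int(m.group(2))
            reports[-1]["error"] = decode_pf_error(int(m.group(1), 16))
        elif (m := re.search(r"RIP: 0010:(\S+)", line)) and "rip" not in reports[-1]:
            # Keep only the first RIP; the kernel reprints registers after 'end trace'.
            reports[-1]["rip"] = m.group(1)
    return reports

sample = [
    "jan 17 05:20:08 arch kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020",
    "jan 17 05:20:08 arch kernel: Oops: 0000 [#1] PREEMPT SMP PTI",
    "jan 17 05:20:08 arch kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]",
    "jan 17 05:20:08 arch kernel: BUG: kernel NULL pointer dereference, address: 0000000000000959",
    "jan 17 05:20:08 arch kernel: Oops: 0002 [#2] PREEMPT SMP PTI",
    "jan 17 05:20:08 arch kernel: RIP: 0010:mutex_lock+0x10/0x20",
    "jan 17 05:20:08 arch kernel: RIP: 0010:_nv028498rm+0x9/0x90 [nvidia]",
]
for report in summarize_oopses(sample):
    print(report)
```

My reading, which may be wrong: oops #1 (a read at 0x20 inside the nvidia module) is the actual bug, and oops #2 (a write at 0x959 in mutex_lock, reached via do_exit and perf_event_exit_task on the same irq/29-nvidia thread) is fallout from the kernel tearing that thread down afterwards.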

My System Information

OS: Arch Linux
Kernel: Linux arch 5.10.7-arch1-1 #1 SMP PREEMPT Wed, 13 Jan 2021 12:02:01 +0000 x86_64 GNU/Linux
Kernel boot flags:

quiet splash loglevel=3 rd.systemd.show_status=auto rd.udev.log_priority=3 intel_pstate=passive nvidia-drm.modeset=1

GPU: NVIDIA GTX 660
Chromium: 87.0.4280.141
Desktop Environment: GNOME 3.38.3 (X11)
Window Manager: mutter 3.38.3
/etc/modprobe.d/nvidia.conf:

options nvidia NVreg_UsePageAttributeTable=1

MODULES in /etc/mkinitcpio.conf:

MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm)

~/.config/chromium-flags.conf: chromium-flags.conf (2.3 KB)