asm86
January 27, 2021, 9:04pm
144
This is still not fixed in 460.39, I experienced this crash within minutes of watching hardware accelerated video in Kodi, having just installed the latest driver earlier today.
[12161.387225] BUG: kernel NULL pointer dereference, address: 0000000000000008
[12161.387237] #PF: supervisor read access in kernel mode
[12161.387243] #PF: error_code(0x0000) - not-present page
[12161.387248] PGD 800000010ad54067 P4D 800000010ad54067 PUD 10ad53067 PMD 0
[12161.387264] Oops: 0000 [#1 ] PREEMPT SMP PTI
[12161.387273] CPU: 1 PID: 629 Comm: irq/155-nvidia Tainted: P O 5.10.11-asm #1
[12161.387278] Hardware name: System manufacturer System Product Name/ROG STRIX Z370-F GAMING, BIOS 2401 07/12/2019
[12161.388302] RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
[12161.388315] Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
[12161.388321] RSP: 0000:ffff98b8c0b1bc40 EFLAGS: 00010246
[12161.388330] RAX: 0000000000000001 RBX: ffff949f639fab88 RCX: 0000000000000010
[12161.388335] RDX: ffff94a08cc709d8 RSI: 00000000004789f2 RDI: ffff94a0cd3e5b70
[12161.388341] RBP: ffff949f639fab10 R08: ffff949f639fabd0 R09: ffff949f639faa60
[12161.388346] R10: ffff949f49ba0008 R11: ffff949f49ba1098 R12: 0000000000000000
[12161.388351] R13: 0000000000000001 R14: 00000000beef0003 R15: ffff949f639fabd0
[12161.388357] FS: 0000000000000000(0000) GS:ffff94aa56a40000(0000) knlGS:0000000000000000
[12161.388363] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12161.388368] CR2: 0000000000000008 CR3: 000000010ad50003 CR4: 00000000003706e0
[12161.388374] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12161.388379] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[12161.388383] Call Trace:
[12161.389367] ? _nv000082rm+0x16c/0x1e0 [nvidia]
[12161.390602] ? _nv012946rm+0xff/0x180 [nvidia]
[12161.391782] ? _nv019582rm+0x1af/0x210 [nvidia]
[12161.392970] ? _nv019533rm+0xdf2/0xef0 [nvidia]
[12161.394131] ? _nv019534rm+0xf3/0x290 [nvidia]
[12161.395281] ? _nv019500rm+0x78/0xd0 [nvidia]
[12161.396423] ? _nv019514rm+0xcf/0x2f0 [nvidia]
[12161.397580] ? _nv019548rm+0xbe/0xe0 [nvidia]
[12161.398775] ? _nv028760rm+0x97b/0xdc0 [nvidia]
[12161.399977] ? _nv028768rm+0x15d/0x400 [nvidia]
[12161.400779] ? _nv000710rm+0xa9/0x240 [nvidia]
[12161.400792] ? disable_irq_nosync+0x10/0x10
[12161.401580] ? rm_isr_bh+0x1c/0x60 [nvidia]
[12161.402292] ? nvidia_isr_kthread_bh+0x16/0x30 [nvidia]
[12161.402303] ? irq_thread_fn+0x1b/0x60
[12161.402310] ? irq_thread+0xde/0x180
[12161.402317] ? irq_finalize_oneshot.part.0+0xe0/0xe0
[12161.402326] ? irq_thread_check_affinity+0xa0/0xa0
[12161.402333] ? kthread+0x129/0x150
[12161.402339] ? __kthread_bind_mask+0x60/0x60
[12161.402347] ? ret_from_fork+0x22/0x30
[12161.402353] Modules linked in: udp_diag tcp_diag inet_diag rfcomm cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth rfkill input_leds ecdh_generic mousedev ecc nvidia_drm(PO) nvidia_modeset(PO) xt_conntrack xt_tcpudp iptable_nat nf_nat nvidia(PO) iptable_filter xt_LOG nf_conntrack nf_defrag_ipv4 libcrc32c intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_generic kvm ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core aesni_intel drm_kms_helper snd_pcm crypto_simd cryptd glue_helper snd_timer rapl snd syscopyarea sysfillrect intel_cstate sysimgblt e1000e intel_uncore soundcore fb_sys_fops bridge stp llc evdev drm sg i2c_core ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sr_mod cdrom sd_mod ahci crc32c_intel libahci libata xhci_pci scsi_mod xhci_hcd tun vfio_pci
[12161.402530] irqbypass vfio_virqfd vfio_iommu_type1 vfio ntfs fuse
[12161.402551] CR2: 0000000000000008
[12161.402558] ---[ end trace 2e0b4e403e9b9fad ]---
[12161.403454] RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
[12161.403465] Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
[12161.403470] RSP: 0000:ffff98b8c0b1bc40 EFLAGS: 00010246
[12161.403478] RAX: 0000000000000001 RBX: ffff949f639fab88 RCX: 0000000000000010
[12161.403484] RDX: ffff94a08cc709d8 RSI: 00000000004789f2 RDI: ffff94a0cd3e5b70
[12161.403488] RBP: ffff949f639fab10 R08: ffff949f639fabd0 R09: ffff949f639faa60
[12161.403492] R10: ffff949f49ba0008 R11: ffff949f49ba1098 R12: 0000000000000000
[12161.403496] R13: 0000000000000001 R14: 00000000beef0003 R15: ffff949f639fabd0
[12161.403504] FS: 0000000000000000(0000) GS:ffff94aa56a40000(0000) knlGS:0000000000000000
[12161.403508] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12161.403513] CR2: 0000000000000008 CR3: 000000010ad50003 CR4: 00000000003706e0
[12161.403517] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12161.403523] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[12161.403572] BUG: kernel NULL pointer dereference, address: 0000000000000009
[12161.403578] #PF: supervisor instruction fetch in kernel mode
[12161.403581] #PF: error_code(0x0010) - not-present page
[12161.403587] PGD 800000010ad54067 P4D 800000010ad54067 PUD 10ad53067 PMD 0
[12161.403600] Oops: 0010 [#2 ] PREEMPT SMP PTI
[12161.403608] CPU: 1 PID: 629 Comm: irq/155-nvidia Tainted: P D O 5.10.11-asm #1
[12161.403613] Hardware name: System manufacturer System Product Name/ROG STRIX Z370-F GAMING, BIOS 2401 07/12/2019
[12161.403618] RIP: 0010:0x9
[12161.403626] Code: Unable to access opcode bytes at RIP 0xffffffffffffffdf.
[12161.403631] RSP: 0000:ffff98b8c0b1bec8 EFLAGS: 00010286
[12161.403637] RAX: 0000000000000009 RBX: ffffffff9a66c1f8 RCX: 0000000000000000
[12161.403641] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b8c0b1bec8
[12161.403645] RBP: ffff949f41099e00 R08: 0000000000000000 R09: ffff98b8c0b1b920
[12161.403651] R10: ffff98b8c0b1b918 R11: ffffffff9b4a4168 R12: ffff949f4109a53c
[12161.403654] R13: 0000000000000008 R14: 0000000000000000 R15: 0000000000000000
[12161.403659] FS: 0000000000000000(0000) GS:ffff94aa56a40000(0000) knlGS:0000000000000000
[12161.403664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12161.403669] CR2: ffffffffffffffdf CR3: 000000010ad50003 CR4: 00000000003706e0
[12161.403673] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12161.403677] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[12161.403681] Call Trace:
[12161.403690] ? task_work_run+0x57/0x80
[12161.403698] ? do_exit+0x2fe/0x980
[12161.403704] ? kthread+0x129/0x150
[12161.403711] ? rewind_stack_do_exit+0x17/0x17
[12161.403717] Modules linked in: udp_diag tcp_diag inet_diag rfcomm cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth rfkill input_leds ecdh_generic mousedev ecc nvidia_drm(PO) nvidia_modeset(PO) xt_conntrack xt_tcpudp iptable_nat nf_nat nvidia(PO) iptable_filter xt_LOG nf_conntrack nf_defrag_ipv4 libcrc32c intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_generic kvm ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core aesni_intel drm_kms_helper snd_pcm crypto_simd cryptd glue_helper snd_timer rapl snd syscopyarea sysfillrect intel_cstate sysimgblt e1000e intel_uncore soundcore fb_sys_fops bridge stp llc evdev drm sg i2c_core ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sr_mod cdrom sd_mod ahci crc32c_intel libahci libata xhci_pci scsi_mod xhci_hcd tun vfio_pci
[12161.403875] irqbypass vfio_virqfd vfio_iommu_type1 vfio ntfs fuse
[12161.403893] CR2: 0000000000000009
[12161.403898] ---[ end trace 2e0b4e403e9b9fae ]---
[12161.404785] RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
[12161.404792] Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
[12161.404798] RSP: 0000:ffff98b8c0b1bc40 EFLAGS: 00010246
[12161.404804] RAX: 0000000000000001 RBX: ffff949f639fab88 RCX: 0000000000000010
[12161.404808] RDX: ffff94a08cc709d8 RSI: 00000000004789f2 RDI: ffff94a0cd3e5b70
[12161.404813] RBP: ffff949f639fab10 R08: ffff949f639fabd0 R09: ffff949f639faa60
[12161.404818] R10: ffff949f49ba0008 R11: ffff949f49ba1098 R12: 0000000000000000
[12161.404822] R13: 0000000000000001 R14: 00000000beef0003 R15: ffff949f639fabd0
[12161.404827] FS: 0000000000000000(0000) GS:ffff94aa56a40000(0000) knlGS:0000000000000000
[12161.404832] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12161.404837] CR2: ffffffffffffffdf CR3: 000000010ad50003 CR4: 00000000003706e0
[12161.404841] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12161.404845] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[12161.404849] Fixing recursive fault but reboot is needed!
Will have to go back now to kernel 5.4 and 440 series yet again as it seems the only stable driver. Is this ever going to get fixed?
New bug report attached. Hopefully it can help this time.
nvidia-bug-report.log.gz (66.2 KB)
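For comparing reports, the identifying fields of the first Oops above can be pulled out with a short script. This is only a sketch: `summarize_oops` and the `OOPS` excerpt are illustrative names, not part of any NVIDIA or kernel tool, and the excerpt is copied from the log in this post. It also checks that the fault address in CR2 equals R12 + 8, which is consistent with the faulting bytes `49 8b 7c 24 08` (they appear to decode to `mov 0x8(%r12),%rdi`) reading through a NULL R12.

```python
import re

# Excerpt copied from the first Oops in the log above (only the lines used here).
OOPS = """\
[12161.387225] BUG: kernel NULL pointer dereference, address: 0000000000000008
[12161.388302] RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
[12161.388346] R10: ffff949f49ba0008 R11: ffff949f49ba1098 R12: 0000000000000000
[12161.388368] CR2: 0000000000000008 CR3: 000000010ad50003 CR4: 00000000003706e0
"""

RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+ \[(\w+)\]")
ADDR_RE = re.compile(r"address: ([0-9a-f]+)")
REG_RE = re.compile(r"\b(CR\d|R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def summarize_oops(text):
    """Hypothetical helper: summarize the first Oops found in a dmesg excerpt."""
    rip = RIP_RE.search(text)
    addr = ADDR_RE.search(text)
    if rip is None or addr is None:
        return None
    regs = {}
    for name, value in REG_RE.findall(text):
        # Keep the first occurrence so later, secondary Oopses do not overwrite it.
        regs.setdefault(name, int(value, 16))
    return {
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
        "module": rip.group(3),
        "fault_address": int(addr.group(1), 16),
        "regs": regs,
    }


if __name__ == "__main__":
    info = summarize_oops(OOPS)
    print(info["module"], info["symbol"], hex(info["offset"]))
    # True when the faulting read was at offset 8 from a NULL R12.
    print(info["regs"]["CR2"] == info["regs"]["R12"] + 8)
```

The same symbol and offset (`_nv013013rm+0xd8`) show up in the other logs in this thread, so matching on those two fields is a reasonable way to tell whether a new crash is this bug or a different one.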
harrisl
January 27, 2021, 10:02pm
145
I have been running the new driver with 5.10.11 for 3 hours without a crash. I never survived more than an hour with any driver after 450.80. The new driver may have fixed this issue for me.
Just had it happen again on 5.10.10 (arch-linux) with driver 460.39
VaporD
January 28, 2021, 5:53pm
147
For some reason, I haven't had this happen in months. I did add the following to my ~/.config/chrome-flags.conf to try to solve a different issue. I don't know if it also helped with this issue or if it was something else, but I thought I would throw it out there, since nothing else is working and maybe someone would like to try:
--use-cmd-decoder=validating --use-gl=desktop
This has been an issue for almost 5 months at this point. Can we potentially get a fix?
abelits
January 29, 2021, 2:37am
149
Is 460.39 supposed to include this fix?
Still not fixed on 460.39 and kernel 5.10.11. I'm running Arch Linux and was watching a video in VLC.
Jan 29 12:58:02 GamingPC kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Jan 29 12:58:02 GamingPC kernel: #PF: supervisor read access in kernel mode
Jan 29 12:58:02 GamingPC kernel: #PF: error_code(0x0000) - not-present page
Jan 29 12:58:04 GamingPC kernel: PGD 80000002dd4f1067 P4D 80000002dd4f1067 PUD 0
Jan 29 12:58:04 GamingPC kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jan 29 12:58:04 GamingPC kernel: CPU: 0 PID: 611 Comm: irq/143-nvidia Tainted: P OE 5.10.11-zen2-1-zen #1
Jan 29 12:58:04 GamingPC kernel: Hardware name: System manufacturer System Product Name/Z170I PRO GAMING, BIOS 3805 05/16/2018
Jan 29 12:58:04 GamingPC kernel: RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
Jan 29 12:58:04 GamingPC kernel: Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
Jan 29 12:58:04 GamingPC kernel: RSP: 0018:ffffabc341c13c48 EFLAGS: 00010246
Jan 29 12:58:04 GamingPC kernel: RAX: 0000000000000001 RBX: ffff99fe21215c18 RCX: 0000000000000010
Jan 29 12:58:04 GamingPC kernel: RDX: ffff99fe22d7cb58 RSI: 00000000004789f2 RDI: ffff9a01244e8ff0
Jan 29 12:58:04 GamingPC kernel: RBP: ffff99fe21215ba0 R08: ffffffffc2f10960 R09: ffff99fe21215af0
Jan 29 12:58:04 GamingPC kernel: R10: ffff99fe266a0008 R11: ffff99fe266a1098 R12: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R13: 0000000000000001 R14: 00000000beef0003 R15: ffff99fe21215c98
Jan 29 12:58:04 GamingPC kernel: FS: 0000000000000000(0000) GS:ffff9a0126c00000(0000) knlGS:0000000000000000
Jan 29 12:58:04 GamingPC kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 29 12:58:04 GamingPC kernel: CR2: 0000000000000008 CR3: 00000002f9dc0006 CR4: 00000000003726f0
Jan 29 12:58:04 GamingPC kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 29 12:58:04 GamingPC kernel: Call Trace:
Jan 29 12:58:04 GamingPC kernel: ? _nv000082rm+0x16c/0x1e0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv037896rm+0xc3/0x350 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv037895rm+0x63/0x80 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv012942rm+0x78/0xd0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv012942rm+0x1a/0xd0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv025630rm+0x251/0x3e0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv025579rm+0x1f/0xf0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv016770rm+0xd3/0x3c0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv028760rm+0xb23/0xdc0 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv028768rm+0x15d/0x400 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? _nv000710rm+0xa9/0x240 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? irq_forced_thread_fn+0x80/0x80
Jan 29 12:58:04 GamingPC kernel: ? rm_isr_bh+0x1c/0x60 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
Jan 29 12:58:04 GamingPC kernel: ? irq_thread_fn+0x20/0x60
Jan 29 12:58:04 GamingPC kernel: ? irq_thread+0x206/0x2b0
Jan 29 12:58:04 GamingPC kernel: ? irq_thread_fn+0x60/0x60
Jan 29 12:58:04 GamingPC kernel: ? irq_set_irq_wake+0x1b0/0x1b0
Jan 29 12:58:04 GamingPC kernel: ? kthread+0x181/0x1b0
Jan 29 12:58:04 GamingPC kernel: ? __kthread_init_worker+0x50/0x50
Jan 29 12:58:04 GamingPC kernel: ? ret_from_fork+0x22/0x30
Jan 29 12:58:04 GamingPC kernel: Modules linked in: macvtap macvlan vhost_net vhost vhost_iotlb tap tun snd_seq_dummy snd_hrtimer snd_seq snd_seq_device xt_CHECKSUM xt_MASQUERADE nvidia_drm(POE) nvidia_modeset(POE) ip6table_mangle ip6tab>
Jan 29 12:58:04 GamingPC kernel: sparse_keymap mxm_wmi crypto_simd cryptd snd_timer glue_helper rapl mei_me intel_cstate i2c_i801 intel_uncore cec pcspkr snd rfkill e1000e i2c_smbus syscopyarea nf_log_ipv6 sysfillrect sysimgblt ip6t_REJ>
Jan 29 12:58:04 GamingPC kernel: CR2: 0000000000000008
Jan 29 12:58:04 GamingPC kernel: ---[ end trace 3f17b3d029e863a8 ]---
Jan 29 12:58:04 GamingPC kernel: RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
Jan 29 12:58:04 GamingPC kernel: Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
Jan 29 12:58:04 GamingPC kernel: RSP: 0018:ffffabc341c13c48 EFLAGS: 00010246
Jan 29 12:58:04 GamingPC kernel: RAX: 0000000000000001 RBX: ffff99fe21215c18 RCX: 0000000000000010
Jan 29 12:58:04 GamingPC kernel: RDX: ffff99fe22d7cb58 RSI: 00000000004789f2 RDI: ffff9a01244e8ff0
Jan 29 12:58:04 GamingPC kernel: RBP: ffff99fe21215ba0 R08: ffffffffc2f10960 R09: ffff99fe21215af0
Jan 29 12:58:04 GamingPC kernel: R10: ffff99fe266a0008 R11: ffff99fe266a1098 R12: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R13: 0000000000000001 R14: 00000000beef0003 R15: ffff99fe21215c98
Jan 29 12:58:04 GamingPC kernel: FS: 0000000000000000(0000) GS:ffff9a0126c00000(0000) knlGS:0000000000000000
Jan 29 12:58:04 GamingPC kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 29 12:58:04 GamingPC kernel: CR2: 0000000000000008 CR3: 00000002f9dc0006 CR4: 00000000003726f0
Jan 29 12:58:04 GamingPC kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 29 12:58:04 GamingPC kernel: BUG: unable to handle page fault for address: ffffffffa04c9c34
Jan 29 12:58:04 GamingPC kernel: #PF: supervisor write access in kernel mode
Jan 29 12:58:04 GamingPC kernel: #PF: error_code(0x0003) - permissions violation
Jan 29 12:58:04 GamingPC kernel: PGD 3e7c15067 P4D 3e7c15067 PUD 3e7c16063 PMD 3e60000e1
Jan 29 12:58:04 GamingPC kernel: Oops: 0003 [#2] PREEMPT SMP PTI
Jan 29 12:58:04 GamingPC kernel: CPU: 0 PID: 611 Comm: irq/143-nvidia Tainted: P D OE 5.10.11-zen2-1-zen #1
Jan 29 12:58:04 GamingPC kernel: Hardware name: System manufacturer System Product Name/Z170I PRO GAMING, BIOS 3805 05/16/2018
Jan 29 12:58:04 GamingPC kernel: RIP: 0010:mutex_lock+0x10/0x20
Jan 29 12:58:04 GamingPC kernel: Code: 03 31 c0 c3 eb d4 0f 1f 40 00 0f 1f 44 00 00 be 02 00 00 00 e9 81 fa ff ff 90 0f 1f 44 00 00 31 c0 65 48 8b 14 25 c0 7b 01 00 <f0> 48 0f b1 17 75 01 c3 eb d6 66 0f 1f 44 00 00 0f 1f 44 00 00 41
Jan 29 12:58:04 GamingPC kernel: RSP: 0018:ffffabc341c13e28 EFLAGS: 00010246
Jan 29 12:58:04 GamingPC kernel: RAX: 0000000000000000 RBX: ffffffffa04c92dc RCX: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: RDX: ffff99fe24205dc0 RSI: ffffffffa04cae1c RDI: ffffffffa04c9c34
Jan 29 12:58:04 GamingPC kernel: RBP: ffffffffa04c9c34 R08: 0000000000000001 R09: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R10: ffff99fe1af35c00 R11: 0000000000000000 R12: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R13: 0000000000000001 R14: 0000000000000001 R15: ffff99fe24205dc0
Jan 29 12:58:04 GamingPC kernel: FS: 0000000000000000(0000) GS:ffff9a0126c00000(0000) knlGS:0000000000000000
Jan 29 12:58:04 GamingPC kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 29 12:58:04 GamingPC kernel: CR2: ffffffffa04c9c34 CR3: 00000002f9dc0006 CR4: 00000000003726f0
Jan 29 12:58:04 GamingPC kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 29 12:58:04 GamingPC kernel: Call Trace:
Jan 29 12:58:04 GamingPC kernel: perf_event_exit_task+0x35/0xae0
Jan 29 12:58:04 GamingPC kernel: ? task_work_run+0x5c/0x90
Jan 29 12:58:04 GamingPC kernel: ? task_work_run+0x5c/0x90
Jan 29 12:58:04 GamingPC kernel: do_exit+0x358/0xae0
Jan 29 12:58:04 GamingPC kernel: ? irq_set_irq_wake+0x1b0/0x1b0
Jan 29 12:58:04 GamingPC kernel: ? kthread+0x181/0x1b0
Jan 29 12:58:04 GamingPC kernel: rewind_stack_do_exit+0x17/0x17
Jan 29 12:58:04 GamingPC kernel: RIP: 0000:0x0
Jan 29 12:58:04 GamingPC kernel: Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
Jan 29 12:58:04 GamingPC kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: Modules linked in: macvtap macvlan vhost_net vhost vhost_iotlb tap tun snd_seq_dummy snd_hrtimer snd_seq snd_seq_device xt_CHECKSUM xt_MASQUERADE nvidia_drm(POE) nvidia_modeset(POE) ip6table_mangle ip6tab>
Jan 29 12:58:04 GamingPC kernel: sparse_keymap mxm_wmi crypto_simd cryptd snd_timer glue_helper rapl mei_me intel_cstate i2c_i801 intel_uncore cec pcspkr snd rfkill e1000e i2c_smbus syscopyarea nf_log_ipv6 sysfillrect sysimgblt ip6t_REJ>
Jan 29 12:58:04 GamingPC kernel: CR2: ffffffffa04c9c34
Jan 29 12:58:04 GamingPC kernel: ---[ end trace 3f17b3d029e863a9 ]---
Jan 29 12:58:04 GamingPC kernel: RIP: 0010:_nv013013rm+0xd8/0x130 [nvidia]
Jan 29 12:58:04 GamingPC kernel: Code: 40 00 31 c0 5b 41 5c 41 5d c3 0f 1f 84 00 00 00 00 00 48 c7 46 38 00 00 00 00 48 89 f7 45 31 ed e8 dd 7a ff ff eb 9b 0f 1f 00 <49> 8b 7c 24 08 e8 de 29 00 00 48 85 c0 74 b4 49 83 7c 24 08 00 74
Jan 29 12:58:04 GamingPC kernel: RSP: 0018:ffffabc341c13c48 EFLAGS: 00010246
Jan 29 12:58:04 GamingPC kernel: RAX: 0000000000000001 RBX: ffff99fe21215c18 RCX: 0000000000000010
Jan 29 12:58:04 GamingPC kernel: RDX: ffff99fe22d7cb58 RSI: 00000000004789f2 RDI: ffff9a01244e8ff0
Jan 29 12:58:04 GamingPC kernel: RBP: ffff99fe21215ba0 R08: ffffffffc2f10960 R09: ffff99fe21215af0
Jan 29 12:58:04 GamingPC kernel: R10: ffff99fe266a0008 R11: ffff99fe266a1098 R12: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: R13: 0000000000000001 R14: 00000000beef0003 R15: ffff99fe21215c98
Jan 29 12:58:04 GamingPC kernel: FS: 0000000000000000(0000) GS:ffff9a0126c00000(0000) knlGS:0000000000000000
Jan 29 12:58:04 GamingPC kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 29 12:58:04 GamingPC kernel: CR2: ffffffffffffffd6 CR3: 00000002f9dc0006 CR4: 00000000003726f0
Jan 29 12:58:04 GamingPC kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 29 12:58:04 GamingPC kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 29 12:58:04 GamingPC kernel: Fixing recursive fault but reboot is needed!
Come on Nvidia this has been happening for months!
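The `#PF: error_code(...)` lines in the logs in this thread carry three different values (0x0000, 0x0010 and 0x0003). A minimal sketch of how those bits map to the text the kernel prints, assuming the standard x86 page-fault error-code layout; `decode_pf_error_code` is an illustrative name, not a kernel function:

```python
def decode_pf_error_code(code):
    """Describe an x86 page-fault error code in roughly the kernel's wording."""
    if code & 0x10:        # bit 4: the fault happened on an instruction fetch
        access = "instruction fetch"
    elif code & 0x02:      # bit 1: write access (clear means read)
        access = "write access"
    else:
        access = "read access"
    mode = "user" if code & 0x04 else "supervisor"   # bit 2: CPU mode
    if not code & 0x01:    # bit 0 clear: the page was not present
        cause = "not-present page"
    elif code & 0x08:      # bit 3: reserved bit set in a page-table entry
        cause = "reserved bit violation"
    else:
        cause = "permissions violation"
    return f"{mode} {access}, {cause}"


if __name__ == "__main__":
    # The three codes that appear in the Oops logs in this thread.
    for code in (0x0000, 0x0010, 0x0003):
        print(f"0x{code:04x}: {decode_pf_error_code(code)}")
```

Read this way, only the first Oops (0x0000, a read from a not-present page at address 8) is the driver fault itself; the later ones occur while the kernel is tearing down the already-crashed IRQ thread.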
I can confirm that the bug still persists on Arch Linux with kernel 5.10.11 and nvidia 460.39.
crash.txt (11.9 KB)
Edit: What's the safest way of restarting the computer after one of these crashes? This is the second time I have experienced this while running chromium, and so far the only method that works for me is pressing the power button. These crashes also seem to coincide with when I started running chromium without the "--use-gl=desktop" flag because it was causing trouble with playback speeds on YouTube.
abelits
February 1, 2021, 7:18am
155
ssh to it from another computer, and reboot from there.
It doesn't even allow me to do that :(
I mean, I can ssh into the PC and run shutdown, but it just kills the ssh session and doesn't restart the PC.
Tried: sudo shutdown -r now
and sudo shutdown now
Try using the SysRq key to give instructions to the kernel directly. That works for me.
More info: Magic SysRq key - Wikipedia
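If the keyboard is not usable but ssh still works, the same SysRq commands can be sent by writing single letters to /proc/sysrq-trigger as root. The following is a hedged sketch rather than a tested recovery procedure for this crash: `sysrq_reboot` is a made-up helper name, it defaults to a dry run, and nothing in this thread confirms that the sync and remount steps complete after the driver has crashed.

```python
import time

# Tail of the well-known "REISUB" sequence:
# s = sync filesystems, u = remount them read-only, b = reboot immediately.
SAFE_REBOOT_KEYS = ("s", "u", "b")


def sysrq_reboot(trigger_path="/proc/sysrq-trigger", keys=SAFE_REBOOT_KEYS,
                 delay=2.0, dry_run=True):
    """Hypothetical helper: send SysRq commands one letter at a time.

    With dry_run=True (the default) nothing is written; the function only
    returns the letters it would have sent. Writing for real requires root.
    """
    sent = []
    for key in keys:
        if not dry_run:
            with open(trigger_path, "w") as trigger:
                trigger.write(key)
            # Give sync/remount a moment before the next command.
            time.sleep(delay)
        sent.append(key)
    return sent


if __name__ == "__main__":
    # Dry run only: prints the letters that would be sent.
    print(sysrq_reboot())
```

To actually reboot, the call would be `sysrq_reboot(dry_run=False)` from a root shell; the final `b` reboots without any further shutdown steps, so it is a last resort after a normal `shutdown` has failed.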
amrits
February 1, 2021, 4:51pm
158
The fix is supposed to be integrated in the upcoming 460 driver release.
I will post here again once it is publicly available.
sparbtw
February 2, 2021, 6:23pm
160
It's very sad to know that NVIDIA has almost no interest in solving this problem...
By the way... I can confirm that today, 02/02/21, with kernel 5.10.11-arch1-1, I encountered the same problem (at least twice a day)
(Yes, it's the same journalctl log as the others... BUG: kernel NULL pointer dereference etc. etc.)...
Y'all know what Torvalds has said, which I'd like to say now. There is no need to specify.
Mart
February 2, 2021, 9:09pm
161
Well... no interest in reading?
sparbtw
February 3, 2021, 1:44pm
162
I said almost, does a bug take 5 months to be fixed?
In my home this means no interest.
I agree, they would have fixed it in a week if they wanted, so... Torvalds quote
Can we get a status update?
We still haven't been able to reproduce the problem even with targeted testing, but have made another change that should help that will be in the next release.
Without an in-house reproduction of the problem it's impossible to verify that it's actually fixed, so it's taking longer than usual for a fix to be released. I realize this situation is frustrating for everyone and apologize for the inconvenience.
ddimi
February 14, 2021, 7:56am
166
I installed the driver with "--no-unified-memory" and the problem disappeared.
Arch Linux,
Kernel 5.10.15 , nvidia 460.39
@ddimi, what's the most convenient way to achieve it on Arch? With or without dkms?