The following kernel warning, followed by a NULL pointer dereference oops, occurs intermittently when entering system suspend:
------------[ cut here ]------------
list_add corruption. prev is NULL.
WARNING: CPU: 0 PID: 11528 at lib/list_debug.c:25 __list_add_valid_or_report+0x42/0xa0
Modules linked in: hid_logitech_hidpp uhid rfcomm snd_seq_dummy snd_hrtimer snd_seq ccm cmac algif_hash algif_skcipher af_alg bnep snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device nvidia_drm(POE) nvidia_uvm(POE) nvidia_modeset(POE) btusb btrtl btintel uvcvideo btbcm videobuf2_vmalloc btmtk uvc videobuf2_memops videobuf2_v4l2 bluetooth videodev videobuf2_common ecdh_generic mc crc16 intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_soc_avs snd_hda_codec_hdmi snd_soc_hda_codec iwlmvm snd_hda_ext_core snd_ctl_led kvm snd_hda_codec_realtek snd_soc_core mac80211 snd_hda_codec_generic irqbypass crct10dif_pclmul snd_compress crc32_pclmul ac97_bus joydev polyval_clmulni snd_pcm_dmaengine polyval_generic mousedev dell_rbtn libarc4 gf128mul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg sha512_ssse3 snd_intel_sdw_acpi aesni_intel dell_laptop snd_hda_codec crypto_simd snd_hda_core cryptd hid_multitouch dell_wmi snd_hwdep iTCO_wdt rapl nls_iso8859_1 iwlwifi
intel_pmc_bxt dell_smbios snd_pcm ee1004 processor_thermal_device_pci_legacy intel_cstate vfat iTCO_vendor_support dcdbas processor_thermal_device mei_wdt mei_pxp mei_hdcp fat intel_rapl_msr dell_smm_hwmon intel_uncore psmouse dell_wmi_descriptor ledtrig_audio wmi_bmof pcspkr intel_wmi_thunderbolt i2c_i801 processor_thermal_rfim snd_timer cfg80211 i2c_smbus processor_thermal_mbox snd intel_lpss_pci mei_me i2c_hid_acpi processor_thermal_rapl intel_lpss intel_rapl_common int3403_thermal int3400_thermal soundcore rfkill mei i2c_hid intel_hid idma64 intel_soc_dts_iosf intel_pch_thermal acpi_thermal_rel int340x_thermal_zone nvidia(POE) sparse_keymap acpi_pad mac_hid i2c_dev fuse crypto_user loop dm_mod ip_tables x_tables usbhid btrfs i915 blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw i2c_algo_bit atkbd rtsx_pci_sdmmc drm_buddy libps2 mmc_core vivaldi_fmap ttm nvme intel_gtt crc32c_intel mxm_wmi nvme_core drm_display_helper xhci_pci rtsx_pci nvme_common cec xhci_pci_renesas i8042 video serio
wmi
CPU: 0 PID: 11528 Comm: nvidia-sleep.sh Tainted: P IOE 6.6.1-arch1-1 #1 be166a630cd909acf8820643140e9106c6ea80e6
Hardware name: Dell Inc. Precision 5520/0X41RR, BIOS 1.18.0 11/17/2019
RIP: 0010:__list_add_valid_or_report+0x42/0xa0
Code: 75 41 4c 8b 02 49 39 c0 75 4c 48 39 fa 74 60 49 39 f8 74 5b b8 01 00 00 00 c3 cc cc cc cc 48 c7 c7 98 33 49 95 e8 7e 14 a6 ff <0f> 0b 31 c0 c3 cc cc cc cc 48 c7 c7 c0 33 49 95 e8 69 14 a6 ff 0f
RSP: 0018:ffffc900079d3ba8 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffffc900011d12b0 RCX: 0000000000000027
RDX: ffff88846e421708 RSI: 0000000000000001 RDI: ffff88846e421700
RBP: ffffc900079d3be0 R08: 0000000000000000 R09: ffffc900079d3a30
R10: 0000000000000003 R11: ffffffff95cca3c8 R12: 0000000000000246
R13: ffffc900011d12c0 R14: 0000000000000000 R15: ffff88811fb88000
FS: 00007f07fffd3740(0000) GS:ffff88846e400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0800226650 CR3: 00000004622fe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __list_add_valid_or_report+0x42/0xa0
? __warn+0x81/0x130
? __list_add_valid_or_report+0x42/0xa0
? report_bug+0x171/0x1a0
? prb_read_valid+0x1b/0x30
? handle_bug+0x3c/0x80
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? __list_add_valid_or_report+0x42/0xa0
? __list_add_valid_or_report+0x42/0xa0
_raw_q_schedule+0x3d/0xa0 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
nv_kthread_q_flush+0x7b/0x140 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
? __pfx__q_flush_function+0x10/0x10 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
uvm_suspend+0x9f/0x190 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
uvm_suspend_entry.part.0+0x4e/0xa0 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
? kmem_cache_free+0x22/0x3a0
nv_uvm_suspend+0x2e/0x50 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
nv_set_system_power_state+0x3bb/0x470 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
nv_procfs_write_suspend+0xe8/0x160 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
proc_reg_write+0x5a/0xa0
vfs_write+0xef/0x420
ksys_write+0x6f/0xf0
do_syscall_64+0x5d/0x90
? handle_mm_fault+0xa2/0x360
? do_user_addr_fault+0x30f/0x660
? exc_page_fault+0x7f/0x180
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f0800151034
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
RSP: 002b:00007ffc4486d4e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f0800151034
RDX: 0000000000000008 RSI: 000055d8112e65b0 RDI: 0000000000000001
RBP: 000055d8112e65b0 R08: 0000000000000410 R09: 0000000000000001
R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000008
R13: 00007f08002265c0 R14: 00007f0800223f20 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 11528 Comm: nvidia-sleep.sh Tainted: P W IOE 6.6.1-arch1-1 #1 be166a630cd909acf8820643140e9106c6ea80e6
Hardware name: Dell Inc. Precision 5520/0X41RR, BIOS 1.18.0 11/17/2019
RIP: 0010:__list_del_entry_valid_or_report+0x4/0xe0
Code: 48 89 c1 48 89 fe 48 c7 c7 88 34 49 95 e8 24 14 a6 ff 0f 0b eb a4 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 74 3e 48 85 c9 74 51 48 b8 00 01 00
RSP: 0018:ffffc900079d3b80 EFLAGS: 00010007
RAX: ffffc900011d12d0 RBX: 0000000000000000 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000292 RDI: 0000000000000000
RBP: ffffc900079d3be0 R08: 0000000000000000 R09: ffffc900079d3a30
R10: 0000000000000003 R11: ffffffff95cca3c8 R12: 0000000000000246
R13: ffffc900011d12c0 R14: 0000000000000000 R15: ffff88811fb88000
FS: 00007f07fffd3740(0000) GS:ffff88846e400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004622fe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die+0x23/0x70
? page_fault_oops+0x171/0x4e0
? __list_add_valid_or_report+0x42/0xa0
? __warn+0x9b/0x130
? exc_page_fault+0x7f/0x180
? asm_exc_page_fault+0x26/0x30
? __list_del_entry_valid_or_report+0x4/0xe0
__up.isra.0+0xe/0x50
up+0x44/0x60
_raw_q_schedule+0x64/0xa0 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
nv_kthread_q_flush+0x7b/0x140 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
? __pfx__q_flush_function+0x10/0x10 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
uvm_suspend+0x9f/0x190 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
uvm_suspend_entry.part.0+0x4e/0xa0 [nvidia_uvm 56e3a52a4ae3c6eebb72d1602e42807b69a9ce07]
? kmem_cache_free+0x22/0x3a0
nv_uvm_suspend+0x2e/0x50 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
nv_set_system_power_state+0x3bb/0x470 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
nv_procfs_write_suspend+0xe8/0x160 [nvidia 3ce5cfbe99895ad472b4c9f14570b8cea8f3f96a]
proc_reg_write+0x5a/0xa0
vfs_write+0xef/0x420
ksys_write+0x6f/0xf0
do_syscall_64+0x5d/0x90
? handle_mm_fault+0xa2/0x360
? do_user_addr_fault+0x30f/0x660
? exc_page_fault+0x7f/0x180
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f0800151034
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
RSP: 002b:00007ffc4486d4e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f0800151034
RDX: 0000000000000008 RSI: 000055d8112e65b0 RDI: 0000000000000001
RBP: 000055d8112e65b0 R08: 0000000000000410 R09: 0000000000000001
R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000008
R13: 00007f08002265c0 R14: 00007f0800223f20 R15: 0000000000000000
</TASK>
Modules linked in: hid_logitech_hidpp uhid rfcomm snd_seq_dummy snd_hrtimer snd_seq ccm cmac algif_hash algif_skcipher af_alg bnep snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device nvidia_drm(POE) nvidia_uvm(POE) nvidia_modeset(POE) btusb btrtl btintel uvcvideo btbcm videobuf2_vmalloc btmtk uvc videobuf2_memops videobuf2_v4l2 bluetooth videodev videobuf2_common ecdh_generic mc crc16 intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_soc_avs snd_hda_codec_hdmi snd_soc_hda_codec iwlmvm snd_hda_ext_core snd_ctl_led kvm snd_hda_codec_realtek snd_soc_core mac80211 snd_hda_codec_generic irqbypass crct10dif_pclmul snd_compress crc32_pclmul ac97_bus joydev polyval_clmulni snd_pcm_dmaengine polyval_generic mousedev dell_rbtn libarc4 gf128mul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg sha512_ssse3 snd_intel_sdw_acpi aesni_intel dell_laptop snd_hda_codec crypto_simd snd_hda_core cryptd hid_multitouch dell_wmi snd_hwdep iTCO_wdt rapl nls_iso8859_1 iwlwifi
intel_pmc_bxt dell_smbios snd_pcm ee1004 processor_thermal_device_pci_legacy intel_cstate vfat iTCO_vendor_support dcdbas processor_thermal_device mei_wdt mei_pxp mei_hdcp fat intel_rapl_msr dell_smm_hwmon intel_uncore psmouse dell_wmi_descriptor ledtrig_audio wmi_bmof pcspkr intel_wmi_thunderbolt i2c_i801 processor_thermal_rfim snd_timer cfg80211 i2c_smbus processor_thermal_mbox snd intel_lpss_pci mei_me i2c_hid_acpi processor_thermal_rapl intel_lpss intel_rapl_common int3403_thermal int3400_thermal soundcore rfkill mei i2c_hid intel_hid idma64 intel_soc_dts_iosf intel_pch_thermal acpi_thermal_rel int340x_thermal_zone nvidia(POE) sparse_keymap acpi_pad mac_hid i2c_dev fuse crypto_user loop dm_mod ip_tables x_tables usbhid btrfs i915 blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw i2c_algo_bit atkbd rtsx_pci_sdmmc drm_buddy libps2 mmc_core vivaldi_fmap ttm nvme intel_gtt crc32c_intel mxm_wmi nvme_core drm_display_helper xhci_pci rtsx_pci nvme_common cec xhci_pci_renesas i8042 video serio
wmi
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:__list_del_entry_valid_or_report+0x4/0xe0
Code: 48 89 c1 48 89 fe 48 c7 c7 88 34 49 95 e8 24 14 a6 ff 0f 0b eb a4 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 74 3e 48 85 c9 74 51 48 b8 00 01 00
RSP: 0018:ffffc900079d3b80 EFLAGS: 00010007
RAX: ffffc900011d12d0 RBX: 0000000000000000 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000292 RDI: 0000000000000000
RBP: ffffc900079d3be0 R08: 0000000000000000 R09: ffffc900079d3a30
R10: 0000000000000003 R11: ffffffff95cca3c8 R12: 0000000000000246
R13: ffffc900011d12c0 R14: 0000000000000000 R15: ffff88811fb88000
FS: 00007f07fffd3740(0000) GS:ffff88846e400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004622fe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
note: nvidia-sleep.sh[11528] exited with irqs disabled
note: nvidia-sleep.sh[11528] exited with preempt_count 1
PM: suspend entry (deep)
Filesystems sync: 0.054 seconds
I have the module option NVreg_DynamicPowerManagement=0x02 set; no other driver options are set.
The system can still be interacted with after this happens, but many other drivers (networking, etc.) stop working and a reboot is required.
The workaround is to blacklist nvidia-uvm. The driver version is 545.29.02.
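For anyone wanting to try the same workaround, here is a minimal sketch of a modprobe configuration that blacklists the module. The file name is only an example (any `.conf` file under `/etc/modprobe.d/` should be picked up), and the commented-out `install` line is an optional stricter variant, not something from my setup:

```
# /etc/modprobe.d/disable-nvidia-uvm.conf  (hypothetical file name)
# Prevent the UVM module from being autoloaded via module aliases.
blacklist nvidia_uvm

# Optional, stricter: make explicit "modprobe nvidia_uvm" requests fail as well.
# install nvidia_uvm /bin/false
```

Keep in mind that CUDA workloads generally depend on nvidia-uvm, so this is only practical if the GPU is used for display and graphics alone.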
nvidia-bug-report.log.gz (714.0 KB)