NVIDIA driver (570.144) crashes on suspend and freezes the whole system

Hi,
I use CachyOS Linux on a PC and I can’t use the sleep feature: as soon as suspend is initiated, the NVIDIA driver crashes, the screens turn off, and the whole system freezes, no longer reacting to anything except the SysRq keys.
I am using Wayland and KDE with a multi-monitor setup.
Card: NVIDIA GeForce GTX 970

máj 19 20:31:44 cachyos-x8664 systemd[1]: Reached target Sleep.
máj 19 20:31:44 cachyos-x8664 systemd[1]: Starting NVIDIA system suspend actions...
máj 19 20:31:44 cachyos-x8664 suspend[814258]: nvidia-suspend.service
máj 19 20:31:44 cachyos-x8664 logger[814258]: <13>May 19 20:31:44 suspend: nvidia-suspend.service
máj 19 20:31:44 cachyos-x8664 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel: fbcon: Taking over console
máj 19 20:31:44 cachyos-x8664 kernel: #PF: supervisor read access in kernel mode
máj 19 20:31:44 cachyos-x8664 kernel: #PF: error_code(0x0000) - not-present page
máj 19 20:31:44 cachyos-x8664 kernel: PGD 0 P4D 0 
máj 19 20:31:44 cachyos-x8664 kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
máj 19 20:31:44 cachyos-x8664 kernel: CPU: 5 UID: 0 PID: 814260 Comm: nvidia-sleep.sh Tainted: P           OE      6.14.6-2-cachyos #1 fa016bde76e6b659f51ba29fad589bba022e5469
máj 19 20:31:44 cachyos-x8664 kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
máj 19 20:31:44 cachyos-x8664 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350M Pro4, BIOS P6.60 07/27/2020
máj 19 20:31:44 cachyos-x8664 kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]
máj 19 20:31:44 cachyos-x8664 kernel: Code: 38 4c 8d 77 10 4c 89 f7 e8 a5 c6 c6 c4 48 8b 4c 24 20 4c 39 f9 75 67 48 8b 73 08 4c 39 fe 0f 95 c1 49 39 df 74 65 84 c9 74 61 <48> 39 1e 75 5c 4c 89 7b 08 48 89 5c 24 20 48 89 74 24 28 4c 89 3e
máj 19 20:31:44 cachyos-x8664 kernel: RSP: 0018:ffff9f692038f9d8 EFLAGS: 00010002
máj 19 20:31:44 cachyos-x8664 kernel: RAX: 0000000000000296 RBX: ffff9f691b7b82f0 RCX: ffff9f692038f901
máj 19 20:31:44 cachyos-x8664 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f691b7b8300
máj 19 20:31:44 cachyos-x8664 kernel: RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000002
máj 19 20:31:44 cachyos-x8664 kernel: R10: 0000000000000001 R11: 00000000000bea67 R12: 00000000000002e8
máj 19 20:31:44 cachyos-x8664 kernel: R13: ffff9f691b7b8008 R14: ffff9f691b7b8300 R15: ffff9f692038f9f8
máj 19 20:31:44 cachyos-x8664 kernel: FS:  000072fae4b5ab80(0000) GS:ffff925a5ee80000(0000) knlGS:0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
máj 19 20:31:44 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 00000002d0051000 CR4: 0000000000350ef0
máj 19 20:31:44 cachyos-x8664 kernel: Call Trace:
máj 19 20:31:44 cachyos-x8664 kernel:  <TASK>
máj 19 20:31:44 cachyos-x8664 kernel:  ? __pfx__q_flush_function+0x10/0x10 [nvidia_uvm f2407e734c3f88536c840e304ab2e0b6ff9d863e]
máj 19 20:31:44 cachyos-x8664 kernel:  nv_kthread_q_flush+0x18/0x70 [nvidia_uvm f2407e734c3f88536c840e304ab2e0b6ff9d863e]
máj 19 20:31:44 cachyos-x8664 kernel:  uvm_suspend+0x17b/0x1a0 [nvidia_uvm f2407e734c3f88536c840e304ab2e0b6ff9d863e]
máj 19 20:31:44 cachyos-x8664 kernel:  uvm_suspend_entry+0xb7/0xf0 [nvidia_uvm f2407e734c3f88536c840e304ab2e0b6ff9d863e]
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? down+0x1c/0x3a
máj 19 20:31:44 cachyos-x8664 kernel:  nv_uvm_suspend+0x32/0x50 [nvidia aa9492d42131ba4ed9484947a3e9537d2b71fffc]
máj 19 20:31:44 cachyos-x8664 kernel:  nv_set_system_power_state+0x319/0x4d0 [nvidia aa9492d42131ba4ed9484947a3e9537d2b71fffc]
máj 19 20:31:44 cachyos-x8664 kernel:  nv_procfs_write_suspend+0x129/0x160 [nvidia aa9492d42131ba4ed9484947a3e9537d2b71fffc]
máj 19 20:31:44 cachyos-x8664 kernel:  proc_reg_write+0x57/0xb0
máj 19 20:31:44 cachyos-x8664 kernel:  __x64_sys_write+0x3e9/0x400
máj 19 20:31:44 cachyos-x8664 kernel:  do_syscall_64+0x85/0x134
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_syscall_64+0x91/0x134
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? syscall_exit_to_user_mode+0x34/0x9f
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_syscall_64+0x91/0x134
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_syscall_64+0x91/0x134
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? set_close_on_exec+0x33/0x60
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? ptep_set_access_flags+0x27/0x40
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_wp_page+0x7de/0x900
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? syscall_exit_to_user_mode+0x34/0x9f
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_syscall_64+0x91/0x134
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? __count_memcg_events+0x5a/0xd0
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? handle_mm_fault+0x5cf/0x850
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? do_user_addr_fault+0x1df/0x420
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
máj 19 20:31:44 cachyos-x8664 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
máj 19 20:31:44 cachyos-x8664 kernel: RIP: 0033:0x72fae48a34ef
máj 19 20:31:44 cachyos-x8664 kernel: Code: 04 00 00 00 48 8b 15 18 a8 16 00 64 89 02 48 c7 c2 ff ff ff ff 48 83 c4 10 48 89 d0 5b c3 0f 1f 44 00 00 48 8b 44 24 20 0f 05 <48> 63 d0 3d 00 f0 ff ff 77 0f 48 83 c4 10 48 89 d0 5b c3 66 0f 1f
máj 19 20:31:44 cachyos-x8664 kernel: RSP: 002b:00007fff968279f0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
máj 19 20:31:44 cachyos-x8664 kernel: RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 000072fae48a34ef
máj 19 20:31:44 cachyos-x8664 kernel: RDX: 0000000000000008 RSI: 00005935926f61c0 RDI: 0000000000000001
máj 19 20:31:44 cachyos-x8664 kernel: RBP: 00005935926f61c0 R08: 0000000000000000 R09: 0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
máj 19 20:31:44 cachyos-x8664 kernel: R13: 000072fae4a0f5c0 R14: 00005935926f61c0 R15: 0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel:  </TASK>
máj 19 20:31:44 cachyos-x8664 kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfkill nct6775 nct6775_core hwmon_vid vfat fat amd_atl intel_rapl_msr intel_rapl_common kvm_amd snd_hda_codec_realtek ee1004 snd_hda_scodec_component kvm snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel irqbypass snd_intel_dspcfg polyval_clmulni snd_intel_sdw_acpi polyval_generic ghash_clmulni_intel snd_hda_codec sha512_ssse3 sha256_ssse3 snd_hda_core sha1_ssse3 aesni_intel snd_hwdep crypto_simd cryptd r8169 joydev mousedev snd_pcm realtek rapl wmi_bmof mdio_devres snd_timer pcspkr snd libphy i2c_piix4 k10temp gpio_amdpt acpi_cpufreq soundcore i2c_smbus ccp gpio_generic ip6t_REJECT nf_reject_ipv6 mac_hid xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables pkcs8_key_parser ntsync i2c_dev crypto_user dm_mod loop nfnetlink lz4 zram 842_decompress 842_compress lz4hc_compress lz4_compress ip_tables x_tables
máj 19 20:31:44 cachyos-x8664 kernel:  hid_generic nvme usbhid nvme_core nvme_auth nvidia_drm(POE) drm_ttm_helper ttm nvidia_uvm(POE) nvidia_modeset(POE) video wmi nvidia(POE)
máj 19 20:31:44 cachyos-x8664 kernel: CR2: 0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel: ---[ end trace 0000000000000000 ]---
máj 19 20:31:44 cachyos-x8664 kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]
máj 19 20:31:44 cachyos-x8664 kernel: Code: 38 4c 8d 77 10 4c 89 f7 e8 a5 c6 c6 c4 48 8b 4c 24 20 4c 39 f9 75 67 48 8b 73 08 4c 39 fe 0f 95 c1 49 39 df 74 65 84 c9 74 61 <48> 39 1e 75 5c 4c 89 7b 08 48 89 5c 24 20 48 89 74 24 28 4c 89 3e
máj 19 20:31:44 cachyos-x8664 kernel: RSP: 0018:ffff9f692038f9d8 EFLAGS: 00010002
máj 19 20:31:44 cachyos-x8664 kernel: RAX: 0000000000000296 RBX: ffff9f691b7b82f0 RCX: ffff9f692038f901
máj 19 20:31:44 cachyos-x8664 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f691b7b8300
máj 19 20:31:44 cachyos-x8664 kernel: RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000002
máj 19 20:31:44 cachyos-x8664 kernel: R10: 0000000000000001 R11: 00000000000bea67 R12: 00000000000002e8
máj 19 20:31:44 cachyos-x8664 kernel: R13: ffff9f691b7b8008 R14: ffff9f691b7b8300 R15: ffff9f692038f9f8
máj 19 20:31:44 cachyos-x8664 kernel: FS:  000072fae4b5ab80(0000) GS:ffff925a5ee80000(0000) knlGS:0000000000000000
máj 19 20:31:44 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
máj 19 20:31:44 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 00000002d0051000 CR4: 0000000000350ef0
máj 19 20:31:44 cachyos-x8664 kernel: note: nvidia-sleep.sh[814260] exited with irqs disabled
máj 19 20:31:44 cachyos-x8664 kernel: note: nvidia-sleep.sh[814260] exited with preempt_count 1
máj 19 20:31:44 cachyos-x8664 kernel: Console: switching to colour frame buffer device 240x67
máj 19 20:31:44 cachyos-x8664 systemd[1]: nvidia-suspend.service: Main process exited, code=killed, status=9/KILL
máj 19 20:31:44 cachyos-x8664 systemd[1]: nvidia-suspend.service: Failed with result 'signal'.
máj 19 20:31:44 cachyos-x8664 systemd[1]: Failed to start NVIDIA system suspend actions.
máj 19 20:31:44 cachyos-x8664 systemd[1]: Starting System Suspend...
máj 19 20:31:44 cachyos-x8664 systemd[1]: systemd-vconsole-setup.service: Deactivated successfully.
máj 19 20:31:44 cachyos-x8664 systemd[1]: Stopped Virtual Console Setup.
máj 19 20:31:44 cachyos-x8664 systemd[1]: Stopping Virtual Console Setup...
máj 19 20:31:44 cachyos-x8664 systemd[1]: Starting Virtual Console Setup...
máj 19 20:31:44 cachyos-x8664 systemd-sleep[814286]: User sessions remain unfrozen on explicit request ($SYSTEMD_SLEEP_FREEZE_USER_SESSIONS=0).
máj 19 20:31:44 cachyos-x8664 systemd-sleep[814286]: This is not recommended, and might result in unexpected behavior, particularly
máj 19 20:31:44 cachyos-x8664 systemd-sleep[814286]: in suspend-then-hibernate operations or setups with encrypted home directories.
máj 19 20:31:44 cachyos-x8664 systemd-sleep[814286]: Performing sleep operation 'suspend'...
máj 19 20:31:44 cachyos-x8664 kernel: PM: suspend entry (deep)
máj 19 20:31:44 cachyos-x8664 systemd[1]: Finished Virtual Console Setup.
máj 19 20:31:44 cachyos-x8664 kernel: Filesystems sync: 0.067 seconds
máj 19 20:31:44 cachyos-x8664 kernel: Freezing user space processes
máj 19 20:31:44 cachyos-x8664 kernel: Freezing user space processes completed (elapsed 0.002 seconds)
máj 19 20:31:44 cachyos-x8664 kernel: OOM killer disabled.
máj 19 20:31:44 cachyos-x8664 kernel: Freezing remaining freezable tasks
máj 19 20:31:44 cachyos-x8664 kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
máj 19 20:31:44 cachyos-x8664 kernel: printk: Suspending console(s) (use no_console_suspend to debug)
máj 19 20:31:44 cachyos-x8664 kernel: NVRM: GPU 0000:07:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
máj 19 20:31:44 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: pci_pm_suspend(): nv_pmops_suspend [nvidia] returns -5
máj 19 20:31:44 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: dpm_run_callback(): pci_pm_suspend.llvm.723607674040628562 returns -5
máj 19 20:31:44 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: failed to suspend async: error -5
máj 19 20:31:44 cachyos-x8664 kernel: sd 4:0:0:0: [sdc] Synchronizing SCSI cache
máj 19 20:31:44 cachyos-x8664 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
máj 19 20:31:44 cachyos-x8664 kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
máj 19 20:31:44 cachyos-x8664 kernel: PM: Some devices failed to suspend, or early wake event detected
máj 19 20:31:44 cachyos-x8664 kernel: nvme nvme0: 12/0/0 default/read/poll queues
máj 19 20:31:44 cachyos-x8664 kernel: OOM killer enabled.
máj 19 20:31:44 cachyos-x8664 kernel: Restarting tasks ... done.
máj 19 20:31:44 cachyos-x8664 kernel: random: crng reseeded on system resumption
máj 19 20:31:44 cachyos-x8664 kernel: PM: suspend exit
máj 19 20:31:44 cachyos-x8664 kernel: PM: suspend entry (s2idle)
máj 19 20:31:45 cachyos-x8664 kernel: Filesystems sync: 0.014 seconds
máj 19 20:31:45 cachyos-x8664 kernel: Freezing user space processes
máj 19 20:31:45 cachyos-x8664 kernel: Freezing user space processes completed (elapsed 0.001 seconds)
máj 19 20:31:45 cachyos-x8664 kernel: OOM killer disabled.
máj 19 20:31:45 cachyos-x8664 kernel: Freezing remaining freezable tasks
máj 19 20:31:45 cachyos-x8664 kernel: Freezing remaining freezable tasks completed (elapsed 0.150 seconds)
máj 19 20:31:45 cachyos-x8664 kernel: printk: Suspending console(s) (use no_console_suspend to debug)
máj 19 20:31:45 cachyos-x8664 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
máj 19 20:31:45 cachyos-x8664 kernel: sd 4:0:0:0: [sdc] Synchronizing SCSI cache
máj 19 20:31:45 cachyos-x8664 kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
máj 19 20:31:45 cachyos-x8664 kernel: ata5.00: Entering standby power mode
máj 19 20:31:45 cachyos-x8664 kernel: ata1.00: Entering standby power mode
máj 19 20:31:45 cachyos-x8664 kernel: ata2.00: Entering standby power mode
máj 19 20:31:45 cachyos-x8664 kernel: NVRM: GPU 0000:07:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
máj 19 20:31:45 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: pci_pm_suspend(): nv_pmops_suspend [nvidia] returns -5
máj 19 20:31:45 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: dpm_run_callback(): pci_pm_suspend.llvm.723607674040628562 returns -5
máj 19 20:31:45 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: failed to suspend async: error -5
máj 19 20:31:45 cachyos-x8664 kernel: ata6: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:45 cachyos-x8664 kernel: ata10: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:45 cachyos-x8664 kernel: ata9: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:45 cachyos-x8664 kernel: PM: Some devices failed to suspend, or early wake event detected
máj 19 20:31:45 cachyos-x8664 kernel: OOM killer enabled.
máj 19 20:31:45 cachyos-x8664 kernel: Restarting tasks ... done.
máj 19 20:31:45 cachyos-x8664 kernel: random: crng reseeded on system resumption
máj 19 20:31:45 cachyos-x8664 kernel: PM: suspend exit
máj 19 20:31:45 cachyos-x8664 systemd-sleep[814286]: Failed to put system to sleep. System resumed again: Input/output error
máj 19 20:31:45 cachyos-x8664 kernel: ata10: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:45 cachyos-x8664 kernel: ata6: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:45 cachyos-x8664 kernel: ata9: SATA link down (SStatus 0 SControl 300)
máj 19 20:31:46 cachyos-x8664 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
máj 19 20:31:46 cachyos-x8664 kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
máj 19 20:31:46 cachyos-x8664 kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
máj 19 20:31:46 cachyos-x8664 kernel: sd 0:0:0:0: [sda] Starting disk
máj 19 20:31:46 cachyos-x8664 kernel: ata1.00: configured for UDMA/133
máj 19 20:31:46 cachyos-x8664 kernel: ata1.00: Entering active power mode
máj 19 20:31:46 cachyos-x8664 kernel: sd 4:0:0:0: [sdc] Starting disk
máj 19 20:31:46 cachyos-x8664 kernel: sd 1:0:0:0: [sdb] Starting disk
máj 19 20:31:46 cachyos-x8664 kernel: ata5.00: configured for UDMA/133
máj 19 20:31:46 cachyos-x8664 kernel: ata2.00: configured for UDMA/133
máj 19 20:31:46 cachyos-x8664 kernel: ata2.00: Entering active power mode
máj 19 20:31:51 cachyos-x8664 systemd-resolved[485]: Switching to fallback DNS server 1.1.1.1#cloudflare-dns.com.
máj 19 20:31:55 cachyos-x8664 systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
máj 19 20:31:57 cachyos-x8664 brave[1605]: [1605:1622:0519/203157.274702:ERROR:components/sync/engine/get_updates_processor.cc:267] PostClientToServerMessage() failed during GetUpdates with error Network error (ERR_INTERNET_DISCONNECTED)
máj 19 20:32:51 cachyos-x8664 systemd[1]: Received SIGINT.
máj 19 20:32:51 cachyos-x8664 systemd[1]: Requested transaction contradicts existing jobs: Transaction for reboot.target/start is destructive (restart.fancontrol-norbi.service has 'start' job queued, but 'stop' is included in transaction).
máj 19 20:32:51 cachyos-x8664 systemd[1]: Failed to enqueue replace-irreversibly job for ctrl-alt-del.target: Transaction for reboot.target/start is destructive (restart.fancontrol-norbi.service has 'start' job queued, but 'stop' is included in transaction).
máj 19 20:32:51 cachyos-x8664 systemd[1]: Received SIGINT.
máj 19 20:32:51 cachyos-x8664 systemd[1]: Requested transaction contradicts existing jobs: Transaction for reboot.target/start is destructive (restart.fancontrol-norbi.service has 'start' job queued, but 'stop' is included in transaction).
máj 19 20:32:51 cachyos-x8664 systemd[1]: Failed to enqueue replace-irreversibly job for ctrl-alt-del.target: Transaction for reboot.target/start is destructive (restart.fancontrol-norbi.service has 'start' job queued, but 'stop' is included in transaction).
máj 19 20:32:58 cachyos-x8664 wireplumber[962]: spa.bluez5: BlueZ system service is not available
máj 19 20:33:12 cachyos-x8664 brave[1605]: [1605:1622:0519/203312.283314:ERROR:components/sync/engine/get_updates_processor.cc:267] PostClientToServerMessage() failed during GetUpdates with error Network error (ERR_INTERNET_DISCONNECTED)
máj 19 20:33:15 cachyos-x8664 systemd[1]: systemd-suspend.service: Main process exited, code=exited, status=1/FAILURE
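
For anyone comparing traces like the one above across driver or kernel versions, here is a minimal Python sketch that pulls the faulting symbol and the reliable call-trace frames out of a journal excerpt. The function name `summarize_oops` and the regular expressions are my own illustration and only assume the line layout visible in this log; they are not part of any NVIDIA or kernel tool.

```python
import re

# Kernel-mode RIP line, e.g.
#   "kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]"
# The user-mode RIP line ("RIP: 0033:0x...") is deliberately not matched.
RIP_RE = re.compile(
    r"kernel: RIP: 0010:(?P<symbol>[^\s+]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)

# Reliable call-trace frame, e.g.
#   "kernel:  uvm_suspend+0x17b/0x1a0 [nvidia_uvm f2407e...]"
# Frames prefixed with "?" are unreliable guesses and do not match.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)[^\]]*\])?"
)


def summarize_oops(lines):
    """Return ((symbol, module), [(symbol, module), ...]) for one Oops.

    The first element is the faulting function from the first kernel-mode
    RIP line (None if there is none); the second is the list of reliable
    frames in the order they appear. module is None for core kernel code.
    """
    fault = None
    frames = []
    for line in lines:
        rip = RIP_RE.search(line)
        if rip:
            # The RIP line is printed again after "end trace"; keep the first.
            if fault is None:
                fault = (rip.group("symbol"), rip.group("module"))
            continue
        frame = FRAME_RE.search(line)
        if frame:
            frames.append((frame.group("symbol"), frame.group("module")))
    return fault, frames
```

A possible way to use it: save the kernel messages of the crashed boot with `journalctl -k -b -1 > journal.txt` (the file name is just an example) and call `summarize_oops(open("journal.txt"))`. The sketch assumes a single Oops per excerpt; with several, the frames of all of them end up in one list.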

The only related kernel parameter I am using is nvidia_drm.fbdev=1, which I tried for another (KFENCE) driver crash, but it didn’t help with either issue: https://forums.developer.nvidia.com/t/bug-kfence-use-after-free-read-in-nv000177kms-nvidia-modeset/325691/18

I also see a garbage screen on every boot, after the first graphical screen of CachyOS Linux (the one with the spinner) but before the KDE loading screen and the actual desktop.

nvidia-bug-report.log.gz (1.3 MB)

Edit:
Kernel module parameters that I have already tried and that didn’t help:
NVreg_PreserveVideoMemoryAllocations=1
nvidia_drm modeset=1
nvidia_drm.fbdev=1
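
When testing parameters like these, it can help to confirm that the loaded modules actually picked them up. Below is a minimal Python sketch for that. It assumes the driver exposes its settings as `Name: value` lines in `/proc/driver/nvidia/params` and the `nvidia_drm` options as files under `/sys/module/nvidia_drm/parameters`; the function names are hypothetical, and reading the sysfs files may require root.

```python
from pathlib import Path


def parse_nvidia_params(text):
    """Parse 'Name: value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            params[name.strip()] = value.strip()
    return params


def read_effective_params(
    proc_params=Path("/proc/driver/nvidia/params"),
    drm_dir=Path("/sys/module/nvidia_drm/parameters"),
):
    """Collect the values the loaded modules report for the three options.

    Paths are assumptions about where the driver exposes its settings;
    anything missing or unreadable is simply left out of the result.
    """
    result = {}
    try:
        nv = parse_nvidia_params(proc_params.read_text())
        result["PreserveVideoMemoryAllocations"] = nv.get(
            "PreserveVideoMemoryAllocations"
        )
    except OSError:
        pass  # module not loaded, or file not readable
    for name in ("modeset", "fbdev"):
        try:
            value = (drm_dir / name).read_text().strip()
        except OSError:
            continue  # parameter absent, or needs root to read
        result[f"nvidia_drm.{name}"] = value
    return result
```

Calling `print(read_effective_params())` on the affected machine shows which of the three options are in effect; an empty dict means none of the files could be read.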

The issue is still present with driver version 575.64.

Hi @norbi78, thanks for reporting this issue.
I will try to reproduce this issue locally and file a bug for this.

Thank you! It occurs on every suspend with my GTX 970 card, so I can’t use suspend at all. Only the kernel SysRq keys can reboot the system from this state.

The issue that blocks system sleep still exists in driver version 575.64.03.

júl 14 14:38:34 cachyos-x8664 systemd[1]: Reached target Sleep.
júl 14 14:38:34 cachyos-x8664 systemd[1]: Starting NVIDIA system suspend actions...
júl 14 14:38:34 cachyos-x8664 suspend[4178]: nvidia-suspend.service
júl 14 14:38:34 cachyos-x8664 logger[4178]: <13>Jul 14 14:38:34 suspend: nvidia-suspend.service
júl 14 14:38:34 cachyos-x8664 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel: #PF: supervisor read access in kernel mode
júl 14 14:38:34 cachyos-x8664 kernel: #PF: error_code(0x0000) - not-present page
júl 14 14:38:34 cachyos-x8664 kernel: PGD 0 P4D 0 
júl 14 14:38:34 cachyos-x8664 kernel: Oops: Oops: 0000 [#1] SMP NOPTI
júl 14 14:38:34 cachyos-x8664 kernel: CPU: 4 UID: 0 PID: 4180 Comm: nvidia-sleep.sh Tainted: P           OE       6.15.6-2-cachyos #1 PREEMPT(full)  9d27375832d6f70505a6790bd3e58960ebf087b3
júl 14 14:38:34 cachyos-x8664 kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
júl 14 14:38:34 cachyos-x8664 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350M Pro4, BIOS P6.60 07/27/2020
júl 14 14:38:34 cachyos-x8664 kernel: RIP: 0010:nv_kthread_q_flush+0x9a/0x200 [nvidia_uvm]
júl 14 14:38:34 cachyos-x8664 kernel: Code: 24 18 48 c7 44 24 20 30 05 20 c7 4c 89 74 24 28 e8 db c7 bd da 48 89 c1 48 8b 44 24 10 4c 39 e0 0f 85 17 01 00 00 48 8b 73 08 <48> 39 1e 0f 85 d3 e1 0e 00 4c 39 e6 0f 84 ca e1 0e 00 48 89 74 24
júl 14 14:38:34 cachyos-x8664 kernel: RSP: 0018:ffffccf6dc11bc30 EFLAGS: 00010046
júl 14 14:38:34 cachyos-x8664 kernel: RAX: ffffccf6dc11bc40 RBX: ffffccf6dc48d2f0 RCX: 0000000000000246
júl 14 14:38:34 cachyos-x8664 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffccf6dc48d300
júl 14 14:38:34 cachyos-x8664 kernel: RBP: ffffccf6dc11bcb8 R08: ffffffffc741ecc0 R09: ffff890f1ef32000
júl 14 14:38:34 cachyos-x8664 kernel: R10: 0000000000000000 R11: 0000000000000004 R12: ffffccf6dc11bc40
júl 14 14:38:34 cachyos-x8664 kernel: R13: 0000000000000000 R14: ffffccf6dc11bc68 R15: ffffccf6dc11bc78
júl 14 14:38:34 cachyos-x8664 kernel: FS:  00007f2cb9ed8b80(0000) GS:ffff890f7b3d9000(0000) knlGS:0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
júl 14 14:38:34 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 00000002832f5000 CR4: 0000000000350ef0
júl 14 14:38:34 cachyos-x8664 kernel: Call Trace:
júl 14 14:38:34 cachyos-x8664 kernel:  <TASK>
júl 14 14:38:34 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 14 14:38:34 cachyos-x8664 kernel:  ? _nv051534rm+0x10/0x20 [nvidia 5d4b659a19408af46689da88b5908bf82ba704d2]
júl 14 14:38:34 cachyos-x8664 kernel:  ? __pfx__q_flush_function+0x10/0x10 [nvidia_uvm c9e413bfc5fb5aca0589c7bdb8a900849266eccb]
júl 14 14:38:34 cachyos-x8664 kernel:  uvm_suspend.isra.0+0xba/0x180 [nvidia_uvm c9e413bfc5fb5aca0589c7bdb8a900849266eccb]
júl 14 14:38:34 cachyos-x8664 kernel:  uvm_suspend_entry+0x7e/0xa0 [nvidia_uvm c9e413bfc5fb5aca0589c7bdb8a900849266eccb]
júl 14 14:38:34 cachyos-x8664 kernel:  nv_uvm_suspend+0x31/0x50 [nvidia 5d4b659a19408af46689da88b5908bf82ba704d2]
júl 14 14:38:34 cachyos-x8664 kernel:  nv_set_system_power_state+0x3d8/0x5b0 [nvidia 5d4b659a19408af46689da88b5908bf82ba704d2]
júl 14 14:38:34 cachyos-x8664 kernel:  nv_procfs_write_suspend+0x196/0x280 [nvidia 5d4b659a19408af46689da88b5908bf82ba704d2]
júl 14 14:38:34 cachyos-x8664 kernel:  ? __unregister_chrdev+0x80/0xe0
júl 14 14:38:34 cachyos-x8664 kernel:  proc_reg_write+0x42/0xa0
júl 14 14:38:34 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 14 14:38:34 cachyos-x8664 kernel:  vfs_write+0x11b/0x4f0
júl 14 14:38:34 cachyos-x8664 kernel:  __x64_sys_write+0x70/0xe0
júl 14 14:38:34 cachyos-x8664 kernel:  do_syscall_64+0x7b/0x810
júl 14 14:38:34 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 14 14:38:34 cachyos-x8664 kernel:  ? do_user_addr_fault+0x21c/0x910
júl 14 14:38:34 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 14 14:38:34 cachyos-x8664 kernel:  ? exc_page_fault+0x81/0x190
júl 14 14:38:34 cachyos-x8664 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
júl 14 14:38:34 cachyos-x8664 kernel: RIP: 0033:0x7f2cb9ca34ef
júl 14 14:38:34 cachyos-x8664 kernel: Code: 04 00 00 00 48 8b 15 18 a8 16 00 64 89 02 48 c7 c2 ff ff ff ff 48 83 c4 10 48 89 d0 5b c3 0f 1f 44 00 00 48 8b 44 24 20 0f 05 <48> 63 d0 3d 00 f0 ff ff 77 0f 48 83 c4 10 48 89 d0 5b c3 66 0f 1f
júl 14 14:38:34 cachyos-x8664 kernel: RSP: 002b:00007ffea74efb10 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
júl 14 14:38:34 cachyos-x8664 kernel: RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f2cb9ca34ef
júl 14 14:38:34 cachyos-x8664 kernel: RDX: 0000000000000008 RSI: 000055ee389e73e0 RDI: 0000000000000001
júl 14 14:38:34 cachyos-x8664 kernel: RBP: 000055ee389e73e0 R08: 0000000000000000 R09: 0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
júl 14 14:38:34 cachyos-x8664 kernel: R13: 00007f2cb9e0f5c0 R14: 000055ee389e73e0 R15: 0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel:  </TASK>
júl 14 14:38:34 cachyos-x8664 kernel: Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_localio sunrpc netfs tcp_dctcp tcp_htcp tcp_bbr snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfkill nct6775 nct6775_core hwmon_vid amd_atl intel_rapl_msr intel_rapl_common vfat fat snd_hda_codec_realtek kvm_amd snd_hda_codec_generic ee1004 snd_hda_scodec_component snd_hda_codec_hdmi kvm irqbypass polyval_clmulni snd_hda_intel polyval_generic snd_intel_dspcfg ghash_clmulni_intel sha512_ssse3 snd_intel_sdw_acpi sha256_ssse3 snd_hda_codec sha1_ssse3 aesni_intel snd_hda_core crypto_simd r8169 cryptd snd_hwdep rapl snd_pcm joydev wmi_bmof mousedev realtek pcspkr acpi_cpufreq mdio_devres snd_timer snd i2c_piix4 k10temp libphy soundcore i2c_smbus gpio_amdpt ccp gpio_generic mac_hid ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables pkcs8_key_parser ntsync i2c_dev
júl 14 14:38:34 cachyos-x8664 kernel:  crypto_user dm_mod loop nfnetlink lz4 zram 842_decompress 842_compress lz4hc_compress lz4_compress ip_tables x_tables nvme nvme_core nvme_keyring nvme_auth nvidia_drm(POE) drm_ttm_helper ttm nvidia_uvm(POE) nvidia_modeset(POE) video wmi nvidia(POE)
júl 14 14:38:34 cachyos-x8664 kernel: CR2: 0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel: ---[ end trace 0000000000000000 ]---
júl 14 14:38:34 cachyos-x8664 kernel: RIP: 0010:nv_kthread_q_flush+0x9a/0x200 [nvidia_uvm]
júl 14 14:38:34 cachyos-x8664 kernel: Code: 24 18 48 c7 44 24 20 30 05 20 c7 4c 89 74 24 28 e8 db c7 bd da 48 89 c1 48 8b 44 24 10 4c 39 e0 0f 85 17 01 00 00 48 8b 73 08 <48> 39 1e 0f 85 d3 e1 0e 00 4c 39 e6 0f 84 ca e1 0e 00 48 89 74 24
júl 14 14:38:34 cachyos-x8664 kernel: RSP: 0018:ffffccf6dc11bc30 EFLAGS: 00010046
júl 14 14:38:34 cachyos-x8664 kernel: RAX: ffffccf6dc11bc40 RBX: ffffccf6dc48d2f0 RCX: 0000000000000246
júl 14 14:38:34 cachyos-x8664 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffccf6dc48d300
júl 14 14:38:34 cachyos-x8664 kernel: RBP: ffffccf6dc11bcb8 R08: ffffffffc741ecc0 R09: ffff890f1ef32000
júl 14 14:38:34 cachyos-x8664 kernel: R10: 0000000000000000 R11: 0000000000000004 R12: ffffccf6dc11bc40
júl 14 14:38:34 cachyos-x8664 kernel: R13: 0000000000000000 R14: ffffccf6dc11bc68 R15: ffffccf6dc11bc78
júl 14 14:38:34 cachyos-x8664 kernel: FS:  00007f2cb9ed8b80(0000) GS:ffff890f7b3d9000(0000) knlGS:0000000000000000
júl 14 14:38:34 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
júl 14 14:38:34 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 00000002832f5000 CR4: 0000000000350ef0
júl 14 14:38:34 cachyos-x8664 kernel: note: nvidia-sleep.sh[4180] exited with irqs disabled
júl 14 14:38:34 cachyos-x8664 kernel: note: nvidia-sleep.sh[4180] exited with preempt_count 1
júl 14 14:38:34 cachyos-x8664 kernel: PM: suspend entry (deep)
júl 14 14:38:34 cachyos-x8664 systemd[1]: nvidia-suspend.service: Main process exited, code=killed, status=9/KILL
júl 14 14:38:34 cachyos-x8664 systemd[1]: nvidia-suspend.service: Failed with result 'signal'.
júl 14 14:38:34 cachyos-x8664 systemd[1]: Failed to start NVIDIA system suspend actions.
júl 14 14:38:34 cachyos-x8664 systemd[1]: Starting System Suspend...
júl 14 14:38:34 cachyos-x8664 systemd-sleep[4210]: User sessions remain unfrozen on explicit request ($SYSTEMD_SLEEP_FREEZE_USER_SESSIONS=0).
júl 14 14:38:34 cachyos-x8664 systemd-sleep[4210]: This is not recommended, and might result in unexpected behavior, particularly
júl 14 14:38:34 cachyos-x8664 systemd-sleep[4210]: in suspend-then-hibernate operations or setups with encrypted home directories.
júl 14 14:38:34 cachyos-x8664 systemd-sleep[4210]: Performing sleep operation 'suspend'...

This also happened to me with driver 575.64.05 and an RTX 4080 on a laptop.

I can confirm that the bug blocking system suspend still exists in version 575.64.05.

I’m starting to think that I will switch to an AMD card sooner than they fix this bug…

I tried it with the LTS kernel (6.12.36-2-cachyos-lts-lto) and it’s pretty much the same bug.

júl 30 19:25:52 cachyos-x8664 systemd[1]: Reached target Sleep.
júl 30 19:25:52 cachyos-x8664 systemd[1]: Starting NVIDIA system suspend actions...
júl 30 19:25:52 cachyos-x8664 suspend[1908]: nvidia-suspend.service
júl 30 19:25:52 cachyos-x8664 logger[1908]: <13>Jul 30 19:25:52 suspend: nvidia-suspend.service
júl 30 19:25:52 cachyos-x8664 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
júl 30 19:25:52 cachyos-x8664 kernel: #PF: supervisor read access in kernel mode
júl 30 19:25:52 cachyos-x8664 kernel: #PF: error_code(0x0000) - not-present page
júl 30 19:25:52 cachyos-x8664 kernel: PGD 0 P4D 0 
júl 30 19:25:52 cachyos-x8664 kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
júl 30 19:25:52 cachyos-x8664 kernel: CPU: 6 UID: 0 PID: 1910 Comm: nvidia-sleep.sh Tainted: P           OE      6.12.36-2-cachyos-lts-lto #1 9f8f451d61eb73f4050a5c13e5ecd402fec0e541
júl 30 19:25:52 cachyos-x8664 kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
júl 30 19:25:52 cachyos-x8664 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350M Pro4, BIOS P6.60 07/27/2020
júl 30 19:25:52 cachyos-x8664 kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]
júl 30 19:25:52 cachyos-x8664 kernel: Code: 48 89 e0 48 89 44 24 38 e8 85 3b ba d5 48 8b 4c 24 20 4c 39 f9 75 67 48 8b 73 08 49 39 f7 0f 95 c1 49 39 df 74 6a 84 c9 74 66 <48> 39 1e 75 61 4c 89 7b 08 48 89 5c 24 20 48 89 74 24 28 4c 89 3e
júl 30 19:25:52 cachyos-x8664 kernel: RSP: 0018:ffffcf995b30fa18 EFLAGS: 00010002
júl 30 19:25:52 cachyos-x8664 kernel: RAX: 0000000000000296 RBX: ffffcf995b3a42f0 RCX: ffffcf995b30fa01
júl 30 19:25:52 cachyos-x8664 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffcf995b3a4300
júl 30 19:25:52 cachyos-x8664 kernel: RBP: 0000000000000000 R08: 0000000000000002 R09: ffffcf995b30fa28
júl 30 19:25:52 cachyos-x8664 kernel: R10: 0000000000000100 R11: 0000000000000000 R12: 00000000000002e8
júl 30 19:25:52 cachyos-x8664 kernel: R13: ffffcf995b3a4008 R14: ffffcf995b3a4300 R15: ffffcf995b30fa38
júl 30 19:25:52 cachyos-x8664 kernel: FS:  0000714e07659b80(0000) GS:ffff8f59def00000(0000) knlGS:0000000000000000
júl 30 19:25:52 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
júl 30 19:25:52 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 000000080279c000 CR4: 0000000000350ef0
júl 30 19:25:52 cachyos-x8664 kernel: Call Trace:
júl 30 19:25:52 cachyos-x8664 kernel:  <TASK>
júl 30 19:25:52 cachyos-x8664 kernel:  ? __pfx__q_flush_function+0x10/0x10 [nvidia_uvm 294d8c316513357e1b01fe9197ce0ff51d38d2c6]
júl 30 19:25:52 cachyos-x8664 kernel:  nv_kthread_q_flush+0x18/0x70 [nvidia_uvm 294d8c316513357e1b01fe9197ce0ff51d38d2c6]
júl 30 19:25:52 cachyos-x8664 kernel:  uvm_suspend+0x17b/0x1a0 [nvidia_uvm 294d8c316513357e1b01fe9197ce0ff51d38d2c6]
júl 30 19:25:52 cachyos-x8664 kernel:  uvm_suspend_entry+0xb7/0xf0 [nvidia_uvm 294d8c316513357e1b01fe9197ce0ff51d38d2c6]
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? down+0x1c/0x50
júl 30 19:25:52 cachyos-x8664 kernel:  nv_uvm_suspend+0x32/0x50 [nvidia 70237cb0e918b7dcfdcb1eb90947afd3cb30cd9e]
júl 30 19:25:52 cachyos-x8664 kernel:  nv_set_system_power_state+0x319/0x4d0 [nvidia 70237cb0e918b7dcfdcb1eb90947afd3cb30cd9e]
júl 30 19:25:52 cachyos-x8664 kernel:  nv_procfs_write_suspend+0x129/0x160 [nvidia 70237cb0e918b7dcfdcb1eb90947afd3cb30cd9e]
júl 30 19:25:52 cachyos-x8664 kernel:  ? __pfx___traceiter_mmap_lock_start_locking+0x10/0x10
júl 30 19:25:52 cachyos-x8664 kernel:  proc_reg_write+0x68/0xb0
júl 30 19:25:52 cachyos-x8664 kernel:  vfs_write+0x10b/0x3f0
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? syscall_exit_to_user_mode+0x97/0xc0
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? do_syscall_64+0x94/0x170
júl 30 19:25:52 cachyos-x8664 kernel:  __x64_sys_write+0x7a/0xe0
júl 30 19:25:52 cachyos-x8664 kernel:  do_syscall_64+0x88/0x170
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? get_close_on_exec+0x37/0x40
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? do_fcntl+0x82/0x8f0
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? filp_close+0x74/0x80
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? do_dup2+0xd2/0x110
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? sched_clock_cpu+0x10/0x190
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? psi_group_change+0x43/0x330
júl 30 19:25:52 cachyos-x8664 kernel:  ? update_curr+0x2c1/0x3c0
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? finish_task_switch+0xc1/0x370
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? __schedule+0x597/0x1400
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? __rseq_handle_notify_resume+0xf8/0x580
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  ? srso_return_thunk+0x5/0x5f
júl 30 19:25:52 cachyos-x8664 kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
júl 30 19:25:52 cachyos-x8664 kernel: RIP: 0033:0x714e074a3d5f
júl 30 19:25:52 cachyos-x8664 kernel: Code: 04 00 00 00 48 8b 15 a8 af 16 00 64 89 02 48 c7 c2 ff ff ff ff 48 83 c4 10 48 89 d0 5b c3 0f 1f 44 00 00 48 8b 44 24 20 0f 05 <48> 63 d0 3d 00 f0 ff ff 77 0f 48 83 c4 10 48 89 d0 5b c3 66 0f 1f
júl 30 19:25:52 cachyos-x8664 kernel: RSP: 002b:00007ffe4c450a70 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
júl 30 19:25:52 cachyos-x8664 kernel: RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 0000714e074a3d5f
júl 30 19:25:52 cachyos-x8664 kernel: RDX: 0000000000000008 RSI: 00005c2d81e55480 RDI: 0000000000000001
júl 30 19:25:52 cachyos-x8664 kernel: RBP: 00005c2d81e55480 R08: 0000000000000000 R09: 0000000000000000
júl 30 19:25:52 cachyos-x8664 kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000008
júl 30 19:25:52 cachyos-x8664 kernel: R13: 0000714e076105c0 R14: 00005c2d81e55480 R15: 00005c2d4f61a460
júl 30 19:25:52 cachyos-x8664 kernel:  </TASK>
júl 30 19:25:52 cachyos-x8664 kernel: Modules linked in: tcp_dctcp tcp_htcp tcp_bbr rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_localio sunrpc netfs snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfkill nct6775 nct6775_core hwmon_vid vfat fat amd_atl intel_rapl_msr intel_rapl_common kvm_amd snd_hda_codec_realtek kvm snd_hda_scodec_component snd_hda_codec_generic snd_hda_codec_hdmi irqbypass crct10dif_pclmul snd_hda_intel crc32_pclmul polyval_clmulni snd_intel_dspcfg polyval_generic snd_intel_sdw_acpi ghash_clmulni_intel snd_hda_codec sha512_ssse3 sha1_ssse3 ee1004 aesni_intel snd_hda_core gf128mul snd_hwdep crypto_simd r8169 cryptd snd_pcm rapl wmi_bmof realtek mdio_devres snd_timer pcspkr i2c_piix4 snd k10temp acpi_cpufreq i2c_smbus ccp soundcore libphy mousedev joydev gpio_amdpt gpio_generic mac_hid ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables
júl 30 19:25:52 cachyos-x8664 kernel:  pkcs8_key_parser ntsync i2c_dev crypto_user dm_mod loop nfnetlink lz4 zram 842_decompress 842_compress lz4hc_compress lz4_compress bpf_preload ip_tables x_tables btrfs hid_generic libcrc32c crc32c_generic raid6_pq crc32c_intel xor sha256_ssse3 nvme nvme_core usbhid nvme_auth nvidia_drm(POE) drm_ttm_helper ttm nvidia_uvm(POE) nvidia_modeset(POE) video wmi nvidia(POE)
júl 30 19:25:52 cachyos-x8664 kernel: CR2: 0000000000000000
júl 30 19:25:52 cachyos-x8664 kernel: ---[ end trace 0000000000000000 ]---
júl 30 19:25:52 cachyos-x8664 kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]
júl 30 19:25:52 cachyos-x8664 kernel: Code: 48 89 e0 48 89 44 24 38 e8 85 3b ba d5 48 8b 4c 24 20 4c 39 f9 75 67 48 8b 73 08 49 39 f7 0f 95 c1 49 39 df 74 6a 84 c9 74 66 <48> 39 1e 75 61 4c 89 7b 08 48 89 5c 24 20 48 89 74 24 28 4c 89 3e
júl 30 19:25:52 cachyos-x8664 kernel: RSP: 0018:ffffcf995b30fa18 EFLAGS: 00010002
júl 30 19:25:52 cachyos-x8664 kernel: RAX: 0000000000000296 RBX: ffffcf995b3a42f0 RCX: ffffcf995b30fa01
júl 30 19:25:52 cachyos-x8664 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffcf995b3a4300
júl 30 19:25:52 cachyos-x8664 kernel: RBP: 0000000000000000 R08: 0000000000000002 R09: ffffcf995b30fa28
júl 30 19:25:52 cachyos-x8664 kernel: R10: 0000000000000100 R11: 0000000000000000 R12: 00000000000002e8
júl 30 19:25:52 cachyos-x8664 kernel: R13: ffffcf995b3a4008 R14: ffffcf995b3a4300 R15: ffffcf995b30fa38
júl 30 19:25:52 cachyos-x8664 kernel: FS:  0000714e07659b80(0000) GS:ffff8f59def00000(0000) knlGS:0000000000000000
júl 30 19:25:52 cachyos-x8664 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
júl 30 19:25:52 cachyos-x8664 kernel: CR2: 0000000000000000 CR3: 000000080279c000 CR4: 0000000000350ef0
júl 30 19:25:52 cachyos-x8664 kernel: note: nvidia-sleep.sh[1910] exited with irqs disabled
júl 30 19:25:52 cachyos-x8664 kernel: note: nvidia-sleep.sh[1910] exited with preempt_count 1
júl 30 19:25:52 cachyos-x8664 kernel: PM: suspend entry (deep)
júl 30 19:25:52 cachyos-x8664 systemd[1]: nvidia-suspend.service: Main process exited, code=killed, status=9/KILL
júl 30 19:25:52 cachyos-x8664 systemd[1]: nvidia-suspend.service: Failed with result 'signal'.
júl 30 19:25:52 cachyos-x8664 systemd[1]: Failed to start NVIDIA system suspend actions.
júl 30 19:25:52 cachyos-x8664 systemd[1]: Starting System Suspend...
júl 30 19:25:52 cachyos-x8664 systemd-sleep[1942]: User sessions remain unfrozen on explicit request ($SYSTEMD_SLEEP_FREEZE_USER_SESSIONS=0).
júl 30 19:25:52 cachyos-x8664 systemd-sleep[1942]: This is not recommended, and might result in unexpected behavior, particularly
júl 30 19:25:52 cachyos-x8664 systemd-sleep[1942]: in suspend-then-hibernate operations or setups with encrypted home directories.
júl 30 19:25:52 cachyos-x8664 systemd-sleep[1942]: Performing sleep operation 'suspend'...
júl 30 19:25:53 cachyos-x8664 kernel: Filesystems sync: 0.064 seconds
júl 30 19:25:53 cachyos-x8664 kernel: Freezing user space processes
júl 30 19:25:53 cachyos-x8664 kernel: Freezing user space processes completed (elapsed 0.001 seconds)
júl 30 19:25:53 cachyos-x8664 kernel: OOM killer disabled.
júl 30 19:25:53 cachyos-x8664 kernel: Freezing remaining freezable tasks
júl 30 19:25:53 cachyos-x8664 kernel: Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
júl 30 19:25:53 cachyos-x8664 kernel: printk: Suspending console(s) (use no_console_suspend to debug)
júl 30 19:25:53 cachyos-x8664 kernel: NVRM: GPU 0000:07:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
júl 30 19:25:53 cachyos-x8664 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: pci_pm_suspend(): nv_pmops_suspend [nvidia] returns -5
júl 30 19:25:53 cachyos-x8664 kernel: sd 4:0:0:0: [sdc] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: dpm_run_callback(): pci_pm_suspend.llvm.8834564979838540668 returns -5
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: failed to suspend async: error -5
júl 30 19:25:53 cachyos-x8664 kernel: PM: Some devices failed to suspend, or early wake event detected
júl 30 19:25:53 cachyos-x8664 kernel: nvme nvme0: 12/0/0 default/read/poll queues
júl 30 19:25:53 cachyos-x8664 kernel: OOM killer enabled.
júl 30 19:25:53 cachyos-x8664 kernel: Restarting tasks ... done.
júl 30 19:25:53 cachyos-x8664 kernel: random: crng reseeded on system resumption
júl 30 19:25:53 cachyos-x8664 kernel: PM: suspend exit
júl 30 19:25:53 cachyos-x8664 kernel: PM: suspend entry (s2idle)
júl 30 19:25:53 cachyos-x8664 kernel: Filesystems sync: 0.018 seconds
júl 30 19:25:53 cachyos-x8664 kernel: Freezing user space processes
júl 30 19:25:53 cachyos-x8664 kernel: Freezing user space processes completed (elapsed 0.001 seconds)
júl 30 19:25:53 cachyos-x8664 kernel: OOM killer disabled.
júl 30 19:25:53 cachyos-x8664 kernel: Freezing remaining freezable tasks
júl 30 19:25:53 cachyos-x8664 kernel: Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
júl 30 19:25:53 cachyos-x8664 kernel: printk: Suspending console(s) (use no_console_suspend to debug)
júl 30 19:25:53 cachyos-x8664 kernel: NVRM: GPU 0000:07:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: pci_pm_suspend(): nv_pmops_suspend [nvidia] returns -5
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: dpm_run_callback(): pci_pm_suspend.llvm.8834564979838540668 returns -5
júl 30 19:25:53 cachyos-x8664 kernel: nvidia 0000:07:00.0: PM: failed to suspend async: error -5
júl 30 19:25:53 cachyos-x8664 kernel: sd 4:0:0:0: [sdc] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
júl 30 19:25:53 cachyos-x8664 kernel: ata9: SATA link down (SStatus 0 SControl 300)
júl 30 19:25:53 cachyos-x8664 kernel: ata10: SATA link down (SStatus 0 SControl 300)
júl 30 19:25:53 cachyos-x8664 kernel: ata6: SATA link down (SStatus 0 SControl 300)
júl 30 19:25:53 cachyos-x8664 kernel: PM: Some devices failed to suspend, or early wake event detected
júl 30 19:25:53 cachyos-x8664 kernel: OOM killer enabled.
júl 30 19:25:53 cachyos-x8664 kernel: Restarting tasks ... done.
júl 30 19:25:53 cachyos-x8664 kernel: random: crng reseeded on system resumption
júl 30 19:25:53 cachyos-x8664 kernel: PM: suspend exit
júl 30 19:25:53 cachyos-x8664 systemd-sleep[1942]: Failed to put system to sleep. System resumed again: Input/output error
júl 30 19:25:53 cachyos-x8664 kernel: ata10: SATA link down (SStatus 0 SControl 300)
júl 30 19:25:53 cachyos-x8664 kernel: ata9: SATA link down (SStatus 0 SControl 300)
júl 30 19:25:53 cachyos-x8664 kernel: ata6: SATA link down (SStatus 0 SControl 300)
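
For anyone who wants to compare their own crash against the trace above, here is a rough Python sketch that pulls the faulting function and the reliable call-trace frames (those not marked with `?`) out of journal text. The function name `summarize_oops` and both regular expressions are my own and only assume the log layout shown in this thread; it is not an NVIDIA or kernel tool.

```python
import re

# Faulting kernel instruction, e.g.
# "kernel: RIP: 0010:_raw_q_flush+0x88/0x110 [nvidia_uvm]"
RIP_RE = re.compile(
    r"kernel: RIP: 0010:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)[^\]]*\])?"
)
# One call-trace frame; a leading "? " marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"kernel:\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)[^\]]*\])?"
)


def summarize_oops(log_text):
    """Return ((function, module), [(function, module), ...]) for the first oops.

    module is None for frames that belong to the core kernel.
    """
    rip = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        stripped = line.rstrip()
        if rip is None:
            match = RIP_RE.search(stripped)
            if match:
                rip = (match.group(1), match.group(2))
        if stripped.endswith("</TASK>"):
            in_trace = False
            continue
        if stripped.endswith("<TASK>"):
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(stripped)
            # Keep only frames without the "? " marker.
            if match and not match.group(1):
                frames.append((match.group(2), match.group(3)))
    return rip, frames
```

Passing it the text of `journalctl -k -b -1` (kernel messages from the previous boot) should make it easy to see whether a crash goes through the same `nv_procfs_write_suspend` → `uvm_suspend` → `_raw_q_flush` path as the one reported here.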

Nvidia team, based on the call traces, could you suggest a temporary workaround that keeps this code path from being hit? Are there kernel or driver flags to change, or features we could temporarily sacrifice?
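
Since the log mentions the `PreserveVideoMemoryAllocations` module parameter, it may help to include the driver's current parameter values when reporting. Below is a minimal sketch that reads them, assuming the driver exposes them as `Name: value` lines in `/proc/driver/nvidia/params`; the helper names are hypothetical, and this only reports the settings. Nothing here is a confirmed workaround.

```python
from pathlib import Path

# Assumed location of the loaded driver's parameters ("Name: value" per line).
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def parse_params(text):
    """Turn 'Name: value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            params[key.strip()] = value.strip()
    return params


def read_params(path=PARAMS_PATH):
    """Return the parsed parameters, or an empty dict if the file is unreadable."""
    try:
        return parse_params(Path(path).read_text())
    except OSError:
        return {}
```

For example, `read_params().get("PreserveVideoMemoryAllocations")` gives the value the running driver was loaded with, or `None` if the file is missing.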

Still an issue in 580.76.05, GTX 960.

The most worrying aspect is that the crash apparently leaves the machine running hot: the fans spin at full speed. I can't tell whether the fans are simply defaulting to 100% before fan control takes over or whether the CPU/GPU is actually under load, but if it's the latter, this bug has a carbon footprint and shows up on affected users' electricity bills.


Still unfixed in 580.105.08.