Failed to come back from suspend / black screen on resume

Hi All,

After suspending my laptop (Lenovo P50), the display fails to come back on resume, and I have to restart the machine. This is new behavior that I didn’t encounter in the past.

In the journal log for the last boot, I see the following stack trace:

Jun 21 10:13:53 P50 kernel: ------------[ cut here ]------------
Jun 21 10:13:53 P50 kernel: WARNING: CPU: 3 PID: 73712 at /tmp/akmodsbuild.KK8IEMuF/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.18.5-200.fc36.x86_64/nvidia/nv.c:3939 nv_restore_user_channels+0xc6/0xe0 [nvidia]
Jun 21 10:13:53 P50 kernel: Modules linked in: tun tls binfmt_misc rfcomm snd_seq_dummy snd_hrtimer overlay wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_syslog nft_log nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep>
Jun 21 10:13:53 P50 kernel:  intel_powerclamp coretemp kvm_intel iwlwifi snd_hda_intel snd_intel_dspcfg kvm snd_intel_sdw_acpi iwlmei snd_hda_codec irqbypass rapl snd_hda_core intel_cstate snd_hwdep cfg80211 snd_seq intel_uncore snd_seq_device think_lmi snd_pcm joydev pcspkr thinkpad_acpi intel_wmi_thunderbolt firmware_attributes_class ledtrig_audio mei_me wmi_bmof platform_profile i2c_i801 snd_timer i2c_smbus mei intel_pch_thermal rfkill snd soundcore dm_crypt uas usb_s>
Jun 21 10:13:53 P50 kernel: CPU: 3 PID: 73712 Comm: nvidia-sleep.sh Tainted: P           OE     5.18.5-200.fc36.x86_64 #1
Jun 21 10:13:53 P50 kernel: Hardware name: LENOVO 20EN001EUS/20EN001EUS, BIOS N1EET94W (1.67 ) 12/10/2021
Jun 21 10:13:53 P50 kernel: RIP: 0010:nv_restore_user_channels+0xc6/0xe0 [nvidia]
Jun 21 10:13:53 P50 kernel: Code: 61 29 de be 01 00 00 00 48 89 ef e8 74 96 00 00 4c 89 ef e8 7c 61 29 de ba 02 00 00 00 48 89 ee 48 89 df e8 ac 38 94 00 eb 94 <0f> 0b eb c6 41 bc 51 00 00 00 eb 9f 66 66 2e 0f 1f 84 00 00 00 00
Jun 21 10:13:53 P50 kernel: RSP: 0018:ffffb39a0391fd38 EFLAGS: 00010206
Jun 21 10:13:53 P50 kernel: RAX: 0000000000000003 RBX: ffff9f492eb8b000 RCX: 0000000000000000
Jun 21 10:13:53 P50 kernel: RDX: 0000000000000087 RSI: 0000000000000246 RDI: 00000000ffffffff
Jun 21 10:13:53 P50 kernel: RBP: ffff9f484b8f3000 R08: 0000000000000000 R09: 000000008040003d
Jun 21 10:13:53 P50 kernel: R10: 00000000fa83b2da R11: 0000000000000000 R12: 0000000000000003
Jun 21 10:13:53 P50 kernel: R13: ffff9f484b8f3000 R14: ffff9f484b8f3510 R15: 0000000000000000
Jun 21 10:13:53 P50 kernel: FS:  00007fe046e9c740(0000) GS:ffff9f57a20c0000(0000) knlGS:0000000000000000
Jun 21 10:13:53 P50 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 21 10:13:53 P50 kernel: CR2: 00005597e69fd9a8 CR3: 000000078b86c004 CR4: 00000000003706e0
Jun 21 10:13:53 P50 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 21 10:13:53 P50 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 21 10:13:53 P50 kernel: Call Trace:
Jun 21 10:13:53 P50 kernel:  <TASK>
Jun 21 10:13:53 P50 kernel:  nv_set_system_power_state+0x111/0x3d0 [nvidia]
Jun 21 10:13:53 P50 kernel:  nv_procfs_write_suspend+0xeb/0x140 [nvidia]
Jun 21 10:13:53 P50 kernel:  proc_reg_write+0x56/0xa0
Jun 21 10:13:53 P50 kernel:  ? preempt_count_add+0x44/0x90
Jun 21 10:13:53 P50 kernel:  vfs_write+0xb3/0x290
Jun 21 10:13:53 P50 kernel:  ksys_write+0x53/0xd0
Jun 21 10:13:53 P50 kernel:  do_syscall_64+0x5b/0x80
Jun 21 10:13:53 P50 kernel:  ? syscall_exit_to_user_mode+0x17/0x40
Jun 21 10:13:53 P50 kernel:  ? do_syscall_64+0x67/0x80
Jun 21 10:13:53 P50 kernel:  ? filp_close+0x58/0x70
Jun 21 10:13:53 P50 kernel:  ? do_dup2+0x89/0xc0
Jun 21 10:13:53 P50 kernel:  ? syscall_exit_to_user_mode+0x17/0x40
Jun 21 10:13:53 P50 kernel:  ? do_syscall_64+0x67/0x80
Jun 21 10:13:53 P50 kernel:  ? exc_page_fault+0x70/0x170
Jun 21 10:13:53 P50 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 21 10:13:53 P50 kernel: RIP: 0033:0x7fe046fa0c17
Jun 21 10:13:53 P50 kernel: Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Jun 21 10:13:53 P50 kernel: RSP: 002b:00007ffd66063578 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Jun 21 10:13:53 P50 kernel: RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007fe046fa0c17
Jun 21 10:13:53 P50 kernel: RDX: 0000000000000007 RSI: 000055d87aafba10 RDI: 0000000000000001
Jun 21 10:13:53 P50 kernel: RBP: 000055d87aafba10 R08: 0000000000000000 R09: 0000000000000073
Jun 21 10:13:53 P50 kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000007
Jun 21 10:13:53 P50 kernel: R13: 00007fe047097780 R14: 0000000000000007 R15: 00007fe0470929e0
Jun 21 10:13:53 P50 kernel:  </TASK>
Jun 21 10:13:53 P50 kernel: ---[ end trace 0000000000000000 ]---
Jun 21 10:13:53 P50 kernel: ------------[ cut here ]------------
Jun 21 10:13:53 P50 kernel: WARNING: CPU: 0 PID: 73712 at /tmp/akmodsbuild.KK8IEMuF/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.18.5-200.fc36.x86_64/nvidia/nv.c:4156 nv_set_system_power_state+0x331/0x3d0 [nvidia]
Jun 21 10:13:53 P50 kernel: Modules linked in: tun tls binfmt_misc rfcomm snd_seq_dummy snd_hrtimer overlay wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_syslog nft_log nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep>
Jun 21 10:13:53 P50 kernel:  intel_powerclamp coretemp kvm_intel iwlwifi snd_hda_intel snd_intel_dspcfg kvm snd_intel_sdw_acpi iwlmei snd_hda_codec irqbypass rapl snd_hda_core intel_cstate snd_hwdep cfg80211 snd_seq intel_uncore snd_seq_device think_lmi snd_pcm joydev pcspkr thinkpad_acpi intel_wmi_thunderbolt firmware_attributes_class ledtrig_audio mei_me wmi_bmof platform_profile i2c_i801 snd_timer i2c_smbus mei intel_pch_thermal rfkill snd soundcore dm_crypt uas usb_s>
Jun 21 10:13:53 P50 kernel: CPU: 0 PID: 73712 Comm: nvidia-sleep.sh Tainted: P        W  OE     5.18.5-200.fc36.x86_64 #1
Jun 21 10:13:53 P50 kernel: Hardware name: LENOVO 20EN001EUS/20EN001EUS, BIOS N1EET94W (1.67 ) 12/10/2021
Jun 21 10:13:53 P50 kernel: RIP: 0010:nv_set_system_power_state+0x331/0x3d0 [nvidia]
Jun 21 10:13:53 P50 kernel: Code: ff 48 8b ad 38 05 00 00 48 85 ed 74 3d 45 84 e4 74 e7 48 8b 85 70 02 00 00 89 da 48 8b 70 78 48 8b 78 60 e8 d1 d2 ff ff eb cf <0f> 0b e9 e1 fd ff ff 0f 0b 48 c7 c7 50 b7 22 c3 41 bd 51 00 00 00
Jun 21 10:13:53 P50 kernel: RSP: 0018:ffffb39a0391fd68 EFLAGS: 00010206
Jun 21 10:13:53 P50 kernel: RAX: 0000000000000003 RBX: 0000000000000002 RCX: 0000000080020001
Jun 21 10:13:53 P50 kernel: RDX: 0000000080020002 RSI: fffff7d547bae200 RDI: 0000000040000000
Jun 21 10:13:53 P50 kernel: RBP: ffff9f484b8f3000 R08: 0000000000000000 R09: 0000000080020001
Jun 21 10:13:53 P50 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jun 21 10:13:53 P50 kernel: R13: 000000000000001f R14: 000055d87aafba10 R15: 0000000000000000
Jun 21 10:13:53 P50 kernel: FS:  00007fe046e9c740(0000) GS:ffff9f57a2000000(0000) knlGS:0000000000000000
Jun 21 10:13:53 P50 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 21 10:13:53 P50 kernel: CR2: 00007f4438a6db70 CR3: 000000078b86c004 CR4: 00000000003706f0
Jun 21 10:13:53 P50 kernel: Call Trace:
Jun 21 10:13:53 P50 kernel:  <TASK>
Jun 21 10:13:53 P50 kernel:  nv_procfs_write_suspend+0xeb/0x140 [nvidia]
Jun 21 10:13:53 P50 kernel:  proc_reg_write+0x56/0xa0
Jun 21 10:13:53 P50 kernel:  ? preempt_count_add+0x44/0x90
Jun 21 10:13:53 P50 kernel:  vfs_write+0xb3/0x290
Jun 21 10:13:53 P50 kernel:  ksys_write+0x53/0xd0
Jun 21 10:13:53 P50 kernel:  do_syscall_64+0x5b/0x80
Jun 21 10:13:53 P50 kernel:  ? syscall_exit_to_user_mode+0x17/0x40
Jun 21 10:13:53 P50 kernel:  ? do_syscall_64+0x67/0x80
Jun 21 10:13:53 P50 kernel:  ? filp_close+0x58/0x70
Jun 21 10:13:53 P50 kernel:  ? do_dup2+0x89/0xc0
Jun 21 10:13:53 P50 kernel:  ? syscall_exit_to_user_mode+0x17/0x40
Jun 21 10:13:53 P50 kernel:  ? do_syscall_64+0x67/0x80
Jun 21 10:13:53 P50 kernel:  ? exc_page_fault+0x70/0x170
Jun 21 10:13:53 P50 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 21 10:13:53 P50 kernel: RIP: 0033:0x7fe046fa0c17
Jun 21 10:13:53 P50 kernel: Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Jun 21 10:13:53 P50 kernel: RSP: 002b:00007ffd66063578 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Jun 21 10:13:53 P50 kernel: RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007fe046fa0c17
Jun 21 10:13:53 P50 kernel: RDX: 0000000000000007 RSI: 000055d87aafba10 RDI: 0000000000000001
Jun 21 10:13:53 P50 kernel: RBP: 000055d87aafba10 R08: 0000000000000000 R09: 0000000000000073
Jun 21 10:13:53 P50 kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000007
Jun 21 10:13:53 P50 kernel: R13: 00007fe047097780 R14: 0000000000000007 R15: 00007fe0470929e0
Jun 21 10:13:53 P50 kernel:  </TASK>
Jun 21 10:13:53 P50 kernel: ---[ end trace 0000000000000000 ]---
...
Jun 21 10:13:59 P50 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000947d:0:0:407
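
For anyone who wants to pull the same kind of messages from their own machine, here is a minimal sketch. It assumes a systemd journal that still holds the boot in question; the function name `nvidia_kernel_msgs` is made up for illustration and is not part of any NVIDIA tooling.

```shell
# Hypothetical helper: keep only kernel log lines that mention the NVIDIA driver.
# Reads log lines on stdin and writes the matching lines to stdout.
nvidia_kernel_msgs() {
    grep -E 'NVRM|nvidia'
}
```

One possible invocation is `journalctl -k -b -1 --no-pager | nvidia_kernel_msgs`, where `-k` limits the output to kernel messages and `-b -1` selects the previous boot (use `-b 0` for the current one). Reading the system journal may require root.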

# rpm -qa | ag nvidia
kmod-nvidia-5.17.13-300.fc36.x86_64-510.68.02-2.fc36.x86_64
kmod-nvidia-5.17.14-300.fc36.x86_64-510.68.02-2.fc36.x86_64
kmod-nvidia-5.18.5-200.fc36.x86_64-510.68.02-2.fc36.x86_64
xorg-x11-drv-nvidia-cuda-libs-510.68.02-2.fc36.x86_64
xorg-x11-drv-nvidia-libs-510.68.02-2.fc36.x86_64
xorg-x11-drv-nvidia-kmodsrc-510.68.02-2.fc36.x86_64
akmod-nvidia-510.68.02-2.fc36.x86_64
xorg-x11-drv-nvidia-power-510.68.02-2.fc36.x86_64
xorg-x11-drv-nvidia-510.68.02-2.fc36.x86_64
nvidia-settings-510.68.02-1.fc36.x86_64
nvidia-persistenced-510.68.02-1.fc36.x86_64
xorg-x11-drv-nvidia-cuda-510.68.02-2.fc36.x86_64
$ uname -a
Linux P50 5.18.5-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 16 14:51:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

I found related issues, but none of them helped me resolve my problem.

Any help will be appreciated,
Thanks

nvidia-bug-report.log.gz (296.0 KB)

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Hi @generix,
Thank you for helping. I added the report to the top post.

Video memory reloading seems to be properly configured, yet the system crashes on resume. Please check whether disabling the systemd units
nvidia-suspend.service
nvidia-resume.service
nvidia-hibernate.service
works around it.
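
A rough sketch of how the three units could be toggled as a group. The helper name `nvidia_sleep_units` and its `--apply` flag are invented for this example; by default it only prints the `systemctl` commands so they can be reviewed first.

```shell
# Units named in the suggestion above.
UNITS="nvidia-suspend.service nvidia-resume.service nvidia-hibernate.service"

# Hypothetical helper: print (dry run, the default) or execute systemctl
# for each unit. Usage: nvidia_sleep_units enable|disable [--apply]
nvidia_sleep_units() {
    action="$1"
    apply="${2:-}"
    case "$action" in
        enable|disable) ;;
        *) echo "usage: nvidia_sleep_units enable|disable [--apply]" >&2; return 2 ;;
    esac
    for unit in $UNITS; do
        if [ "$apply" = "--apply" ]; then
            sudo systemctl "$action" "$unit"
        else
            echo "systemctl $action $unit"
        fi
    done
}

# Dry run: only prints the three commands, changes nothing.
nvidia_sleep_units disable
```

Running `nvidia_sleep_units disable --apply` would make the change, and `nvidia_sleep_units enable --apply` would revert it if the workaround does not help.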

If I disable those services, the laptop won’t enter suspend. The journal then shows:

Jun 22 17:48:24 P50 kernel: NVRM: GPU 0000:01:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to >
Jun 22 17:48:24 P50 kernel: nvidia 0000:01:00.0: PM: pci_pm_suspend(): nv_pmops_suspend+0x0/0x30 [nvidia] returns -5
Jun 22 17:48:24 P50 kernel: nvidia 0000:01:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -5
Jun 22 17:48:24 P50 kernel: nvidia 0000:01:00.0: PM: failed to suspend async: error -5
Jun 22 17:48:24 P50 kernel: PM: Some devices failed to suspend, or early wake event detected
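
Going by the NVRM message above, the driver appears to refuse the suspend because `PreserveVideoMemoryAllocations` is set while the procfs suspend interface was not used; the `-5` returned by `nv_pmops_suspend` corresponds to `EIO`. The sketch below is one way to check the current value of that parameter. It assumes the driver exposes its parameters in `/proc/driver/nvidia/params` as `Name: value` lines, and the function name `preserve_vram_setting` is hypothetical.

```shell
# Hypothetical helper: print the value of PreserveVideoMemoryAllocations
# from a params file laid out as "Name: value" lines (assumed format).
# An alternative file path can be passed as the first argument.
preserve_vram_setting() {
    params_file="${1:-/proc/driver/nvidia/params}"
    awk -F': *' '$1 == "PreserveVideoMemoryAllocations" { print $2 }' "$params_file"
}

# Only query the live system when the driver's procfs entry is readable.
if [ -r /proc/driver/nvidia/params ]; then
    echo "PreserveVideoMemoryAllocations=$(preserve_vram_setting)"
fi
```

If the value is enabled, the log suggests that the sleep units have to stay enabled for the suspend to go through the procfs interface, which would be consistent with the failure seen once they were disabled.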