Ubuntu 20.04 with nvidia-460 driver freezes randomly after resume from suspend/hibernate

xyapus · March 31, 2021, 9:36am

This happens every time: at some random point after resuming from suspend/hibernate the system freezes, and only a power cycle helps.

I’ve found some information about the last freeze in kern.log:

Mar 31 12:13:06 gingerblade kernel: [33692.874653] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [chrome:605216]
Mar 31 12:13:06 gingerblade kernel: [33692.874655] Modules linked in: uas usb_storage snd_usb_audio snd_usbmidi_lib wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha libblake2s_generic rfcomm xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc ccm aufs cmac algif_hash algif_skcipher af_alg overlay bnep binfmt_misc nls_iso8859_1 input_leds joydev snd_sof_pci snd_sof_intel_byt snd_hda_codec_realtek snd_sof_intel_ipc snd_sof_intel_hda_common snd_hda_codec_generic snd_soc_hdac_hda snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi ledtrig_audio snd_soc_core snd_hda_codec_hdmi snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel iwlmvm snd_intel_dspcfg mei_hdcp snd_hda_codec intel_rapl_msr
Mar 31 12:13:06 gingerblade kernel: [33692.874670]  snd_hda_core mac80211 snd_hwdep snd_pcm x86_pkg_temp_thermal intel_powerclamp snd_seq_midi coretemp libarc4 snd_seq_midi_event snd_rawmidi iwlwifi uvcvideo kvm_intel snd_seq videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_seq_device videobuf2_common btusb kvm btrtl snd_timer btbcm videodev btintel rapl bluetooth snd dcdbas ucsi_ccg razerkbd(OE) intel_cstate dell_wmi_descriptor efi_pstore intel_wmi_thunderbolt mc mei_me typec_ucsi wmi_bmof mxm_wmi ecdh_generic ee1004 8250_dw hid_multitouch cfg80211 mei ecc typec soundcore processor_thermal_device intel_rapl_common intel_pch_thermal intel_soc_dts_iosf int3403_thermal int340x_thermal_zone mac_hid int3400_thermal acpi_pad acpi_thermal_rel nvidia_uvm(OE) acpi_tad sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 dm_crypt usbhid hid_generic nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i2c_algo_bit aesni_intel drm_kms_helper crypto_simd cryptd
Mar 31 12:13:06 gingerblade kernel: [33692.874688]  glue_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_i801 cec thunderbolt i2c_smbus nvme rc_core nvme_core intel_lpss_pci drm intel_lpss r8169 idma64 xhci_pci realtek virt_dma i2c_nvidia_gpu xhci_pci_renesas i2c_hid hid pinctrl_cannonlake wmi video pinctrl_intel
Mar 31 12:13:06 gingerblade kernel: [33692.874694] CPU: 2 PID: 605216 Comm: chrome Tainted: P           OE     5.8.0-48-generic #54~20.04.1-Ubuntu
Mar 31 12:13:06 gingerblade kernel: [33692.874695] Hardware name: Razer Blade 15 Base Model (Early 2020) - RZ09-0328/DA550, BIOS 01.06 09/16/2020
Mar 31 12:13:06 gingerblade kernel: [33692.874825] RIP: 0010:_nv032823rm+0x12/0x40 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.874826] Code: d2 0e 31 c0 e8 ef f1 78 ff e8 6a 40 eb ff 31 c0 48 83 c4 08 c3 0f 1f 00 48 83 ec 08 39 4a 10 76 17 48 8b 02 c1 e9 02 8b 04 88 <48> 83 c4 08 c3 66 0f 1f 84 00 00 00 00 00 be 00 00 cb 09 bf 0a ad
Mar 31 12:13:06 gingerblade kernel: [33692.874827] RSP: 0018:ffffb35e86d47a70 EFLAGS: 00000246
Mar 31 12:13:06 gingerblade kernel: [33692.874828] RAX: 00000000166000a1 RBX: 0000000000000020 RCX: 0000000000000000
Mar 31 12:13:06 gingerblade kernel: [33692.874828] RDX: ffff8c75d011d098 RSI: ffff8c75d011c008 RDI: ffff8c75ce3c5408
Mar 31 12:13:06 gingerblade kernel: [33692.874829] RBP: ffff8c721db82a10 R08: 0000000000000020 R09: ffff8c721db82a28
Mar 31 12:13:06 gingerblade kernel: [33692.874829] R10: ffff8c75d011c008 R11: ffff8c75d011d098 R12: 0000000000000000
Mar 31 12:13:06 gingerblade kernel: [33692.874829] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8c75d011c008
Mar 31 12:13:06 gingerblade kernel: [33692.874830] FS:  00007f1c4eee6d80(0000) GS:ffff8c75d4280000(0000) knlGS:0000000000000000
Mar 31 12:13:06 gingerblade kernel: [33692.874831] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 31 12:13:06 gingerblade kernel: [33692.874831] CR2: 0000057ebc851e40 CR3: 00000007194e2004 CR4: 00000000007606e0
Mar 31 12:13:06 gingerblade kernel: [33692.874831] PKRU: 55555554
Mar 31 12:13:06 gingerblade kernel: [33692.874832] Call Trace:
Mar 31 12:13:06 gingerblade kernel: [33692.875010]  ? _nv009225rm+0x1f1/0x230 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875167]  ? _nv036060rm+0x62/0x70 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875324]  ? _nv028764rm+0x46/0x4a0 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875468]  ? _nv009313rm+0x7b/0x90 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875613]  ? _nv009309rm+0xfb/0x4f0 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875757]  ? _nv037205rm+0xfd/0x180 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.875914]  ? _nv034425rm+0x248/0x370 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876002]  ? _nv009438rm+0x3d/0x90 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876161]  ? _nv029014rm+0x14c/0x670 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876247]  ? _nv028849rm+0x520/0x900 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876364]  ? _nv002523rm+0x9/0x20 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876477]  ? _nv003515rm+0x1b/0x80 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876590]  ? _nv012994rm+0x6fe/0x770 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876705]  ? _nv037995rm+0xb3/0x150 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876823]  ? _nv037994rm+0x388/0x4e0 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.876912]  ? _nv036246rm+0xbe/0x140 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877003]  ? _nv036247rm+0x42/0x70 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877094]  ? _nv008266rm+0x4b/0x90 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877155]  ? os_acquire_spinlock+0x12/0x20 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877255]  ? _nv000708rm+0x4ef/0x880 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877352]  ? rm_ioctl+0x54/0xb0 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877355]  ? __check_object_size+0x91/0x150
Mar 31 12:13:06 gingerblade kernel: [33692.877414]  ? nvidia_ioctl+0x66f/0x880 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877472]  ? nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
Mar 31 12:13:06 gingerblade kernel: [33692.877473]  ? ksys_ioctl+0x9d/0xd0
Mar 31 12:13:06 gingerblade kernel: [33692.877474]  ? __x64_sys_ioctl+0x1a/0x20
Mar 31 12:13:06 gingerblade kernel: [33692.877476]  ? do_syscall_64+0x49/0xc0
Mar 31 12:13:06 gingerblade kernel: [33692.877477]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

nvidia-bug-report.log.gz (447.6 KB)
hang.log.txt (6.6 KB)

Mart · March 31, 2021, 12:16pm

I see Xid 31 errors:

Mar 28 22:56:01 gingerblade kernel: [31277.892877] NVRM: Xid (PCI:0000:01:00): 31, pid=214, Ch 00000000, intr 00000000. MMU Fault: ENGINE HOST9 HUBCLIENT_HOST faulted @ 0x0_00009000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Edit:
https://docs.nvidia.com/deploy/xid-errors/index.html#topic_5_2

but I cannot tell whether they are related at all, or whether they coincide in time with the freezes, because the journalctl output is limited to nvidia-related messages only. You could check for yourself with:
journalctl -b X, where X is 0 for the current boot, -1 for the previous boot, and so on.
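
To make that check less tedious, here is a minimal sketch that pulls only the Xid reports out of a kernel log. The helper names xid_lines and xid_codes are made up for illustration, and the pattern assumes the "NVRM: Xid (PCI:...): N" layout seen in the line quoted above.

```shell
# Keep only NVIDIA Xid reports from a kernel log read on stdin.
xid_lines() {
    grep -E 'NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): [0-9]+'
}

# Print just the Xid number of each report, for a quick tally.
xid_codes() {
    xid_lines | sed -E 's/.*NVRM: Xid \(PCI:[^)]*\): ([0-9]+).*/\1/'
}

# Typical use, on the kernel messages of the previous boot:
#   journalctl -k -b -1 --no-pager | xid_lines
#   journalctl -k -b -1 --no-pager | xid_codes | sort | uniq -c
```

Comparing the timestamps of the matching lines with the time of the resume and of the freeze should show whether the two coincide.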

It seems chrome triggers this. Can you check whether it still happens if you turn off GPU acceleration in chrome?

Edit 2:
There should also be a newer driver version available from ubuntu, or the graphics-drivers PPA. Please also try that.

xyapus · March 31, 2021, 1:13pm

i wouldn’t blame chrome, as this happened before and that time it was the Telegram Desktop app… So i don’t really understand this suggestion: what should i do, not use the GPU at all after resuming from suspend? Are you kidding me?

I’ve checked journalctl -b -1 and i can see that the Xid 31 problem you mention happens at the moment the system is resuming from sleep. I suppose the timestamps are off because the clock has not been synced yet at that moment.

Mar 31 11:45:05 gingerblade kernel: CPU7 is up
Mar 31 11:45:05 gingerblade kernel: smpboot: Booting Node 0 Processor 8 APIC 0x5
Mar 31 11:45:05 gingerblade kernel: CPU8 is up
Mar 31 11:45:05 gingerblade kernel: smpboot: Booting Node 0 Processor 9 APIC 0x7
Mar 31 11:45:05 gingerblade kernel: CPU9 is up
Mar 31 11:45:05 gingerblade kernel: smpboot: Booting Node 0 Processor 10 APIC 0x9
Mar 31 11:45:05 gingerblade kernel: CPU10 is up
Mar 31 11:45:05 gingerblade kernel: smpboot: Booting Node 0 Processor 11 APIC 0xb
Mar 31 11:45:05 gingerblade kernel: CPU11 is up
Mar 31 11:45:05 gingerblade kernel: ACPI: Waking up from system sleep state S4
Mar 31 11:45:05 gingerblade kernel: ACPI Error: No handler for Region [VRTC] (000000008b578016) [SystemCMOS] (20200528/evregion-127)
Mar 31 11:45:05 gingerblade kernel: ACPI Error: Region SystemCMOS (ID=5) has no handler (20200528/exfldio-261)
Mar 31 11:45:05 gingerblade kernel: No Local Variables are initialized for Method [RTEC]
Mar 31 11:45:05 gingerblade kernel: No Arguments are initialized for method [RTEC]
Mar 31 11:45:05 gingerblade kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.EC0.RTEC due to previous error (AE_NOT_EXIST) (20200528/psparse-529)
Mar 31 11:45:05 gingerblade kernel: ACPI Error: Aborting method \RWAK due to previous error (AE_NOT_EXIST) (20200528/psparse-529)
Mar 31 11:45:05 gingerblade kernel: ACPI Error: Aborting method \_WAK due to previous error (AE_NOT_EXIST) (20200528/psparse-529)
Mar 31 11:45:05 gingerblade kernel: ACPI Error: AE_NOT_EXIST, While executing method \_WAK (20200528/hwesleep-47)
Mar 31 11:45:05 gingerblade kernel: ACPI: EC: interrupt unblocked
Mar 31 11:45:05 gingerblade kernel: pcieport 0000:00:1b.4: Intel SPT PCH root port ACS workaround enabled
Mar 31 11:45:05 gingerblade kernel: pcieport 0000:00:1b.0: Intel SPT PCH root port ACS workaround enabled
Mar 31 11:45:05 gingerblade kernel: pcieport 0000:00:1d.0: Intel SPT PCH root port ACS workaround enabled
Mar 31 11:45:05 gingerblade kernel: usb usb1: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: usb usb2: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: ACPI: EC: event unblocked
Mar 31 11:45:05 gingerblade kernel: usb usb5: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: usb usb6: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: nvme nvme0: 12/0/0 default/read/poll queues
Mar 31 11:45:05 gingerblade kernel: NVRM: GPU at PCI:0000:01:00: GPU-528d1d55-f343-fdc0-b25f-b5a49a4eac61
Mar 31 11:45:05 gingerblade kernel: NVRM: GPU Board Serial Number: 
Mar 31 11:45:05 gingerblade kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=226, Ch 00000001, intr 00000000. MMU Fault: ENGINE HOST0 HUBCLIENT_HOST faulted @ 0x75_b8f74000. Fault is of type FAULT_PDE ACCESS_TYPE_VI>
Mar 31 11:45:05 gingerblade kernel: usb 1-8: reset full-speed USB device number 3 using xhci_hcd
Mar 31 11:45:05 gingerblade kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=226, Ch 00000000, intr 00000000. MMU Fault: ENGINE HOST9 HUBCLIENT_HOST faulted @ 0xff_c0822000. Fault is of type FAULT_PDE ACCESS_TYPE_VI>
Mar 31 11:45:05 gingerblade kernel: usb usb3: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: usb usb4: root hub lost power or was reset
Mar 31 11:45:05 gingerblade kernel: usb 1-7: reset high-speed USB device number 2 using xhci_hcd
Mar 31 11:45:05 gingerblade kernel: restoring control 00000000-0000-0000-0000-000000000101/10/5
Mar 31 11:45:05 gingerblade kernel: restoring control 00000000-0000-0000-0000-000000000101/12/11
Mar 31 11:45:05 gingerblade kernel: usb 1-14: reset full-speed USB device number 4 using xhci_hcd
Mar 31 11:45:05 gingerblade kernel: acpi LNXPOWER:06: Turning OFF
Mar 31 11:45:05 gingerblade kernel: acpi LNXPOWER:02: Turning OFF
Mar 31 11:45:05 gingerblade kernel: PM: hibernation: Basic memory bitmaps freed
Mar 31 11:45:05 gingerblade kernel: OOM killer enabled.
Mar 31 11:45:05 gingerblade kernel: Restarting tasks ... done.
Mar 31 11:45:05 gingerblade kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Mar 31 11:45:05 gingerblade kernel: thermal thermal_zone8: failed to read out thermal zone (-61)
Mar 31 11:45:05 gingerblade kernel: PM: hibernation: hibernation exit
Mar 31 11:45:05 gingerblade kernel: Bluetooth: hci0: Firmware revision 0.0 build 100 week 47 2019
Mar 31 11:45:04 gingerblade systemd-sleep[603034]: System resumed.

I can’t see any drivers more recent than the ones i have. Could you please point me to the correct repo?

Mart · March 31, 2021, 4:11pm

No, what makes you think that? It was a suggestion for testing, because chrome was the trigger for this crash.

xyapus · March 31, 2021, 4:17pm

Ok, sorry i didn’t mean to be rude or something.

PPA repository shows:

nvidia-graphics-drivers-460 12 days ago

Which i think is what i have installed:

$ dpkg -l | grep -E "\bnvidia-"
ii  nvidia-compute-utils-460                   460.39-0ubuntu0.20.04.1               amd64        NVIDIA compute utilities
ii  nvidia-cuda-dev                            10.1.243-3                            amd64        NVIDIA CUDA development files
ii  nvidia-cuda-doc                            10.1.243-3                            all          NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb                            10.1.243-3                            amd64        NVIDIA CUDA Debugger (GDB)
ii  nvidia-cuda-toolkit                        10.1.243-3                            amd64        NVIDIA CUDA development toolkit
ii  nvidia-dkms-460                            460.39-0ubuntu0.20.04.1               amd64        NVIDIA DKMS package
ii  nvidia-driver-460                          460.39-0ubuntu0.20.04.1               amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-460                   460.39-0ubuntu0.20.04.1               amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-460                   460.39-0ubuntu0.20.04.1               amd64        NVIDIA kernel source package
ii  nvidia-opencl-dev:amd64                    10.1.243-3                            amd64        NVIDIA OpenCL development files
ii  nvidia-prime                               0.8.16~0.20.04.1                      all          Tools to enable NVIDIA's Prime
ii  nvidia-profiler                            10.1.243-3                            amd64        NVIDIA Profiler for CUDA and OpenCL
ii  nvidia-settings                            460.39-0ubuntu0.20.04.1               amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-460                           460.39-0ubuntu0.20.04.1               amd64        NVIDIA driver support binaries
ii  nvidia-visual-profiler                     10.1.243-3                            amd64        NVIDIA Visual Profiler for CUDA and OpenCL
ii  screen-resolution-extra                    0.18build1                            all          Extension for the nvidia-settings control panel
ii  xserver-xorg-video-nvidia-460              460.39-0ubuntu0.20.04.1               amd64        NVIDIA binary Xorg driver

UPD:

 nvidia-graphics-drivers-460	460.67-0ubuntu0~0.18.04.1

Do you mean this minor version difference: 460.39 → 460.67 ?

Mart · March 31, 2021, 4:19pm

nvidia-graphics-drivers-460 460.67-0ubuntu0~0.20.04.1
That’s the newest on the ppa.

xyapus · March 31, 2021, 4:20pm

Ok, i’ll try that, thanks. I’ll get back to you after some time to tell if that helped or not

Mart · March 31, 2021, 5:50pm

Also, I wonder: did you configure power management as described here?

https://download.nvidia.com/XFree86/Linux-x86_64/460.39/README/powermanagement.html

xyapus · March 31, 2021, 5:52pm

actually i didn’t know i had to configure anything special for it, thanks for the link - i’ll check that

Mart · March 31, 2021, 5:53pm

It’s not a requirement. But worth a shot I’d say.

xyapus · March 31, 2021, 5:55pm

as far as i can see, this should only matter when lots of GPU RAM is wired/allocated at the time of the hibernate/suspend event, right? I have 8GB of GPU RAM and normally only 1 GB is used by Xorg, Chrome and some other apps.

Do you think i should try configuring the second option, the one using /proc/driver/nvidia/suspend?

Mart · March 31, 2021, 6:05pm

Option one, which is the default (the one you should be using right now), is limited in functionality.
As I said, I’d give it a try. It won’t hurt and is easy to revert. The only thing to watch out for is having enough space on the drive/mountpoint.

Mart · March 31, 2021, 6:07pm

In general: you are hitting a crash, so all we can do here is try things. Only nvidia can analyze the crash dump that came with your bug report.

xyapus · March 31, 2021, 6:08pm

ok, so i’ve updated to the most recent driver from the PPA, i’ll try configuring option #2 and hibernating later today - if it crashes again i’ll provide another nvidia-bug-report.log.gz. Can i expect it to be analyzed by Nvidia after that?

Mart · March 31, 2021, 6:12pm

I’d say - Hope dies last ;-)
However, to get more attention you could then also send it to linux-bugs@nvidia.com (with a reference to this topic).

aplattner · March 31, 2021, 6:37pm

There were some improvements I made in the 465.19.01 beta for suspend/resume with the power management stuff enabled. I know it might be difficult to test the beta if you’re using a PPA but would it be possible to give it a try?

Mart · March 31, 2021, 6:50pm

Ah, I read the changelog, but I didn’t see anything in it about suspend/resume improvements, except the automatic installation.

For xyapus:
If you want to give it a try, make sure you purge all nvidia driver packages from the PPA (apt purge nvidia* libnvidia*) before installing via the .run file. And stop the X server before installation (i.e. systemctl isolate multi-user.target).
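
Put together, the switch could look roughly like the sketch below. It only prints the commands unless DRY_RUN=0 is set. The installer file name is an assumption based on the 465.19.01 version mentioned above, and the run and switch_to_run_installer helpers are hypothetical names. Note that the nvidia* pattern also matches the CUDA toolkit packages, so review what apt proposes to remove before confirming.

```shell
#!/bin/sh
# Sketch: move from the PPA packages to the .run installer.
# DRY_RUN defaults to 1, so the steps are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
# Assumed file name; adjust to the file you actually downloaded.
INSTALLER="${INSTALLER:-./NVIDIA-Linux-x86_64-465.19.01.run}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

switch_to_run_installer() {
    # Quote the globs so the shell does not expand them against local files.
    run sudo apt purge 'nvidia*' 'libnvidia*'
    # Leave the graphical session; the unit name has a dot: multi-user.target.
    run sudo systemctl isolate multi-user.target
    run sudo sh "$INSTALLER"
}
```

Calling switch_to_run_installer from a text console (not from a terminal inside the X session, which the isolate step would end) is the safer way to run it for real.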

aplattner · March 31, 2021, 7:01pm

There was a lot of intertwined behavior around VT switches and suspend/resume that I tried to untangle for the 465 series. All of it hinges on the NVreg_PreserveVideoMemoryAllocations=1 module parameter, which is still disabled by default in most cases. The suspend/hibernate/resume systemd units are required for the video memory preservation to function, which is why I made an effort to make the installer set those up automatically. If you’re using a PPA or other distribution packages, you’ll need to check with them to determine whether those systemd services are installed or enabled by default.

So the current state of things in 465.19.01 is that if you use the .run installer on a systemd distro, the only thing you’re supposed to need to do manually is enable NVreg_PreserveVideoMemoryAllocations=1.
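
As a rough sketch of that manual step, the option can be set through a modprobe configuration file. The file name nvidia-power-management.conf and the helper write_nvidia_pm_conf are made up for illustration. The parameter is spelled NVreg_PreserveVideoMemoryAllocations here, matching the name the driver prints in its own kernel messages, and NVreg_TemporaryFilePath is the companion option from the power management README. The /var/tmp default is only an example location.

```shell
# Write a modprobe options file enabling video memory preservation.
#   $1: directory for the file (default /etc/modprobe.d, needs root)
#   $2: value for NVreg_TemporaryFilePath (default /var/tmp, an assumption)
write_nvidia_pm_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    tmp_path="${2:-/var/tmp}"
    printf 'options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=%s\n' \
        "$tmp_path" > "$conf_dir/nvidia-power-management.conf"
}

# Example: generate the file in the current directory and inspect it first.
#   write_nvidia_pm_conf . /var/tmp && cat ./nvidia-power-management.conf
```

The new value only takes effect once the nvidia module is reloaded, typically after a reboot. On setups that load the module from the initramfs, regenerating the initramfs may be needed as well.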

Mart · March 31, 2021, 7:14pm

Thank you for the explanation @aplattner !

@xyapus you can find that driver and its notes here, for example: Current graphics driver releases

xyapus · April 1, 2021, 6:04am

While using 460.67-0ubuntu0~0.20.04.1 i tried manually following the Configuring Power Management Support guide and installed the required systemd services. I’ve set up /tmp as a tmpfs of sufficient size using /etc/systemd/system/tmp.mount, so:

$ mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=10485760k)

I cannot resume from hibernate when NVreg_PreserveVideoMemoryAllocations=1 is set:

Apr 01 08:38:39 gingerblade kernel: PM: hibernation: Read 5139808 kbytes in 4.57 seconds (1124.68 MB/s)
Apr 01 08:38:39 gingerblade kernel: PM: Image successfully loaded
Apr 01 08:38:39 gingerblade kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Apr 01 08:38:39 gingerblade kernel: NVRM: GPU 0000:01:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
Apr 01 08:38:39 gingerblade kernel: PM: pci_pm_freeze(): nv_pmops_freeze+0x0/0x20 [nvidia] returns -5
Apr 01 08:38:39 gingerblade kernel: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -5
Apr 01 08:38:39 gingerblade kernel: PM: Device 0000:01:00.0 failed to quiesce async: error -5
Apr 01 08:38:39 gingerblade kernel: PM: hibernation: Failed to load image, recovering.
Apr 01 08:38:39 gingerblade kernel: PM: hibernation: Basic memory bitmaps freed
Apr 01 08:38:39 gingerblade kernel: PM: hibernation: resume failed (-5)

Am i missing something?

i’ve also tried other TemporaryFilePath locations as the doc states that

To achieve the best performance, file system types other than tmpfs are recommended at this time.

So i changed to NVreg_TemporaryFilePath=/tmp.nvidia and created dir /tmp.nvidia - but still can’t restore from hibernation with the same error as above.
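
For anyone reproducing this, one thing that may be worth ruling out first is whether the loaded module and the systemd units are actually in the state the guide expects. A minimal sketch, assuming the driver exposes its parameters in /proc/driver/nvidia/params and that the units carry the names used in the README (nvidia-suspend, nvidia-hibernate, nvidia-resume); the helper names are hypothetical:

```shell
# Show the two power management parameters as the loaded module sees them.
#   $1: params file (default /proc/driver/nvidia/params)
nvidia_pm_params() {
    params_file="${1:-/proc/driver/nvidia/params}"
    grep -E 'PreserveVideoMemoryAllocations|TemporaryFilePath' "$params_file"
}

# Report whether each of the three sleep-related units is enabled.
nvidia_pm_services() {
    for unit in nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service; do
        printf '%s: %s\n' "$unit" "$(systemctl is-enabled "$unit" 2>/dev/null || true)"
    done
}
```

If the parameters and units look right, the "attempted without driver procfs suspend interface" message points at a power transition that did not go through /proc/driver/nvidia/suspend, which this check alone cannot explain.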

@aplattner from the changelog it’s not clear to me whether anything related to my problem has changed in the driver itself between v460 and v465. I can see that the installation of the systemd units is now automated, but i’ve managed to do that manually, so do i still have to go with the beta? Honestly i’m not too comfortable with betas…
