I’m also on an Acer Aspire 7 with an Nvidia GeForce GTX 1050, but with Ubuntu 21.10.
I was able to solve it the way @alex21975 and @hdaniel mentioned, but with nvidia-driver-460-server.
Now the computer no longer gets stuck on boot after suspend,
but if I do
sudo service nvidia-suspend status
it will show
“Unit nvidia-suspend.service could not be found.”
After applying the NVIDIA Suspend fix, it
still shows me
"
nvidia-suspend.service - NVIDIA system suspend actions
Loaded: loaded (/etc/systemd/system/nvidia-suspend.service; enabled; vendor preset: enabled)
Active: inactive (dead)
"
and the logs show
kernel: snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x7f0800. -5
kernel: snd_hda_intel 0000:01:00.1: can't change power state from D3cold to D0 (config space inaccessible)
I found a kind of workaround. When the screen wakes from sleep (but goes black), use CTRL + ALT + F2 to switch to a terminal (terminal shows on the screen in a few seconds) and CTRL + ALT + F1 or F7 (depending on the system) to switch back to the graphical session. The screen will then work normally again (until the next time it goes to sleep).
The 495 driver seems to work, although this might be because I’ve tinkered around a lot when trying to fix previous driver versions. But I guess that it is worth trying the update. I do include some of my settings below, as those might be useful if the 495 driver is not working for you.
Confirming the installed driver:
user@device:~$ nvidia-smi
Tue Nov  9 09:58:24 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 495.44       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M1000M       Off  | 00000000:01:00.0  On |                  N/A |
| N/A   52C    P8    N/A /  N/A |    259MiB /  4043MiB |     22%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1320      G   /usr/lib/xorg/Xorg                158MiB |
|    0   N/A  N/A      2710      G   /usr/lib/xorg/Xorg                 97MiB |
+-----------------------------------------------------------------------------+
Getting some of my settings listed (I had changed some of these while trying to change the memory handling at hibernation), see above.
I’ve also looked at the hibernate, suspend and resume services, which seem to be inactive but loaded.
user@device:~$ sudo service nvidia-suspend status
● nvidia-suspend.service - NVIDIA system suspend actions
Loaded: loaded (/etc/systemd/system/nvidia-suspend.service; enabled; vendor preset: enabled)
Active: inactive (dead)
nov 09 09:52:10 device systemd[1]: Starting NVIDIA system suspend actions...
nov 09 09:52:10 device suspend[6680]: nvidia-suspend.service
nov 09 09:52:10 device logger[6680]: <13>Nov 9 09:52:10 suspend: nvidia-suspend.service
nov 09 09:52:11 device systemd[1]: nvidia-suspend.service: Succeeded.
nov 09 09:52:11 device systemd[1]: Finished NVIDIA system suspend actions.
nov 09 09:53:06 device systemd[1]: Starting NVIDIA system suspend actions...
nov 09 09:53:06 device suspend[7975]: nvidia-suspend.service
nov 09 09:53:06 device logger[7975]: <13>Nov 9 09:53:06 suspend: nvidia-suspend.service
nov 09 09:53:07 device systemd[1]: nvidia-suspend.service: Succeeded.
nov 09 09:53:07 device systemd[1]: Finished NVIDIA system suspend actions.
user@device:~$ sudo service nvidia-hibernate status
● nvidia-hibernate.service - NVIDIA system hibernate actions
Loaded: loaded (/etc/systemd/system/nvidia-hibernate.service; enabled; vendor preset: enabled)
Active: inactive (dead)
user@device:~$ sudo service nvidia-resume status
● nvidia-resume.service - NVIDIA system resume actions
Loaded: loaded (/etc/systemd/system/nvidia-resume.service; enabled; vendor preset: enabled)
Active: inactive (dead)
nov 09 09:52:39 device systemd[1]: Starting NVIDIA system resume actions...
nov 09 09:52:39 device suspend[7377]: nvidia-resume.service
nov 09 09:52:39 device logger[7377]: <13>Nov 9 09:52:39 suspend: nvidia-resume.service
nov 09 09:52:39 device systemd[1]: nvidia-resume.service: Succeeded.
nov 09 09:52:39 device systemd[1]: Finished NVIDIA system resume actions.
nov 09 09:54:07 device systemd[1]: Starting NVIDIA system resume actions...
nov 09 09:54:07 device suspend[8614]: nvidia-resume.service
nov 09 09:54:07 device logger[8614]: <13>Nov 9 09:54:07 suspend: nvidia-resume.service
nov 09 09:54:07 device systemd[1]: nvidia-resume.service: Succeeded.
nov 09 09:54:07 device systemd[1]: Finished NVIDIA system resume actions.
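A note on reading the output above: "inactive (dead)" does not by itself mean a unit failed. The journal lines show each run starting and finishing, after which the unit goes back to inactive. Below is a minimal sketch for counting completed runs from journal text. The helper name count_finished is made up for illustration, and it only matches the "Finished NVIDIA system ... actions" wording seen in the logs above.

```shell
# Hypothetical helper: count completed runs of an NVIDIA power-management unit.
# Reads journal text on stdin; $1 is "suspend", "hibernate" or "resume".
count_finished() {
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when there are no matches (grep itself would return 1).
    grep -c "Finished NVIDIA system $1 actions" || true
}
```

It could be fed from the journal of the current boot, for example: journalctl -b -u nvidia-suspend.service | count_finished suspend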
@generix weirdly enough, it not only fixed the reboot problem, it also fixed the audio device PCI problem. The second device had all ffs listed before. (I only have a single graphics card, but from the start the M1000M has also been recognized as a 940MX.)
The only difference with my configuration (except for hardware) are the PreserveVideoMemoryAllocations and TemporaryFilePath params so the issue probably only occurs when PreserveVideoMemoryAllocations is disabled (0).
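To compare configurations, the values the loaded driver is actually using can be read from /proc/driver/nvidia/params. A minimal sketch follows, assuming that file uses "Name: value" lines; the helper name nv_param is hypothetical.

```shell
# Hypothetical helper: print the value of one driver parameter.
# $1 = parameter name, $2 = file to read (defaults to the live driver file).
nv_param() {
    # Split each line on ": " and print the value of the line whose name matches.
    awk -F': ' -v key="$1" '$1 == key { print $2 }' "${2:-/proc/driver/nvidia/params}"
}
```

With the driver loaded, nv_param PreserveVideoMemoryAllocations and nv_param TemporaryFilePath would show the two settings in question. The second argument is only there so the helper can be tried against a saved copy of the file.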
Hmm, that might very well be related. I tried setting these because they were recommended above by @generix. But when I tried it in the past, with the 465 drivers, this didn’t help. Could you try changing these settings?
Please try following the Arch manual for the TemporaryFilePath memory allocation, and see if you get the same results that I did. I’m interested to see whether this makes a difference or not.
(Sorry for the late reply. The forum system didn’t allow me to send a third message in this topic until now.)
I expected it would solve the issue, but unfortunately it didn’t (I also had to update the initramfs for the modprobe settings to be applied at boot). Either way, I don’t see any errors in the kernel logs anymore, so maybe I’m now running into a different issue. (I’ve had issues with DisplayPort before, but HDMI worked fine until 460.) Maybe you could try disabling PreserveVideoMemoryAllocations to see if the issue pops up again?
Okay, this was surprising to me: disabling PreserveVideoMemoryAllocations did not cause any issues. Hibernation works fine in ‘Nvidia-only’, ‘On-demand’ and ‘Intel’ mode. I do agree that this might indicate that you are running into a different kind of issue.
Anyway, on my side there are still enough other, probably related, issues. I’ll have a look at them in the future, but at least they are less frustrating.
On-demand mode gets really slow when I use only an external screen. The mouse moves fine, but any interaction with the application has a significant delay (1 to 3 seconds). (Not related to hibernation at all, but it makes the On-demand mode kinda useless for me.)
Intel mode has issues detecting external screens after resuming from hibernation.
As a result, I’ll have to use the Nvidia mode for now.
I also tested the 495 driver, but it is not useful in my case since it no longer supports my card.
So all currently maintained nvidia drivers either fail on my card with a black screen on boot or no longer support my card, and I am locked in to older Xorg and kernel versions.
There’s nothing to configure in terms of power management: as my first post outlined, the issue happens on my machine on boot, not after suspend (even though it looks similar to the issues reported for those cases).
I have tested the following versions, which yield a black screen after turning the backlight on and off several times directly on boot:
460.27.04
460.32.03
460.39
460.56
460.67
460.91.03
465.27
The following versions gave me an immediate black screen on boot:
470.42.01
470.94
All with the same error message (Failed to allocate display engine core DMA push buffer).
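For anyone wanting to check whether a given boot hit the same error, here is a minimal sketch that filters kernel log text for the nvidia-modeset messages quoted in this thread. The helper name nv_modeset_errors is made up, and the patterns are simply the two message texts reported here, so other driver versions may word them differently.

```shell
# Hypothetical helper: keep only the nvidia-modeset errors quoted in this thread.
# Reads kernel log text on stdin and prints the matching lines.
nv_modeset_errors() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E 'Failed to allocate display engine core DMA push buffer|Display engine push buffer channel allocation failed' || true
}
```

For the current boot this could be used as: journalctl -k -b | nv_modeset_errors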
The last working version in my case is 455.45.01 which has several security issues by now and lacks support for any recent kernels and supported Xorg versions. Any update from nvidia on this bug is greatly appreciated.
Can you clarify whether this bug only affects users after suspend, or whether it also covers the “black screen on cold boot” case? Or are these the same bug?
For those who get a black screen on boot in this thread from this regression, none of the currently supported drivers are usable anymore, and they are locked-in to nvidia drivers with security issues and lack of Xorg / kernel support.
It looks like I am also hitting this issue on a desktop PC running Ubuntu 22.04 with NVIDIA driver 510.60.02 installed. Looking in the journal, I see a smattering of the following messages that look related to the above:
Possible Related Log Entries
Apr 08 08:49:23 pkkid-desktop kernel: nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed: 0x65 (Call timed out [NV_ERR_TIMEOUT])
Apr 08 08:49:23 pkkid-desktop kernel: nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
…
Apr 08 08:50:06 pkkid-desktop kernel: x86/cpu: SGX disabled by BIOS.
…
Apr 08 08:50:06 pkkid-desktop kernel: tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080 f80
Apr 08 08:50:06 pkkid-desktop kernel: tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080 f80
…
Apr 08 08:50:09 pkkid-desktop gnome-session-binary[1858]: GLib-GIO-CRITICAL: g_bus_get_sync: assertion ‘error == NULL || *error == NULL’ failed
Apr 08 08:50:09 pkkid-desktop gnome-session-binary[1858]: GLib-GIO-CRITICAL: g_bus_get_sync: assertion ‘error == NULL || *error == NULL’ failed
…
Apr 08 08:50:15 pkkid-desktop gdm-password][3125]: gkr-pam: unable to locate daemon control file
…
Apr 08 08:50:16 pkkid-desktop systemd[3142]: app-gnome-gnome\x2dkeyring\x2dpkcs11-3345.scope: Failed to add PIDs to scope’s control group: No such process
Apr 08 08:50:16 pkkid-desktop systemd[3142]: app-gnome-gnome\x2dkeyring\x2dpkcs11-3345.scope: Failed with result ‘resources’.
Apr 08 08:50:16 pkkid-desktop systemd[3142]: Failed to start Application launched by gnome-session-binary.
…
Apr 08 08:50:18 pkkid-desktop gnome-session-binary[1858]: WARNING: Lost name on bus: org.gnome.SessionManager
Apr 08 08:50:18 pkkid-desktop gnome-session[1858]: gnome-session-binary[1858]: WARNING: Lost name on bus: org.gnome.SessionManager
Apr 08 08:50:18 pkkid-desktop gdm-launch-environment][1651]: pam_unix(gdm-launch-environment:session): session closed for user gdm
Apr 08 08:50:18 pkkid-desktop gdm-launch-environment][1651]: GLib-GObject: g_object_unref: assertion ‘G_IS_OBJECT (object)’ failed
…
Apr 08 08:50:18 pkkid-desktop kernel: [drm:nv_drm_master_set [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to grab modeset ownership
Apr 08 08:50:31 pkkid-desktop kernel: [drm:nv_drm_master_set [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to grab modeset ownership
It looks like I should be trying to set PreserveVideoMemoryAllocations=1, which shows up in /proc/driver/nvidia/params. I imagine editing this file directly is not how that is done. Can someone explain where I put this setting? I can report back if this fixes the blank screen when resuming from suspend.
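For reference, as far as I can tell /proc/driver/nvidia/params only reports the current values. The settings themselves are passed as options to the nvidia kernel module, with an NVreg_ prefix, through a file under /etc/modprobe.d. A minimal sketch follows; the helper name write_nvidia_pm_conf, the file name nvidia-power-management.conf and the /var/tmp location are all assumptions, not something confirmed for this setup.

```shell
# Hypothetical helper: write the module options to a modprobe.d-style file.
# $1 = destination file, $2 = directory for TemporaryFilePath (assumed default: /var/tmp).
write_nvidia_pm_conf() {
    printf 'options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=%s\n' \
        "${2:-/var/tmp}" > "$1"
}
```

In a root shell this would be called as write_nvidia_pm_conf /etc/modprobe.d/nvidia-power-management.conf, followed by sudo update-initramfs -u and a reboot (as noted earlier in the thread, the initramfs had to be updated for the modprobe settings to apply at boot). After that, /proc/driver/nvidia/params should show the new values.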
Can you share any updates on this issue?
I am most interested in the problem already causing black screen on boot, i.e. making the drivers unusable.
Thanks in advance!
To reiterate, the latest version that worked fine was 455.45.01; all later versions produce a black screen right on boot. Since that version is not updated for more recent kernels, I am currently bound to use 390.157.
Any news on the issue / regression is greatly appreciated. Thanks in advance!