4070 / 555 and 560 drivers won’t stay in D3cold, Lenovo Legion Slim 5

Hi,

I’m experiencing issues with a Lenovo Legion Slim 5 16APH8 with 4070 not staying in D3cold after something wakes it up.

The system boots and shortly afterwards the dGPU enters D3cold and stays there, which is expected. However, once anything wakes the dGPU, it never stays in D3cold again. Instead it stays in D0 for 20 seconds, enters D3cold for 1 second, goes back to D0, and repeats this indefinitely.

If I suspend the laptop with “systemctl suspend” and then wake it, the system will stay in D3cold until I run something that again wakes the dGPU, but again will loop 20 seconds in D0 and then 1 second in D3cold indefinitely.

Even if I boot to a non-graphical environment via multi-user.target, I can see the dGPU go to D3cold, and then if I wake it by running “lspci”, it will have the above issue of 20 seconds in D0, followed by 1 second in D3cold and loop like this indefinitely.
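To put numbers on the loop described above, a small polling script can log every change in the dGPU's power state. This is only a minimal sketch: it assumes the `power_state` sysfs attribute exists for the device (it is the same attribute the tray widget later in this thread reads through `/sys/class/drm/cardN/device/power_state`) and that reading it does not itself wake the GPU the way `lspci` does. The function name `log_transitions` and its arguments are made up for illustration.

```shell
#!/bin/sh
# Log every change in a PCI device's power state, with a timestamp.
# Usage: log_transitions FILE INTERVAL_SECONDS SAMPLES
# (hypothetical helper, not part of any NVIDIA or kernel tooling)
log_transitions() {
    lt_file=$1
    lt_interval=$2
    lt_samples=$3
    lt_prev=""
    lt_i=0
    while [ "$lt_i" -lt "$lt_samples" ]; do
        # Current state (e.g. D0, D3hot, D3cold); "unknown" if unreadable.
        lt_cur=$(cat "$lt_file" 2>/dev/null) || lt_cur="unknown"
        if [ "$lt_cur" != "$lt_prev" ]; then
            printf '%s %s\n' "$(date +%T)" "$lt_cur"
            lt_prev=$lt_cur
        fi
        lt_i=$((lt_i + 1))
        sleep "$lt_interval"
    done
}

# Example: sample once a second for two minutes (adjust the PCI address).
# log_transitions /sys/bus/pci/devices/0000:01:00.0/power_state 1 120
```

With the behavior described above, the log should alternate between D0 and D3cold lines roughly 20 seconds and 1 second apart.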

> uname -a
Linux 6.10.3-arch1-2 #1 SMP PREEMPT_DYNAMIC Tue, 06 Aug 2024 07:21:19 +0000 x86_64 GNU/Linux
> /sys/bus/pci/devices/0000:01:00.0/power/control
auto
> /proc/driver/nvidia/gpus/0000\:01\:00.0/power
Runtime D3 status:          Enabled (fine-grained)
Video Memory:               Active

GPU Hardware Support:
 Video Memory Self Refresh: Supported
 Video Memory Off:          Supported

S0ix Power Management:
 Platform Support:          Supported
 Status:                    Disabled
> sudo pacman -Q | grep nvidia
nvidia-beta-dkms 560.31.02-1
nvidia-prime 1.0-5
nvidia-utils-beta 560.31.02-1
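For anyone comparing their own machine against the values above, the kernel-side runtime PM attributes can be collected in one go. This is a rough sketch that assumes the standard PCI sysfs attributes `power/control`, `power/runtime_status`, `power_state` and `d3cold_allowed`; the helper name `gpu_pm_summary` is hypothetical.

```shell
#!/bin/sh
# Print the runtime power-management attributes of one PCI device.
# Usage: gpu_pm_summary /sys/bus/pci/devices/0000:01:00.0
# (hypothetical helper; attributes missing on a given kernel are reported as such)
gpu_pm_summary() {
    for pm_attr in power/control power/runtime_status power_state d3cold_allowed; do
        if [ -r "$1/$pm_attr" ]; then
            printf '%-22s %s\n' "$pm_attr" "$(cat "$1/$pm_attr")"
        else
            printf '%-22s %s\n' "$pm_attr" "(not available)"
        fi
    done
}

# Example (adjust the PCI address):
# gpu_pm_summary /sys/bus/pci/devices/0000:01:00.0
```

For runtime D3 to work, `power/control` is expected to read `auto`, and `power/runtime_status` should read `suspended` while the dGPU is asleep.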

I’m using the latest BIOS: M3CN43WW

It seems like a bug. To rule out a configuration change I had made somewhere, I did a fresh install of Arch and made no changes to configuration files while testing.

It seems like a bug similar to this: NVIDIA GPU Fails to power off (PRIME) Razer Blade 14 2022 - #37 by PabloB

I’ve tried NVreg_DynamicPowerManagementVideoMemoryThreshold=0 and “NVreg_DynamicPowerManagement=0x02”, however these had no impact on the issue.
nvidia-bug-report.log.gz (704.2 KB)

Attached is the NVIDIA bug report.


EDIT: I’ve found a solution!

Just an update on this, the following udev rule causes D3cold to never be entered after boot for both 555.58.02 and 560.31.02.

# Remove NVIDIA Audio devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

The following allow the 555 driver to enter D3cold after boot, but are not required with the 560 driver to enter D3cold:

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"
Now on kernel:
Linux 6.10.4-arch2-1 #1 SMP PREEMPT_DYNAMIC Sun, 11 Aug 2024 16:19:06 +0000 x86_64 GNU/Linux

Solution was to use 555.58.02 with the above two udev rules and also:

options nvidia_drm modeset=1 fbdev=1
options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=0

With NVreg_DynamicPowerManagementVideoMemoryThreshold=0 the dGPU now goes back to D3cold after something that was using the dGPU closes!!!

This has taken me a week to find a solution to, multiple reinstalls, multiple drivers, multiple kernel versions, thousands of combinations for config changes.

I hope this helps someone who is having dGPU D3cold issues on Legion Slim 5 16aph8.

Hey @nathan_tas!

Could you please share more details about the solution? Are those the only two things you changed after the fresh install? I’m trying to reproduce the steps but it does not work for me. I have the same Legion 16APH8 laptop with a 4070.

Did you enable any NVIDIA daemons with systemctl? Have you modified the /etc/mkinitcpio.conf file?

@raven2555
This might help (I have the same problem on a Dell laptop with an RTX 3050 and the latest 560 nvidia-open driver, because the closed one never worked at all) :(
Check https://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/dynamicpowermanagement.html (it will help you figure out the config file locations). The following will also help you use your Intel iGPU for everything and the NVIDIA GPU only when asked (works only with Intel):
I force every environment variable I can to the internal GPU:

export DXVK_FILTER_DEVICE_NAME="Intel"
export VKD3D_FILTER_DEVICE_NAME="Intel"
export MESA_VK_DEVICE_SELECT="8086:a788"
export __GLX_VENDOR_LIBRARY_NAME="mesa"
export __EGL_VENDOR_LIBRARY_FILENAMES="/usr/share/glvnd/egl_vendor.d/50_mesa.json"
export __NV_PRIME_RENDER_OFFLOAD="0"
export __VK_LAYER_NV_optimus="non_NVIDIA_only"
export LIBVA_DRIVER_NAME="iHD"
export VDPAU_DRIVER="va_gl"
export WLR_RENDER_DRM_DEVICE="/dev/dri/renderD128"
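The `8086:a788` value above is the vendor:device pair of my own iGPU, so it needs adapting on other machines. A minimal sketch for finding the pairs on your system, assuming the usual `/sys/class/drm/cardN/device/vendor` and `device` attributes (the helper name `drm_card_ids` is made up):

```shell
#!/bin/sh
# Print "cardN vendor:device" for every DRM card, in the vid:did form
# used in MESA_VK_DEVICE_SELECT above.
# Usage: drm_card_ids [DRM_CLASS_DIR]   (hypothetical helper)
drm_card_ids() {
    for id_card in "${1:-/sys/class/drm}"/card[0-9]*; do
        # Connector entries such as card0-eDP-1 have no vendor file; skip them.
        [ -r "$id_card/device/vendor" ] || continue
        id_v=$(cat "$id_card/device/vendor")
        id_d=$(cat "$id_card/device/device")
        # Strip the leading 0x that sysfs prints.
        printf '%s %s:%s\n' "${id_card##*/}" "${id_v#0x}" "${id_d#0x}"
    done
}

# Example:
# drm_card_ids
```

Vendor 8086 is Intel, 10de is NVIDIA and 1002 is AMD, so pick the pair that belongs to the integrated GPU.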

and then in my own “prime-run” script in /usr/bin/prime-run (add it also to your shell variables):

#!/bin/sh
export GBM_BACKEND="nvidia-drm"
export DXVK_FILTER_DEVICE_NAME="NVIDIA"
export VKD3D_FILTER_DEVICE_NAME="NVIDIA"
export __EGL_VENDOR_LIBRARY_FILENAMES="/usr/share/glvnd/egl_vendor.d/10_nvidia.json"
export VK_ICD_FILENAMES="/usr/share/vulkan/icd.d/nvidia_icd.json"
export VK_LOADER_DRIVERS_SELECT="*nvidia*"
export __NV_PRIME_RENDER_OFFLOAD="1"
export __VK_LAYER_NV_optimus="NVIDIA_only"
export __GLX_VENDOR_LIBRARY_NAME="nvidia"
unset MESA_VK_DEVICE_SELECT
"$@"

I’m still suffering from the same problem; hopefully we can keep this thread alive until there is a response.

@raven2555 switching to
/etc/modprobe.d/nvidia.conf

options nvidia "NVreg_DynamicPowerManagement=0x03"
options nvidia NVreg_OpenRmEnableUnsupportedGpus=0
options nvidia NVreg_EnableGpuFirmware=1
options nvidia_drm modeset=1 fbdev=1
options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=0
options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/var/tmp

For /lib/udev/rules.d/80-nvidia-pm.rules I use the one from the NVIDIA docs (removing the audio device does not actually change anything). Also, following this (Reddit - Dive into anything), you can run ‘sudo rm -f /usr/share/glvnd/egl_vendor.d/10_nvidia.json’ if you use Vulkan (if you need it back, reinstall nvidia-utils).
It should solve the problem for now (I will keep this updated).

Hi @raven2555 ,

From a fresh install of Arch using the latest version of Plasma with Wayland, the following is ALL that is required to get D3cold to function on a 16APH8 with the latest BIOS. NOTE: Sometimes, if I change performance profiles with FN+Q, I find that when I power up the laptop it won’t go into D3cold; a reboot fixes this. Sometimes this also happens from a cold boot with the laptop off; a reboot fixes this as well.

Other advice I have found is incorrect and will result in a broken D3 state where the dGPU will not return to D3cold after it is accessed.

For working D3cold on dGPU, the following is required, functioning on Kernel 6.10.4-arch2-1 with Nvidia 555.58.02 drivers

(I need lib32-nvidia-utils for Steam games; uncomment multilib in /etc/pacman.conf)

sudo pacman -S nvidia-dkms nvidia-utils nvidia-prime lib32-nvidia-utils

/etc/modprobe.d/nvidia.conf

options nvidia_drm modeset=1 fbdev=1
options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=0
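Since these options only take effect once the module is reloaded (in practice, after a reboot), it is worth confirming what the running driver actually picked up. The sketch below assumes the driver lists its active settings in /proc/driver/nvidia/params as `Name: value` lines without the `NVreg_` prefix; the helper name `nv_param` is hypothetical.

```shell
#!/bin/sh
# Print the value of one NVIDIA module parameter as the loaded driver sees it.
# Usage: nv_param NAME [PARAMS_FILE]
# Assumption: PARAMS_FILE contains lines of the form "Name: value".
# Returns non-zero if the parameter is not listed.
nv_param() {
    awk -F': *' -v name="$1" '
        $1 == name { print $2; found = 1 }
        END { exit !found }
    ' "${2:-/proc/driver/nvidia/params}"
}

# Example: should print 0 once the modprobe options above are in effect.
# nv_param DynamicPowerManagementVideoMemoryThreshold
```

If the value printed differs from what is in /etc/modprobe.d/nvidia.conf, another config file is probably overriding it.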

/etc/udev/rules.d/nvidia.rules

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"
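Before and after adding the rules, it can help to see which NVIDIA PCI functions are actually present. The following is a small sketch that walks /sys/bus/pci/devices and labels each vendor 0x10de function using the class codes from the rules above; the function names are made up, and the GPU class patterns (0x0300xx, 0x0302xx) are my own addition.

```shell
#!/bin/sh
# Map a PCI class code (as printed in sysfs) to the names used in the udev rules.
nvidia_class_name() {
    case $1 in
        0x0300*|0x0302*) echo "GPU" ;;
        0x040300) echo "Audio device" ;;
        0x0c0330) echo "USB xHCI Host Controller" ;;
        0x0c8000) echo "USB Type-C UCSI device" ;;
        *) echo "other" ;;
    esac
}

# List every NVIDIA (vendor 0x10de) function under a PCI devices directory.
# Usage: list_nvidia_functions [DIR]   (hypothetical helper)
list_nvidia_functions() {
    for fn_dev in "${1:-/sys/bus/pci/devices}"/*; do
        [ -r "$fn_dev/vendor" ] || continue
        [ "$(cat "$fn_dev/vendor")" = "0x10de" ] || continue
        fn_class=$(cat "$fn_dev/class")
        printf '%s %s %s\n' "${fn_dev##*/}" "$fn_class" "$(nvidia_class_name "$fn_class")"
    done
}

# Example:
# list_nvidia_functions
```

Once the two rules are active, the xHCI and UCSI functions should no longer show up in the list, while the GPU (and the audio device, which is deliberately left alone here) should remain.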

Enable NVIDIA power management

sudo systemctl enable --now nvidia-powerd

Needed for CPU power management:

# CPU power management in KDE.
sudo pacman -S power-profiles-daemon python-gobject
sudo systemctl enable --now power-profiles-daemon

And that’s it. At 55% brightness and 60Hz on battery, I idle at less than 7W when the power profile is set to power save in KDE. While watching a 1080p YouTube video, I’m sitting at roughly 10W.

I use the following in KDE’s power settings to automatically switch between 240Hz and 60Hz when going off and on battery. These commands switch the refresh rate:

kscreen-doctor output.eDP-1.mode.2560x1600@240
kscreen-doctor output.eDP-1.mode.2560x1600@60

Note that sometimes eDP-1 can be different, find the correct output via:

kscreen-doctor -o
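If you would rather drive this from a script than from the KDE settings, the two commands can be wrapped so the mode follows the AC adapter state. This is just a sketch: it assumes the adapter shows up as /sys/class/power_supply/ACAD/online (as it does in my tray widget below), and the helper name `mode_for_ac` is made up.

```shell
#!/bin/sh
# Choose the kscreen-doctor argument for a given AC state (1 = plugged in).
# Usage: mode_for_ac ONLINE [OUTPUT]   (hypothetical helper; OUTPUT defaults to eDP-1)
mode_for_ac() {
    ac_output=${2:-eDP-1}
    if [ "$1" = "1" ]; then
        echo "output.$ac_output.mode.2560x1600@240"
    else
        echo "output.$ac_output.mode.2560x1600@60"
    fi
}

# Example: apply the mode matching the current AC state.
# kscreen-doctor "$(mode_for_ac "$(cat /sys/class/power_supply/ACAD/online)")"
```

Pass the output name as the second argument if `kscreen-doctor -o` shows something other than eDP-1.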

That’s it: fully functioning D3cold with the 555 driver. Any other config changes will break this. I tried so many different things over a week that I lost track of my config file changes. Note that other software such as optimus-manager and TLP breaks this and leaves config modifications lying around.

I wrote some C code (quick and dirty, don’t judge it) to use with the Command Output plasma widget which I have in my tray telling me what is going on with D3 states and also battery discharge/charge.

#include <stdio.h>

/* Print one DRM card's PCI power state (D0, D3hot, D3cold), then a suffix. */
static void print_power_state(const char *path, const char *suffix)
{
    char state[64];
    FILE *fp = fopen(path, "r");
    /* Only read after the open succeeded; bound the read to the buffer size. */
    if (fp != NULL && fscanf(fp, "%63s", state) == 1)
        printf("%s%s", state, suffix);
    else
        printf("n/a%s", suffix);
    if (fp != NULL)
        fclose(fp);
}

int main(void)
{
    float val;

    /* AC adapter: '+' when plugged in, '-' when on battery. */
    FILE *fp = fopen("/sys/class/power_supply/ACAD/online", "r");
    if (fp != NULL && fscanf(fp, "%f", &val) == 1)
        printf("%c", val ? '+' : '-');
    else
        printf("ERROR reading AC adapter ");
    if (fp != NULL)
        fclose(fp);

    /* Battery power draw; power_now is reported in microwatts. */
    fp = fopen("/sys/class/power_supply/BAT1/power_now", "r");
    if (fp != NULL && fscanf(fp, "%f", &val) == 1)
        printf("%.2fW ", val / 1000000);
    else
        printf("ERROR reading battery ");
    if (fp != NULL)
        fclose(fp);

    /* Power state of each GPU, separated by " / ". */
    print_power_state("/sys/class/drm/card0/device/power_state", " / ");
    print_power_state("/sys/class/drm/card1/device/power_state", "\n");
    return 0;
}

gcc d3info.c -o d3info

@meduk0 I’ve seen those solutions on the internet but they didn’t work. Anyway, thanks for trying to look at it.

@nathan_tas I’ll try to do it with clean install in a day or two. On my current system unfortunately it does not work, but I’ve been messing with settings for over 2 weeks now so something can be breaking it.

I can now confirm this works for 560.31.02 from AUR.

pacman -Q | grep nvidia
lib32-nvidia-utils-beta 560.31.02-1
nvidia-beta-dkms 560.31.02-1
nvidia-prime 1.0-5
nvidia-utils-beta 560.31.02-1

Though I’m not convinced 560 is more performant than 555, I notice some frame hitching in a DX9 game with 560 that was buttery smooth with 555.

I’ve managed to clean up all of the leftovers and it works! At least partially.
The GPU no longer keeps cycling between waking up and suspending once an app has woken it. After being woken, the 4070 can sleep properly again. Finally the battery lasts for a reasonable time and the laptop does not heat up excessively. I got ~15W usage while writing this post, not as low as you described but definitely better than 35+.

What I noticed is that after unplugging external monitor the gpu never even tries to go to sleep again - in that case reboot is necessary to allow it again.
Sometimes after boot it’s looped in constant sleep/wake like before but suspending laptop and waking it up fixes this and then gpu can properly enter D3Cold again.

Also, if I suspend the laptop and then close the lid, the laptop wakes up for a bit and then turns off. journalctl reports a stack trace similar to this one: 555.58.02-10 nvidia-open driver crash

Could you check whether the above also happens in your case?

I’m using pretty much same config as you Arch + Plasma. I did the testing on 555-dkms and 560-dkms beta from aur combined with linux-zen kernel.

Hi @raven2555,

With regards to the power consumption, that 15W was about what I was using before I installed the power profile manager:

sudo pacman -S power-profiles-daemon python-gobject
sudo systemctl enable --now power-profiles-daemon

Without installing the above I’d idle around 18W.

kscreen-doctor output.eDP-1.mode.2560x1600@60
kscreen-doctor output.eDP-1.mode.2560x1600@240

I put the above into the When entering “On Battery” state and When exiting “On Battery” state.

Make sure you run kscreen-doctor -o to find out if eDP-1 is your laptop screen, otherwise it won’t switch refresh rate.

These are my settings in KDE for when on battery:

I have the 240Hz panel, so I switch from 240Hz down to 60Hz, this saves some power.

My 240Hz panel is 500nits, so it’s plenty bright at 55% brightness so I switch to that automatically as well.

Under “Switch to power profile” I have Power Save, this by far reduces the most power consumption. - You need power-profiles-daemon above installed to have this ability.

Just tested, and yes, I see both of those. As you probably already know, an external monitor on HDMI connects directly to the NVIDIA GPU on our machines. So it looks like a bug, as the GPU doesn’t even try to go back into D3cold and just stays in D0.

Same issue when I suspend and then close the lid: I can hear the GPU power up, and the machine looks like it has done a full EC reset, because after hitting the power button the keyboard backlight is on and the performance profile is set to “Balanced” instead of “Quiet”. Another bug.


Yeah. I got those settings as well as the same screen 500nit 240hz. Right after boot with desktop only it idles at around ~10W. CPU I got is R7 7840HS.

Might be differences in wireless cards, bluetooth or maybe a different revision in LCD panel. Are you using the latest BIOS M3CN43WW? There is a CPU performance change between 42 and 43.

Yeah, upgraded to newest bios, but still the same.

Should the D3cold state be listed in the supported modes? I’ve got the following from dmesg:

└─[0]»»»» sudo dmesg | grep "0000:01:00.0"
[    0.650438] pci 0000:01:00.0: [10de:249d] type 00 class 0x030000 PCIe Legacy Endpoint
[    0.650445] pci 0000:01:00.0: BAR 0 [mem 0x81000000-0x81ffffff]
[    0.650452] pci 0000:01:00.0: BAR 1 [mem 0x6000000000-0x61ffffffff 64bit pref]
[    0.650458] pci 0000:01:00.0: BAR 3 [mem 0x6200000000-0x6201ffffff 64bit pref]
[    0.650462] pci 0000:01:00.0: BAR 5 [io  0x4000-0x407f]
[    0.650465] pci 0000:01:00.0: ROM [mem 0x82000000-0x8207ffff pref]
[    0.650513] pci 0000:01:00.0: PME# supported from D0 D3hot
[    0.697413] pci 0000:01:00.0: vgaarb: bridge control possible
[    0.697413] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.718974] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
[   16.939957] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
[   16.940113] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[   18.898993] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[   18.899865] nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes
