PCI-Express runtime D3 power management broken by commit 4d03e3cc59828(?)

Commit 4d03e3cc59828c82ee89ea6e27a2f3cdf95aaadf (“fs: don’t allow kernel reads and writes without iter ops”) changed the semantics of kernel_read(). As far as I can tell, the NVIDIA driver relies on reading the PCI device’s configuration to suspend the device, probably in nv_indicate_idle(). With the new kernel this read fails, which results in the following kernel warning:

kernel read not supported for file pci0000:00/0000:00:01.0/0000:01:00.0/config (pid: ... comm: ...)

As a result, the device is not suspended. As a workaround, running cat "/sys/bus/pci/devices/0000:01:00.0/config" > /dev/null seems to have the desired effect on the device’s state, and the device is suspended after that. (Alternatively, dd if="/sys/bus/pci/devices/0000:01:00.0/config" bs=1 count=1 of=/dev/null mimics the behaviour of the NVIDIA driver more closely.)
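To see whether the read had any effect, the kernel's runtime PM state for the device can be inspected through the power/runtime_status attribute in sysfs. Below is a minimal sketch, assuming the GPU sits at 0000:01:00.0 as above. The function name poke_and_report and its optional sysfs-root argument are made up for illustration.

```shell
#!/bin/bash
# poke_and_report ADDR [SYSFS_ROOT]
# Read one byte of the device's PCI config space (the same read as the dd
# workaround above), then print the kernel's runtime PM status for it.
# The function name and the SYSFS_ROOT parameter are illustrative assumptions.
poke_and_report() {
    local addr=$1
    local root=${2:-/sys/bus/pci/devices}
    local dev="$root/$addr"

    # One-byte read of the config file; fail if the device does not exist.
    dd if="$dev/config" bs=1 count=1 of=/dev/null 2>/dev/null || return 1

    # Typical values are "active" and "suspended".
    cat "$dev/power/runtime_status"
}
```

For example, poke_and_report 0000:01:00.0 prints the current status. The state may take a few seconds to change after the read, so it is probably worth checking more than once.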


Kernel version: 5.10.4-1-MANJARO
NVIDIA driver version: 455.45.01

When are you supposed to actually run that cat statement? Every time you suspend? During boot?

I believe whenever you see the warning in the kernel message buffer, because I think that’s when the driver wants to trigger suspension of the GPU. That attempt will fail, so we can run cat from userspace to achieve basically the same effect. This is my understanding only; I didn’t test it extensively.

By the way, another thing I noticed is nvidia-powerd (from 460.27.04). Curiously, it contains the following string: %s/%04x:%02x:%02x.%1u/config, which I’m pretty sure is exactly the path of the PCI device’s configuration in sysfs. So I think it’s not an outlandish assumption that nvidia-powerd is what the NVIDIA developers will provide as the solution to this kernel change.


It also has other interesting strings:

Error reading msr
modprobe msr
Error opening msr
/sys/bus/pci/devices
%s/%04x:%02x:%02x.%1u/config
%04x:%02x:%02x.%1u
/sys/bus/pci/rescan
%s/%04x:%02x:%02x.%1u/..
/sys/bus/pci/devices/%04x:%02x:%02x.%1u/rescan
/sys/devices/system/memory/block_size_bytes
/sys/devices/system/node/node%d/meminfo
/proc/modules
PATH=/sbin
/sys/devices/soc0/family
/proc/sys/kernel/modprobe
/proc/driver/nvidia/params
/proc/devices
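To illustrate why the %s/%04x:%02x:%02x.%1u/config string looks like the sysfs config path, here is a small sketch that expands the same format with a PCI domain, bus, device and function. The function name pci_config_path and its argument order are assumptions made for this example; nothing here is taken from nvidia-powerd itself.

```shell
#!/bin/bash
# pci_config_path DOMAIN BUS DEVICE FUNCTION [BASE]
# Expand the format string found in the binary into a sysfs path.
# Numeric arguments may be decimal or 0x-prefixed hex.
# The function name and argument order are illustrative assumptions.
pci_config_path() {
    local base=${5:-/sys/bus/pci/devices}
    printf '%s/%04x:%02x:%02x.%1u/config\n' "$base" "$1" "$2" "$3" "$4"
}
```

Calling pci_config_path 0 1 0 0 prints /sys/bus/pci/devices/0000:01:00.0/config, which is the same file read in the cat/dd workaround.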

Hm, I’ll do a quick test to see whether either of those (cat or powerd) will revive my D3 state.

Do you know how to invoke nvidia-powerd? If you just run it, it simply exits…

des. 31 15:29:35 latentcall nvidia-powerd[1479]: nvidia-powerd version:1.0(build 1)
des. 31 15:29:35 latentcall nvidia-powerd[1479]: SBIOS support not found for NVPCF
des. 31 15:29:35 latentcall nvidia-powerd[1479]: No matching GPU found
des. 31 15:29:35 latentcall nvidia-powerd[1479]: Failed to initialize RM Client

I have no idea; it doesn’t even have a --help. I did not run it.

I ended up reverting to 5.9.16. It’s been a while since I last looked at this power-saving feature, though. Do you know what this output means? :P This is on my Dell XPS 15 9570 with the newest BIOS (1.17.1), booted with acpi_osi="!Windows 2017". (I picked that OSI string for removal after a bit of trial and error; e.g. acpi_osi=! acpi_osi="Windows 2015", or 2013 or 2017, all gave the status “Not supported” for every “?” here.)

# cat /proc/driver/nvidia/gpus/0000:01:00.0/power           
Runtime D3 status:          ?
Video Memory:               ?

GPU Hardware Support:
Video Memory Self Refresh: ?
Video Memory Off:          ?

With this config, prime render offload does not seem to work though. Bah…

Is there some test I can do to check whether D3 works or not?
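One rough way to check, at least as far as the driver's own view goes, is to pull the “Runtime D3 status” line out of the power file shown in this thread. The sketch below assumes the file layout from the outputs quoted here; the function name d3_status is made up for illustration.

```shell
#!/bin/bash
# d3_status [POWER_FILE]
# Print the value of the "Runtime D3 status" line from the driver's
# power file. The default path is the one used in this thread; the
# function name is an illustrative assumption.
d3_status() {
    local file=${1:-/proc/driver/nvidia/gpus/0000:01:00.0/power}
    # Strip the label and the padding after it, keep only the value.
    sed -n 's/^Runtime D3 status:[[:space:]]*//p' "$file"
}
```

This only reports what the driver says it supports. Whether the device is actually suspended at a given moment is a separate question, which the kernel exposes via power/runtime_status under /sys/bus/pci/devices/0000:01:00.0/.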

I can only get PRIME Render offload working when booting with acpi_osi="!Windows 2013". With that booted, the output looks like this:

# cat /proc/driver/nvidia/gpus/0000:01:00.0/power           
Runtime D3 status:          Not supported
Video Memory:               Active

GPU Hardware Support:
Video Memory Self Refresh: Not Supported
Video Memory Off:          Not Supported

Isn’t the Dell XPS 15 9570 supposed to support D3? I’m in deep water here…

My configuration:

$ cat /proc/driver/nvidia/gpus/0000\:01\:00.0/power 
Runtime D3 status:          Enabled (fine-grained)
Video Memory:               Active

GPU Hardware Support:
 Video Memory Self Refresh: Supported
 Video Memory Off:          Supported
$ cat /etc/modprobe.d/mhwd-gpu.conf 
[...]
options nvidia "NVreg_DynamicPowerManagement=0x02"
$ cat /etc/udev/rules.d/90-mhwd-prime-powermanagement.rules
[...]
# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"
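To confirm that rules like these took effect, the power/control attribute of each NVIDIA device can be read back from sysfs. This is only a sketch: the function name list_nvidia_pm and the optional sysfs-root argument are assumptions for illustration, and 0x10de is the vendor ID used in the rules above.

```shell
#!/bin/bash
# list_nvidia_pm [SYSFS_ROOT]
# For every PCI device whose vendor ID is 0x10de (NVIDIA), print its
# address and the current power/control setting ("auto" or "on").
# The function name and SYSFS_ROOT parameter are illustrative assumptions.
list_nvidia_pm() {
    local root=${1:-/sys/bus/pci/devices}
    local dev
    for dev in "$root"/*; do
        # Skip entries without a readable vendor file (or an empty glob).
        [ -r "$dev/vendor" ] || continue
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/power/control")"
    done
}
```

With the bind rules applied, running list_nvidia_pm should show “auto” next to the GPU's address while the driver is bound.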

Have you tried booting without specifying acpi_osi? I think if it works without that, you should definitely avoid specifying acpi_osi.

Thank you so much for that udev config, that was a much better solution than my script in /etc/local.d!

After a bit more research (according to this blog post: https://manukyan.dev/posts/2019-11-23-nvidia-optimus-on-linux/), it seems the D3 power state for NVIDIA Optimus graphics is only supported from Turing onwards. My Dell XPS 15 9570 has a Pascal-based card (GTX 1050 Ti)…

How can I actually turn off the card with this setup? On my previous XPSes, like the XPS 9550, you could use bbswitch, but I don’t think that works anymore (does not use ACPI calls anymore).

And you’re right, it doesn’t seem like I need to use any acpi_osi anymore! It was necessary some years ago (for getting PRIME to work and such), but I never bothered to test removing it after getting new BIOSes…

I think you can use bbswitch. It’s worth a try in any case. I suggest optimus-manager, which can handle all this automatically after you configured it.

You were right! I swear bbswitch did not work last I tried (which must be like a year ago or something), guessing it was because of that acpi_osi debacle…

So now I use the same method as I did on my late XPS 9550: a script that switches configs between using provideroutputsource with PRIME sync when I want that, and loading bbswitch and running only the Intel card.

Thanks for the help, again! Shame that it doesn’t support D3, but oh well, the next laptop will support it I guess…

So, no official word from Nvidia yet … stuck with 5.9 for now

Can we get some attention here? 5.10 is LTS. I have an RTX 5000 and cannot use a newer kernel, and 5.9 is EOL, so my options are 5.4 and missing all the BTRFS fixes, or 5.10 and having no power management at all… Could someone at least post “we are working on it”?

As far as you know, how likely is it that the upcoming NVIDIA 460 driver will fix this issue? Or will we need to wait longer?

@phusho In the meantime I use this workaround. It is based on bumblebee and thus has slightly worse performance than the official PRIME offloading, but it’s still something.

It is not fixed in 460.32.03.

Ok, I guess we’ll have to wait longer then. Thanks anyway

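For now this workaround works; you can customize it with your bus address:

#!/bin/bash
# Follow the journal; whenever the driver's config read is rejected,
# read one byte of the GPU's PCI config space from userspace instead.
# Change the path and bus address below to match your GPU.

journalctl -f | \
while read -r line ; do
    if echo "$line" | grep "kernel read not supported for file pci0000:00/0000:00:01.0/0000:01:00.0/config"
    then
        dd if="/sys/bus/pci/devices/0000:01:00.0/config" bs=1 count=1 of=/dev/null
    fi
done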

I’ve just upgraded my Manjaro installation, and I can confirm that with kernel 5.10.7-3 and NVIDIA driver 460.32.03 the power management is still not working, unfortunately. Since I do not use BTRFS I decided to downgrade to kernel 5.4 until the bug gets fixed.