However, the GPU appears to always be on. nvidia-smi reports this:
$ nvidia-smi
Fri Oct 11 13:35:12 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   41C    P8     4W /  N/A |     16MiB /  5934MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1201      G   /usr/lib/xorg/Xorg                            14MiB |
+-----------------------------------------------------------------------------+
and system power management reports this:
$ cat /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_status /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_suspended_time
active
0
Is this because the CPU and chipset do not support the needed ACPI power management features? Is this a misconfiguration on my part, or is it a driver problem?
This is the only thing that prevents me from using Linux 100% of the time on this machine, so any help will be much appreciated.
nvidia-bug-report.log.gz (664 KB)
Just as a note, you can’t really use nvidia-smi for runtime PM detection, since running it will wake up the GPU.
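To watch the state without waking the card, you can poll the sysfs files instead (a simple sketch; the interval is arbitrary):

# reading the sysfs attribute does not resume the device
watch -n 5 cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status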
The setup seems to be correctly done, so when
cat /sys/bus/pci/devices/0000:01:00.0/power/control
returns “auto” and
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
still returns “active”, then this doesn’t seem to work right now in conjunction with an AMD platform. That doesn’t necessarily mean the platform doesn’t support it; I’d rather guess that the driver doesn’t expect this combo right now.
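If you want to test the generic PCI runtime-PM interface by hand (nothing NVIDIA-specific here, just the same bus ID as above), toggling control and re-checking the state looks like this:

# "on" keeps the device awake, "auto" allows runtime suspend
echo on | sudo tee /sys/bus/pci/devices/0000:01:00.0/power/control
echo auto | sudo tee /sys/bus/pci/devices/0000:01:00.0/power/control
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status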
Another observation: when I changed the udev config so that runtime power management is enabled as soon as the device is added (a sketch of the rule is below), the GPU actually did switch to a suspended state for some time:
$ cat /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_status /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_suspended_time
active
4372
As you can see, the suspended time is actually more than 0, but the GPU does not suspend at any other time.
My guess is that it suspends only until the X driver loads; after that it is kept active.
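For reference, a udev rule along these lines enables runtime PM as soon as the PCI device is added (the file name is arbitrary and the match criteria are my assumption; 0x10de is NVIDIA’s PCI vendor ID and 0x030000 the VGA controller class):

# /etc/udev/rules.d/80-nvidia-pm.rules
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"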
With the debug level set to info, there’s a log flood right now. What would be interesting is information about the PR3 method, probably just at driver load time. Is it possible to disable X and, right after boot, run
sudo dmesg | grep PR3
to check whether there are any messages about it?
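One way to do that on a systemd-based distro (assuming graphical.target is the current default):

# boot into a text console without X, then restore the default afterwards
sudo systemctl set-default multi-user.target
sudo reboot
# ...after logging in on the console:
sudo dmesg | grep PR3
sudo systemctl set-default graphical.target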
I see this issue is unresolved, so I’m sorry if bumping this annoys you in some way.
I have an Acer Nitro 5 AN515-43 with a Ryzen 3550H CPU and a GTX 1650 GPU and the same problem, except that trying to use the PRIME render offloading feature on a game while the “NVreg_DynamicPowerManagement=0x02” option is set in /etc/modprobe.d/nvidia.conf seems to crash the GPU.
There are no problems with PRIME render offloading whatsoever when I have the “NVreg_DynamicPowerManagement=0x01” option in /etc/modprobe.d/nvidia.conf, but the GPU seems to be active almost all the time with this option:
$ cat /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_status /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_suspended_time
active
1224
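For completeness, the option in question is set with a single line (a sketch of a typical /etc/modprobe.d/nvidia.conf; depending on the distro, the initramfs may need to be rebuilt for it to take effect at boot):

# /etc/modprobe.d/nvidia.conf
# 0x01 = coarse-grained power control, 0x02 = fine-grained (Turing and newer)
options nvidia NVreg_DynamicPowerManagement=0x02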