A simple but really annoying problem: I am trying to figure out how to properly configure "PCI-Express Runtime D3 (RTD3) Power Management". The NVIDIA dGPU draws 11 W at idle without any process running on it, which brings the overall power usage of my laptop to just under 22 W, drains the battery very fast, and produces more heat, since the dGPU never powers off. The open-source nouveau driver shuts the dGPU down properly, so overall power usage is a lot lower than with the official driver, because the dGPU is actually unpowered when it is not in use.
So I did some research on this problem and only found this: download.nvidia.com/XFree86/Linux-x86_64/460.27.04/README/dynamicpowermanagement.html
I tried setting NVreg_DynamicPowerManagement to 0x01 and 0x02, but it didn’t work: the dGPU stays powered even with no processes running on it. (Yes, I regenerated the initramfs to apply the settings in /etc/modprobe.d/nvidia.conf.)
This is the current output of nvidia-smi:
Fri Mar 17 14:29:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8    11W /  55W |      0MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
This is the current output of /sys/bus/pci/devices/0000\:01\:00.0/power/control:
cat /sys/bus/pci/devices/0000\:01\:00.0/power/control
on
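For context, a value of `on` in `power/control` means the kernel keeps the device powered and will not runtime-suspend it, while `auto` lets runtime power management suspend it when idle. A minimal sketch of a check for this, assuming the PCI address 0000:01:00.0 from the output above; the function name `runtime_pm_allowed` and its optional sysfs-root argument are made up for illustration:

```shell
# Hypothetical helper: succeed (exit 0) only if the kernel is allowed to
# runtime-suspend the device, i.e. power/control contains "auto".
# $1: PCI address (defaults to the one used in this thread)
# $2: sysfs root (defaults to /sys; only there so the check can be tried elsewhere)
runtime_pm_allowed() {
    dev="${1:-0000:01:00.0}"
    root="${2:-/sys}"
    file="$root/bus/pci/devices/$dev/power/control"
    # Fails if the file is unreadable or holds anything other than "auto"
    [ -r "$file" ] && [ "$(cat "$file")" = "auto" ]
}
```

Usage would look like `runtime_pm_allowed && echo "runtime PM allowed" || echo "device pinned on"`.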
Any clues? I am on mainline Linux kernel 6.2.6, using the NVIDIA open kernel modules. Thanks in advance.
nvidia-bug-report.log.gz (1.1 MB)
The open kernel modules are alpha quality on GeForce GPUs and lack any power management features. You also won’t be able to suspend/resume (unless this was added in recent versions). Please use the standard driver.
Thanks, but that creates another problem: I can’t use the fully closed-source ones (nvidia or the dkms ones) because boot hangs with traps: Missing ENDBR: _nv012309rm+0x0/0x10 [nvidia] That’s why I am using the nvidia open ones. (I can’t run nvidia-bug-report.sh, since I can’t even switch TTYs because of that error.)
Yeah, I was just writing that; it is fixed. Now I am using the fully closed-source ones, but the dGPU is still powered on when no process is using it, even in a TTY without any Xorg or Wayland desktop environment running. Any clues?
Don’t use nvidia-smi to check runtime suspend: it wakes up the GPU (hence the small delay). Instead, running
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
should return “suspended”.
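The check above can be wrapped in a small helper that only reads sysfs, so it should not wake the GPU the way nvidia-smi does. This is a minimal sketch, assuming the PCI address 0000:01:00.0 from this thread; the function name `gpu_runtime_status` and the optional sysfs-root argument are hypothetical:

```shell
# Hypothetical helper: print the runtime PM state of a PCI device
# ("suspended", "active", ...) by reading sysfs only.
# $1: PCI address (defaults to the one used in this thread)
# $2: sysfs root (defaults to /sys; only there so the helper can be tried elsewhere)
gpu_runtime_status() {
    dev="${1:-0000:01:00.0}"
    root="${2:-/sys}"
    file="$root/bus/pci/devices/$dev/power/runtime_status"
    if [ -r "$file" ]; then
        cat "$file"
    else
        # Device not present or file unreadable
        echo "unknown"
        return 1
    fi
}
```

Running `gpu_runtime_status` a few seconds after the last GPU use, or in a loop with `sleep`, would show whether the state settles on `suspended`.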
Also, the options shouldn’t be set by hand. The driver configures them automatically, and they should only be changed if there are issues.
How do I do that? I have never messed with the initrd before.
And /sys/bus/pci/devices/0000:01:00.0/power/runtime_status now reports suspended; when I run nvidia-smi it switches to active, and after a few seconds it changes back to suspended, as intended. My laptop’s power draw is now a lot better (about 10 W). Suspending (S3 and suspend-to-idle, closing the laptop lid) works well too, which was also an issue before.
Since I am using Arch Linux, I added all the nvidia modules to the MODULES section, which is supposed “to be loaded before any boot hooks are run”, so that they are loaded at early boot. Am I right?
I noticed recently that RTD3 automatically suspending the GPU makes my programs open with a noticeable delay, because the dGPU gets woken up every time. Any fixes?
When the dGPU is active, all programs open instantly, but when it is suspended, every program I open is delayed because it waits for the dGPU to become active (for some reason, even if the dGPU will not be used). This did not happen with nouveau either.
nvidia-drm also needs to be in the list of modules to be added to the initrd.
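A quick way to see whether a module is missing from the list is to parse the MODULES line of the mkinitcpio config. This is a rough sketch under assumptions: the config path /etc/mkinitcpio.conf and the set of four module names (nvidia, nvidia_modeset, nvidia_uvm, nvidia_drm) follow the usual Arch Linux convention rather than anything confirmed in this thread, and the function name is made up:

```shell
# Hypothetical helper: print every expected NVIDIA module that is NOT
# listed in the MODULES=(...) line of a mkinitcpio config file.
# $1: config file (defaults to /etc/mkinitcpio.conf, an assumption)
missing_nvidia_modules() {
    conf="${1:-/etc/mkinitcpio.conf}"
    # Take the last uncommented MODULES= assignment in the file
    line=$(grep -E '^[[:space:]]*MODULES=' "$conf" | tail -n 1)
    # Drop "MODULES=", turn parentheses into spaces, treat '-' like '_'
    listed=$(printf '%s' "$line" | sed 's/^[^=]*=//' | tr '()' '  ' | tr '-' '_')
    for mod in nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
        case " $listed " in
            *" $mod "*) ;;        # module is listed
            *) echo "$mod" ;;     # module is missing
        esac
    done
}
```

If it prints anything, the listed modules would need to be added to MODULES and the initramfs regenerated (on Arch, typically with `mkinitcpio -P`).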
No idea, though, why every application, even the file manager, is waking up the NVIDIA GPU. That’s definitely not normal behaviour.
Thanks, at least the main problem is solved. I will add nvidia-drm now too. I will ask about programs waking up the dGPU in another thread or on other forums. Thanks for your help!