[GTX 1070][driver 415.25] Performance mode (p0) does not automatically downshift to idle (p8) after...

(all status monitored with ‘watch -n 5 nvidia-smi’ output)

The system runs a bash script called ‘nv-idle’ from the login manager (lightdm) to set the baseline config, consisting of:

nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[3]=0"
nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[3]=0"
nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=0"
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=20"

In effect: baseline clock speed, adaptive clock mode, 20% manual fan.

Upon resuming from an S3 suspend state, p0 is selected and the card will not downshift automatically into lower modes. Manually calling the ‘nv-idle’ script steps the performance mode down in increments, p0 (33 W) -> p0 (28 W) -> p2 (27 W) -> p5 (8 W) -> p8 (6 W). This usually takes 4-6 calls, because the card stays at each step until the script is called again. Each displayed reduction in power / performance mode is accompanied by a corresponding temperature change. Running any new gpu process (chrome/firefox) returns the card to p0 and the process starts over.
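For anyone who wants to automate the repeated manual calls described above, here is a minimal sketch. It assumes the ‘nv-idle’ script is on PATH (override it with the hypothetical NV_IDLE_CMD variable) and that GPU 0 is the affected card; the retry count and delay are arbitrary defaults, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: re-run nv-idle until nvidia-smi reports P8, or give up.
# Assumptions: nv-idle is on PATH; NV_IDLE_CMD, the retry limit and the
# delay are illustrative and not part of the original report.

# Print the current performance state of GPU 0, e.g. "P8".
current_pstate() {
    nvidia-smi --id=0 --query-gpu=pstate --format=csv,noheader
}

# Succeed only when the given state string is the idle state P8.
is_idle() {
    [ "$1" = "P8" ]
}

# step_down [max_calls] [delay_seconds]
# Calls nv-idle at most max_calls times, checking the state before each call.
step_down() {
    max_tries=${1:-10}
    delay=${2:-5}
    calls=0
    while :; do
        state=$(current_pstate)
        if is_idle "$state"; then
            echo "reached $state after $calls call(s)"
            return 0
        fi
        if [ "$calls" -ge "$max_tries" ]; then
            break
        fi
        "${NV_IDLE_CMD:-nv-idle}"
        calls=$((calls + 1))
        sleep "$delay"
    done
    echo "still at $state after $calls call(s)" >&2
    return 1
}
```

After sourcing the file, `step_down 6 5` would try up to six calls, five seconds apart. This only works around the symptom; it does not stop the card from jumping back to p0 when a new gpu process starts.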

After a reboot, p0 is selected when an application is launched, and the card then steps down on its own, settling at p8 without any intervention.

Simplified steps:

  1. Reboot system.
  2. Set gpu settings with above ‘nv-idle’
  3. Open new terminal, and observe gpu status with ‘watch -n 5 nvidia-smi’
  4. Launch new application utilizing gpu (chrome/firefox).
  5. Observe p0 performance state now set.
  6. Close application. (chrome/firefox)
  7. Observe performance state returns to p8 over the next 30s or so.
  8. Put system into s3 sleep and then resume a minute later.
  9. Repeat steps 2-6 and observe the performance state remains at p0 indefinitely.
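To capture the steps above in a form that can be attached to a report, the performance state can be logged to a file instead of watched by eye. This is only a sketch: the function names and the default log file name are made up here, and the filter assumes the comma-separated layout that nvidia-smi prints for `--format=csv`.

```shell
#!/bin/sh
# Sketch: log GPU state to CSV and filter out the samples that are not idle.
# Assumptions: function names and the default file name are illustrative.

# log_gpu_state [interval_seconds] [logfile]
# Appends one CSV row per interval until interrupted with Ctrl-C.
log_gpu_state() {
    interval=${1:-5}
    logfile=${2:-gpu-state.csv}
    nvidia-smi --query-gpu=timestamp,pstate,power.draw,temperature.gpu \
        --format=csv --loop="$interval" >> "$logfile"
}

# not_idle <logfile>
# Prints every data row whose pstate column (second field) is not P8.
not_idle() {
    awk -F',' 'NR > 1 { s = $2; gsub(/ /, "", s); if (s != "P8") print $0 }' "$1"
}
```

Running `log_gpu_state 5 before-suspend.csv` during steps 3-7 and again with a different file name after the resume gives two logs to compare; `not_idle` then shows how long the card stayed out of p8 in each.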

Basics:
mobo: Asus x370 ROG strix
cpu: Ryzen 1700
gpu: GTX 1070
gpu driver: 415.25
kernel: 4.20.0
Xorg : 1.20.3
distro: Manjaro 18.0.2 (cinnamon)

nvidia-bug-report (minus dump): https://pastebin.com/1ptQMW9F

GPUPowerMizerMode=0 means no adaptive clocking since it disables the powermizer.

This is incorrect. This control has no disable state, only auto (2) / adaptive (0) / performance (1).
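To check which of these values is currently active, something like the following should work. It is a sketch only: the helper name is made up, and the mapping is just the three values listed in this post.

```shell
#!/bin/sh
# Sketch: translate a GPUPowerMizerMode value into the names used in this thread.
# Assumption: powermizer_mode_name is an illustrative helper, not a driver tool.
powermizer_mode_name() {
    case "$1" in
        0) echo "adaptive" ;;
        1) echo "performance" ;;
        2) echo "auto" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Usage (needs a running X session with the NVIDIA driver loaded):
#   powermizer_mode_name "$(nvidia-settings -q '[gpu:0]/GPUPowerMizerMode' -t)"
```

The `-t` flag asks nvidia-settings for terse output, so only the numeric value is passed to the helper.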

Sorry, I confused it with PowerMizerEnable.

same problem with a gtx 1080 on arch running kernel 4.20.0 / nvidia 415.25. resuming from s3 keeps the card at performance level 3, only a reboot gets the card back to switching between level 0-3.

so after every suspend/resume the fans keep spinning up and down every few seconds…

mobo: asus strix x470-i gaming
cpu: ryzen 2700
gpu: gtx 1080
gpu driver: 415.25
kernel: 4.20.0
Xorg : 1.20.3
distro: arch with cinnamon

Similar problem [1], but with hibernate to disk. The nv-idle script or the update to Kernel 4.19 and driver version 415.25 sometimes allows me to hit low power state P8 again after a resume. But opening a video player or similar results in it being stuck in P0 again and not coming down at all on its own.

1: https://devtalk.nvidia.com/default/topic/1044143/linux/stuck-in-max-performance-mode-after-hibernation-perf-state-p0-level-4-410-73-rtx-2070-ke-/post/5297121/

Hi All,

This has been fixed in the latest driver release, 415.27; please install it. NVIDIA drivers are available at the URL below.

https://www.nvidia.com/object/unix.html

What about https://devtalk.nvidia.com/default/topic/1002912/linux/very-slow-ramp-down-from-high-to-low-clock-speeds-leading-to-a-significantly-increased-power-consumption/ ?

Hi Artem,

This is still open and the team is investigating. Will keep you posted on the same.

Thank you!

This does indeed appear to be fixed in 415.27, thank you. The ramp down is slightly slower than 415.25 from what I’ve seen so far but it does not suffer the original issue.