Increased idle consumption with driver 570

I found a fix for my issue.
The official release notes mention:

Enabled the nvidia-drm fbdev=1 option by default. When supported by the kernel and the nvidia-drm modeset=1 option is enabled, nvidia-drm will replace the system’s framebuffer console with one driven by DRM.
This feature can be disabled by setting fbdev=0.

I don’t have any monitor connected to the GPU, so there shouldn’t be any need for this. I added options nvidia-drm fbdev=0 in /etc/modprobe.d/nvidia.conf, and the power consumption went down from 18W to 7W on the next boot.
That also seems to have fixed some flickering I had started to notice during the boot sequence.
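
To confirm that the option actually took effect after a reboot, one approach is to read the value the loaded module reports. The sketch below assumes the nvidia-drm parameters are exposed under /sys/module/nvidia_drm/parameters (the usual sysfs location for module parameters, but not verified for every driver build); the helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: the loaded nvidia-drm module exposes its parameters here.
PARAM_DIR = Path("/sys/module/nvidia_drm/parameters")


def read_module_param(name, param_dir=PARAM_DIR):
    """Return the raw value of a module parameter, or None if it is not readable."""
    path = Path(param_dir) / name
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not exposed, or no permission.
        return None


def is_enabled(value):
    """Interpret the kernel's boolean formatting (Y/N or 1/0); None stays None."""
    if value is None:
        return None
    return value.upper() in ("Y", "1")
```

With fbdev=0 applied, `is_enabled(read_module_param("fbdev"))` would be expected to return `False`; a result of `None` suggests the parameter could not be read at all.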

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     Off |   00000000:01:00.0 Off |                  N/A |
|  0%   41C    P8              7W /  200W |       2MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
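
A single nvidia-smi snapshot can be misleading, so it may help to average several readings when comparing driver versions. This is a rough sketch using nvidia-smi's `--query-gpu=power.draw` CSV output; the function names, sample count, and interval are arbitrary choices for illustration, and only the first GPU's reading is used.

```python
import subprocess
import time

QUERY = ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"]


def parse_power_draw(output):
    """Parse one reading (in watts) per line; skip non-numeric values such as "[N/A]"."""
    readings = []
    for line in output.splitlines():
        try:
            readings.append(float(line.strip()))
        except ValueError:
            continue
    return readings


def average_idle_power(samples=10, interval_s=1.0):
    """Average the first GPU's power draw over several samples; None if no readings."""
    values = []
    for _ in range(samples):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        readings = parse_power_draw(out)
        if readings:
            values.append(readings[0])
        time.sleep(interval_s)
    return sum(values) / len(values) if values else None
```

Note that polling nvidia-smi can itself wake the GPU, so a long interval is probably a better fit for idle measurements than a tight loop.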

@damob7890 and potentially @morgwai666 pretty sure this option would work for you too since you have an otherwise completely empty workload on your GPUs like I do.

Thanks for the hint!
I’ve just verified that in the case of the Debian DC drivers that I use, these options are set to 1 by default in 565’s modprobe config (see /etc/modprobe.d/nvidia-modeset.conf in https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/nvidia-kernel-support_565.57.01-1_amd64.deb ), and yet 565 is still (somewhat) better at saving power when idle, so there has to be something else at work in my case. Anyway, it’s not a huge issue for me (I use my Nvidia card as an eGPU, so I can just disconnect it and turn it off when not using it).

At idle, my headless Turing card burns double the watts on 570 vs 565. Same modprobe config:
options nvidia-drm modeset=1

Neither:
options nvidia NVreg_DynamicPowerManagement=0x02
nor
options nvidia-drm fbdev=0

makes any difference.
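
Since packaged config files can set the same options as a hand-written one, it can be worth checking which values are actually configured. Below is a minimal sketch that collects `options` lines from modprobe config text; it assumes the last occurrence of a parameter wins and only looks at /etc/modprobe.d (modprobe also reads other directories), so `modprobe --showconfig` remains the more reliable reference. The function names are hypothetical.

```python
import glob


def parse_options(text):
    """Collect `options <module> key=value ...` lines into {module: {key: value}}."""
    result = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (simplified handling)
        parts = line.split()
        if len(parts) < 3 or parts[0] != "options":
            continue
        # modprobe treats '-' and '_' in module names as interchangeable.
        module = parts[1].replace("-", "_")
        params = result.setdefault(module, {})
        for item in parts[2:]:
            key, sep, value = item.partition("=")
            if sep:
                params[key] = value  # assumption: later settings override earlier ones
    return result


def load_configured_options(pattern="/etc/modprobe.d/*.conf"):
    """Parse all matching files in sorted order as one combined config."""
    chunks = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            chunks.append(handle.read())
    return parse_options("\n".join(chunks))
```

For example, `load_configured_options().get("nvidia_drm")` would show whether both modeset and fbdev are set, and to what.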

I’m also seeing slightly higher idle power consumption with 570.86.16 (15-20W) than with 550.40.83 (7-15W) on an RTX 2080 Ti. In both cases, the desktop is rendered on the RTX 2080 Ti, KDE on X11. Adding options nvidia-drm fbdev=0 on 570.86.16 seems to reduce the power consumption to readings probably close to those on 550.40.83.

I’m also seeing marginally higher GPU clocks at idle on 570.86.16 without the fbdev option. I have set up GPU clock monitoring on my desktop, so I can see the clocks all the time. This isn’t anywhere near a scientific measurement, but I do tend to see higher clock spikes on 570. I’m also seeing more responsive scrolling in the Sublime text editor with 570 without the fbdev option than on 550 or on 570 with the fbdev option. In the latter two cases, it’s as if there is sometimes a visible delay between the first tick of the mouse wheel and it being reflected in the text editor UI.
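
For anyone wanting to compare clock behaviour between drivers a bit more systematically, here is a rough sketch that samples clocks and the performance state through nvidia-smi's `--query-gpu` fields (`clocks.gr`, `clocks.mem`, `pstate`) and summarizes them. The helper names and the choice of summary statistics are illustrative assumptions, not how the monitoring described above was set up.

```python
import subprocess

FIELDS = ["clocks.gr", "clocks.mem", "pstate"]


def parse_sample(line):
    """Turn a CSV line such as "210, 405, P8" into a dict."""
    graphics, memory, pstate = [part.strip() for part in line.split(",")]
    return {"graphics_mhz": int(graphics), "memory_mhz": int(memory), "pstate": pstate}


def read_sample(gpu_index=0):
    """Query one GPU once via nvidia-smi."""
    cmd = [
        "nvidia-smi",
        f"--id={gpu_index}",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_sample(out.splitlines()[0])


def summarize(samples):
    """Report peak and mean graphics clock plus the share of samples spent in P8."""
    clocks = [s["graphics_mhz"] for s in samples]
    in_p8 = sum(1 for s in samples if s["pstate"] == "P8")
    return {
        "max_graphics_mhz": max(clocks),
        "mean_graphics_mhz": sum(clocks) / len(clocks),
        "p8_fraction": in_p8 / len(samples),
    }
```

Collecting a few hundred `read_sample()` results on each driver while the desktop is idle, then comparing the `summarize()` output, would give a slightly firmer basis than watching a live graph.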

I suspect something in the new driver keeps the GPU at higher power states, or wakes it up more often, than older drivers do, and this may be related to fbdev. With the older driver, the GPU sleeps more often, which leads to lower idle power but also longer wakeup times that cause the extra latency in the Sublime UI. Again, that’s just my theory.