I found a fix for my issue.
The official release notes mention:
"Enabled the nvidia-drm fbdev=1 option by default. When supported by the kernel and the nvidia-drm modeset=1 option is enabled, nvidia-drm will replace the system’s framebuffer console with one driven by DRM. This feature can be disabled by setting fbdev=0."
I don’t have a monitor connected to the GPU, so there shouldn’t be any need for this. I added the line "options nvidia-drm fbdev=0" to /etc/modprobe.d/nvidia.conf, and the idle power consumption went down from 18W to 7W on the next boot.
It also seems to have fixed some flickering I had started to notice during the boot sequence.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     Off |   00000000:01:00.0 Off |                  N/A |
|  0%   41C    P8              7W /  200W |       2MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
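For anyone who wants to try the same change, here is a rough sketch of it as a small shell helper. It assumes the standard /etc/modprobe.d directory and reuses the nvidia.conf file name from above. The function names add_option and show_param are made up for this example, and the sysfs path in show_param is an assumption: it is only readable if the driver exports that parameter on your system.

```shell
#!/bin/sh
# Sketch only: disable the nvidia-drm framebuffer console on a headless GPU.
# The option line itself comes from the release notes quoted above.
LINE="options nvidia-drm fbdev=0"

# Append the option to a modprobe config file unless it is already there.
# The default path assumes the standard modprobe.d layout; writing there needs root.
add_option() {
    conf_file="${1:-/etc/modprobe.d/nvidia.conf}"
    grep -qxF "$LINE" "$conf_file" 2>/dev/null || printf '%s\n' "$LINE" >> "$conf_file"
}

# After a reboot, print the value the loaded module reports.
# Assumption: the parameter is exposed under /sys/module/nvidia_drm/parameters.
show_param() {
    param="${1:-/sys/module/nvidia_drm/parameters/fbdev}"
    if [ -r "$param" ]; then
        echo "fbdev=$(cat "$param")"
    else
        echo "fbdev parameter not readable (module not loaded or not exported)"
    fi
}
```

Source the file as root and call add_option, reboot, then call show_param. On distributions that load nvidia-drm from the initramfs, the initramfs may need to be regenerated before the option takes effect. To compare idle power before and after, nvidia-smi --query-gpu=power.draw --format=csv should be enough.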
@damob7890 and potentially @morgwai666: I'm pretty sure this option would work for you too, since your GPUs are otherwise completely idle, like mine.