Has anyone encountered this problem, and can you offer a solution?
I have a headless machine (no X server) running Ubuntu 16.04 with multiple GPUs (1080 and 2080). I recently upgraded the NVIDIA drivers from 390.48 to 410.78 to accommodate the new RTX 2080 GPUs. Since the upgrade, the GPUs stay in the P0 performance state even when idle, which draws more power and leads to high idle temperatures. Reverting the driver to 390.48 restores proper power management (P8 at idle). The problem also occurs with the 410.93, 415.27 and 418.43 drivers.
The problem also occurred on a machine running Ubuntu 12.04 with 1080 and 980 GPUs.
nvidia-smi output with the 410.78 driver at idle (with one 1080 GPU):
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Thu Feb 21 09:51:18 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 38%   54C    P0    44W / 180W |     10MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+