Fan Control Issue on Open NVIDIA Driver 570 – Fan Stuck at 30% Under Load (nvidia-ml-py Attempted)

Hello,

I’m reporting a fan control issue on my headless Ubuntu 24.04.1 LTS system running the NVIDIA open driver (version 570.86.15). Under heavy load, the GPU fan remains fixed at 30% even as the GPU temperature rises significantly.

System Configuration

  • OS: Ubuntu 24.04.2 LTS (Noble Numbat) (headless, accessed via SSH)
  • Kernel Version: 6.8.0-53-generic
  • GPU: NVIDIA RTX A2000
  • Driver: NVIDIA open driver, version 570.86.15
    • CUDA Version: 12.8

Issue Description

When I run a stress test with ./gpu_burn 300, the GPU temperature climbs to 95°C, yet the fan speed stays at 30% for the entire run. For example, the gpu_burn output shows:

...
42.3%  proc'd: 544 (3989 Gflop/s)   errors: 0   temps: 86 C
        Summary at:   Sun Feb 16 18:43:29 CET 2025
...
100.0%  proc'd: 864 (923 Gflop/s)   errors: 0   temps: 95 C
Killing processes with SIGTERM (soft kill)
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPU:
        GPU 0: OK

The current output from nvidia-smi confirms the fan is at 30% despite the high temperature:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A2000               Off |   00000000:07:00.0  On |                    0 |
| 30%   96C    P2             43W /   70W |    4734MiB /   5754MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           28139      C   ./gpu_burn                             4716MiB |
+-----------------------------------------------------------------------------------------+
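
For anyone trying to reproduce this, a minimal Python sketch for sampling temperature and fan speed during a gpu_burn run is shown below. It assumes GPU index 0 and relies on the nvidia-smi query fields temperature.gpu and fan.speed; the helper names parse_sample and read_sample are illustrative only, not part of any NVIDIA tool.

```python
import subprocess

# Query fields as listed by `nvidia-smi --help-query-gpu`; GPU index 0 is an assumption.
QUERY = [
    "nvidia-smi", "-i", "0",
    "--query-gpu=temperature.gpu,fan.speed",
    "--format=csv,noheader,nounits",
]


def parse_sample(line):
    """Turn a CSV line such as '96, 30' into (temp_c, fan_pct)."""
    temp, fan = (field.strip() for field in line.split(","))
    # Some boards report the fan as '[N/A]'; map anything non-numeric to None.
    return int(temp), int(fan) if fan.isdigit() else None


def read_sample():
    """Run nvidia-smi once and return (temp_c, fan_pct) for the first GPU line."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_sample(out.stdout.splitlines()[0])
```

Calling read_sample() every few seconds while the stress test runs gives a simple temperature versus fan log that can be attached to a report.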

I also attempted to influence the fan speed by setting a target temperature using:

sudo nvidia-smi --gpu-target-temp=65 -i 0

Yet, the fan speed remains unchanged.

Attempted Workaround Using nvidia-ml-py

To gain manual control over the fan speed, I tried the Python NVML wrapper, nvidia-ml-py, and also tested a script from the fan_control_nvidia-ml-py repository. Neither approach changed anything: the fan stays at 30% regardless of the commands issued.
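
For reference, the kind of manual control attempted here can be sketched roughly as follows. This is not the script from the repository mentioned above; it is a minimal illustration that assumes nvidia-ml-py is installed (it provides the pynvml module), that the process runs as root, and that the driver accepts nvmlDeviceSetFanSpeed_v2 on this board. The curve values and the helper names fan_percent and apply_curve_once are hypothetical.

```python
import bisect

# Hypothetical curve: (temperature in °C, fan duty in %). Adjust to taste.
CURVE = [(40, 30), (60, 50), (75, 80), (85, 100)]


def fan_percent(temp_c, curve=CURVE):
    """Linearly interpolate the fan duty for a temperature, clamped at both ends."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return curve[0][1]
    if temp_c >= temps[-1]:
        return curve[-1][1]
    i = bisect.bisect_right(temps, temp_c)
    (t0, p0), (t1, p1) = curve[i - 1], curve[i]
    return round(p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0))


def apply_curve_once(gpu_index=0):
    """Read the GPU temperature and push the matching duty to every fan."""
    import pynvml  # imported lazily so the curve logic works without a GPU

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        target = fan_percent(temp)
        for fan in range(pynvml.nvmlDeviceGetNumFans(handle)):
            # May raise pynvml.NVMLError if the driver refuses manual control.
            pynvml.nvmlDeviceSetFanSpeed_v2(handle, fan, target)
        return temp, target
    finally:
        pynvml.nvmlShutdown()
```

Calling apply_curve_once() in a loop every few seconds would approximate a software fan curve. If the call succeeds but the reported speed never moves, that matches the behavior described in this post.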

Request for Guidance

I would appreciate any insights on the following:

  • Is manual (or dynamic) fan control expected to work with the current open driver (570.86.15) on headless systems?
  • Are there any known workarounds or settings that can enable higher fan speeds under load using the open driver?
  • Is this behavior a known limitation in the open driver stack, with future releases likely to include more complete fan control support?

Any feedback or suggestions would be greatly appreciated.

Thank you for your efforts in advancing the open driver stack!

Best regards,
Simone Flavio

nvidia-bug-report.log.gz (328.7 KB)

EDIT: I had the same problem with driver version 565. I hoped the upgrade would fix it after reading this entry in the 570 changelog:
“Updated the nvidia-settings control panel to use NVML rather than NV-CONTROL to control GPU clocks and fan speed.”

Have you tried using nvidia-settings?
sudo nvidia-settings --assign GPUFanControlState=1 --assign GPUTargetFanSpeed=100
This works fine on my 3090 with 570.86.15 on Debian-13.

Even so, the fan speed should definitely increase automatically at temperatures as high as the ones you reported.

I tried that method an hour ago, but it didn’t resolve the issue. I appreciate your help, though.

Did you get any error message from nvidia-settings, or did it just silently ignore you?

I appear to be having a related issue with a slightly more precarious duty cycle.

System Configuration

  • OS: Ubuntu 24.04.2 LTS (Noble Numbat) (Kubuntu Flavor, Desktop)
  • Kernel Version: 6.8.0-53-generic
  • GPU: NVIDIA RTX 3090
  • Driver: NVIDIA driver, version 570.86.16
    • CUDA Version: 12.8

Issue Description

When I run games or other applications, the GPU temperature climbs to 95°C under load, yet the fan speed stays at 0% the whole time. The card then throttles to keep the temperature from rising further, but the fans still do not start. I have verified this visually as well as by sensor.

Attempted Workaround Using Multiple Tools

Attempts to control the fan via nvidia-settings, and by setting CoolBits to 28 in the X config (both manually and via nvidia-xconfig), have not worked. Fan control is unavailable in every tool I tried: nvidia-settings, greenwithenvy, CoolerControl, fancontrol-gui, and fancontrol.

I have tried the command above as well, but the only response was a confirmation that the control state and values had been set. The GPU took no actual action.

  Attribute 'GPUFanControlState' (Calinara:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (Calinara:0[fan:0]) assigned value 100.
  Attribute 'GPUTargetFanSpeed' (Calinara:0[fan:1]) assigned value 100.
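
One way to tell whether a commanded speed is being silently ignored is to read the fan back a few times after setting it. The sketch below assumes nvidia-ml-py (the pynvml module) is installed and uses nvmlDeviceGetFanSpeed_v2 for the readback; the helper names fan_ignored and sample_fan, the 5-point tolerance, and the sampling interval are illustrative choices, not anything defined by the driver.

```python
import time


def fan_ignored(target_pct, readings_pct, tolerance=5):
    """True if there are readings and none of them comes within tolerance of the target."""
    return bool(readings_pct) and all(
        abs(r - target_pct) > tolerance for r in readings_pct
    )


def sample_fan(gpu_index=0, fan=0, samples=5, delay_s=2.0):
    """Collect a few fan speed readings (in %) from NVML."""
    import pynvml  # imported lazily so fan_ignored works without a GPU

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        readings = []
        for _ in range(samples):
            readings.append(pynvml.nvmlDeviceGetFanSpeed_v2(handle, fan))
            time.sleep(delay_s)
        return readings
    finally:
        pynvml.nvmlShutdown()
```

For the case above, fan_ignored(100, sample_fan()) returning True after the nvidia-settings assignment would confirm that the reported speed never approached the requested 100%.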

nvidia-bug-report.log.gz (472.2 KB)