Fan speed regression with NVIDIA beta 470.42.01 and RTX 3080 (fans don't stop on idle)

edit: using Linux kernel 5.12.15 in both cases.

Using a 3080 with 465.31 sees a fan speed of 0% on idle.
With 470.42.01 the lowest is 30%.
Trying to set it lower manually has no effect; only values of 30% and above are applied.

e.g:
works:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60"

does not work:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=0" -a "[fan:1]/GPUTargetFanSpeed=0"

Summary: on 465.31, without touching any settings, the automatic fan control shuts down the fans on idle; with 470.42.01 this is no longer the case.

This also doesn’t work on the new 470.57.02 release.

To even be able to change the fan speed (it seems), you need to create/add the following to “/etc/X11/Xwrapper.config”:

allowed_users=anybody
needs_root_rights=yes

Also set:

sudo chmod u+s /usr/lib/Xorg.wrap

Now X runs with root permissions, great…

Now you can do (e.g.):

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60"

But you can’t set it to anything below “30”, e.g:

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=21" -a "[fan:1]/GPUTargetFanSpeed=21"

or

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=0" -a "[fan:1]/GPUTargetFanSpeed=0"

will set fans to 30%.

tl;dr: The fans will never idle (0 RPM) and ALWAYS be set to AT LEAST 30%.

This issue does not happen under Wayland.
This issue does not happen on 465.31.
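
For repeated testing, the commands above can be wrapped in a small helper. This is only a minimal sketch for a POSIX shell: `set_fans` and `FAN_DRY_RUN` are hypothetical names introduced here (not part of nvidia-settings), and the default of two fans is an assumption matching the 3080 in this post.

```shell
# Hypothetical helper: enable manual fan control on gpu:0 and request SPEED
# percent on fans 0..NFANS-1 (NFANS defaults to 2, an assumption for this card).
# With FAN_DRY_RUN set, the command is printed instead of being executed.
set_fans() {
    speed=$1
    nfans=${2:-2}
    # Rebuild the positional parameters as the nvidia-settings argument list.
    set -- -a "[gpu:0]/GPUFanControlState=1"
    i=0
    while [ "$i" -lt "$nfans" ]; do
        set -- "$@" -a "[fan:$i]/GPUTargetFanSpeed=$speed"
        i=$((i + 1))
    done
    if [ -n "${FAN_DRY_RUN:-}" ]; then
        printf '%s\n' "nvidia-settings $*"
    else
        nvidia-settings "$@"
    fi
}
```

Running `set_fans 60` should behave like the working command above, and `set_fans 0` like the one that ends up at 30%. Setting `FAN_DRY_RUN=1` first only prints the command line, which is handy for checking it before touching the fans.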

Works… Fedora 34, RTX 2080 SUPER

Would be nice to get feedback from other 3000 card users, or nvidia…

WFM, GTX 1660 Ti here.

I have the same problem with GP107 [GeForce GTX 1050] in an eGPU box on Linux kernel 5.10.52 with 470.63.01 drivers. Setting [fan:0]/GPUTargetFanSpeed to anything below 30 has no effect on the fan speed (judging by its sound) and behaves exactly like 30%. The first audible difference is between 30% and 31%.

Temp of my card is reported at 24 °C, so the fan could/should go idle.

I have a 3060 and have the same problem with 460.91.03-1 (the Debian bullseye default) and 470.42.01 (the latest stable driver directly from Nvidia). I can control the fan speed 30-100, but I can’t reduce it below 30. It’s at 30% even when idle, but this isn’t a problem before Xorg boots (e.g. at BIOS and the disk encrypt prompt it’s silent) so I’m pretty sure it’s the driver.

@nvidia998 did you have any luck resolving this?

Sadly no, I’m currently using 465.31 which goes to 0% fan speed on idle.

Current setup (arch):

  • linux 5.13.13.arch1-1
  • xorg-server 1.20.13-2
  • lib32-nvidia-utils 465.31-1
  • lib32-opencl-nvidia 465.31-1
  • nvidia-dkms 465.31-1
  • nvidia-settings 465.31-1
  • nvidia-utils 465.31-1
  • opencl-nvidia 465.31-1

This issue persists with all recent drivers, including the new beta driver 495.29.05.

From the feedback in this thread it might only happen on Ampere.

I would appreciate if someone from Nvidia/QA could reproduce/confirm this.

Test setup: Ampere GPU with a driver newer than 465.31.
I think (not 100% sure) 465.31 is the last driver that works without issues.
Linux kernel version 5.12 or greater (not sure if this has anything to do with it).
X11, not Wayland.

I have filed bug 3408200 internally for tracking purposes.
Thanks for letting us know.

Here are my test results with the 470 driver:

With GPU Fan Settings not enabled, GPU fan utilization is 0% in the idle state.
With GPU Fan Settings enabled under nvidia-settings, GPU fan utilization is 30% in the idle state.
Also, I cannot set the fan speed below 30, but changes above 30 work.

However, with driver 465 I do not see the option to enable GPU Fan Settings under nvidia-settings -> Thermal Settings, and GPU fan utilization is 0% in the idle state.

Please confirm whether you have the same observation and whether the only reported problem is that GPU fan utilization cannot be decreased below 30%.

@amrits: I just did tests with 465.31, 470.74 and 495.29.05.


I can confirm that 0% fan speed is only possible if we disable GPUFanControlState: nvidia-settings -a "[gpu:0]/GPUFanControlState=0"

If we enable GPUFanControlState: nvidia-settings -a "[gpu:0]/GPUFanControlState=1"

The fan(s) can then only be set to values between 30% and 100%.

Does not work:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=0" -a "[fan:1]/GPUTargetFanSpeed=0" (Result is 30%)

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=20" -a "[fan:1]/GPUTargetFanSpeed=20" (Result is 30%)

Does work:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60" (Result is 60%)
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=100" -a "[fan:1]/GPUTargetFanSpeed=100" (Result is 100%)


One important aspect: this only works if we run Xorg (X11) as root.

Example output if Xorg is not run as root:

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.


ERROR: Error assigning value 60 to attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) as specified in assignment '[fan:0]/GPUTargetFanSpeed=60' (Unknown Error).



ERROR: Error assigning value 60 to attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) as specified in assignment '[fan:1]/GPUTargetFanSpeed=60' (Unknown Error).

Check if Xorg runs as root:
ps -o user $(pgrep Xorg)

USER
root
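
The same check can be scripted. This is only a sketch, assuming a POSIX shell and a `pgrep` that supports `-d` (procps-ng or BSD); `all_root` and `xorg_runs_as_root` are hypothetical helper names introduced here.

```shell
# Hypothetical helper: read process owners (one per line) on stdin and succeed
# only if there is at least one owner and every owner is root.
all_root() {
    awk 'NF { n++; if ($1 != "root") bad = 1 }
         END { if (n == 0 || bad) exit 1 }'
}

# Hypothetical helper: succeed only if every running Xorg process is owned by root.
xorg_runs_as_root() {
    pids=$(pgrep -d, Xorg) || return 1   # no Xorg process found
    ps -o user= -p "$pids" | all_root    # "user=" suppresses the USER header
}
```

For example, `xorg_runs_as_root && echo "Xorg is root" || echo "Xorg is not root (or not running)"`.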

I assume on your test system (Ubuntu?) the default is to run Xorg as root.
On other distros (e.g. Arch) the default is to run Xorg as a regular user (but many display managers do not support rootless Xorg, so this would explain why it may not have been an issue in nvidia’s testing/QA).

I had to create /etc/X11/Xwrapper.config with the following content:

allowed_users=anybody
needs_root_rights=yes

and run sudo chmod u+s /usr/lib/Xorg.wrap. After restarting Xorg, the fan settings worked.

nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) assigned value 60.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) assigned value 60.

I think the last time this worked w/o Xorg running as root was 460.91.03, but I can’t roll back that far (kernel seems to be too new).


Summary:

  • Xorg needs to run as root user (this should be fixed by nvidia, this is a security risk).
  • If fan control is enabled, the fans can never reach 0%, only values between 30-100% (bug!).
  • This seems to be true on 465.31, 470.74 and 495.29.05. I did a lot of down/upgrading and rebooting…

Of course, now that I have posted all of this and am running 495.29.05, I can’t seem to get the fans to 0% no matter what (current temp 35°C). Xorg is also running as root currently.

Fans stuck on 30%, with or without GPUFanControlState set to 0 or 1.

edit:
now (after ~5min uptime) all of a sudden it seems to work again.
With GPUFanControlState set to 1: I can now freely set between 30 and 100%.
With GPUFanControlState set to 0: fans go to 0%.

This is so very strange…

Short update for 495.44: same issue as before.

➜ ~ nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60" -a "[fan:1]/GPUTargetFanSpeed=60"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) assigned value 60.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) assigned value 60.

➜ ~ nvidia-smi -q | awk '/Fan Speed/ {print $4}'
60
➜ ~ nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=30" -a "[fan:1]/GPUTargetFanSpeed=30"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) assigned value 30.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) assigned value 30.

➜ ~ nvidia-smi -q | awk '/Fan Speed/ {print $4}'
30
➜ ~ nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=20" -a "[fan:1]/GPUTargetFanSpeed=20"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) assigned value 20.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) assigned value 20.

➜ ~ nvidia-smi -q | awk '/Fan Speed/ {print $4}'
30
➜ ~ nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=0" -a "[fan:1]/GPUTargetFanSpeed=0"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 1.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:0]) assigned value 0.

  Attribute 'GPUTargetFanSpeed' (zen:0[fan:1]) assigned value 0.

➜ ~ nvidia-smi -q | awk '/Fan Speed/ {print $4}'
30
➜ ~ nvidia-settings -a "[gpu:0]/GPUFanControlState=0"

  Attribute 'GPUFanControlState' (zen:0[gpu:0]) assigned value 0.

➜ ~ nvidia-smi -q | awk '/Fan Speed/ {print $4}'
0
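
The manual steps in this transcript could be looped over a list of speeds. This is a rough sketch only, assuming a POSIX shell and a two-fan card; `parse_fan_speed`, `check_fan` and `probe` are hypothetical names introduced here, and the 5 second settle time is a guess, not something measured.

```shell
# Hypothetical helper: extract the fan percentage(s) from `nvidia-smi -q` text
# on stdin, using the same awk filter as in the transcript.
parse_fan_speed() {
    awk '/Fan Speed/ {print $4}'
}

# Hypothetical helper: compare a requested percentage with the reported one.
check_fan() {
    requested=$1 reported=$2
    if [ "$requested" = "$reported" ]; then
        echo "ok: requested $requested%, reported $reported%"
    else
        echo "mismatch: requested $requested%, reported $reported%"
    fi
}

# Hypothetical helper: request each speed in turn and report what the driver shows.
# Requires a running X session and fan control permissions.
probe() {
    for s in "$@"; do
        nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
            -a "[fan:0]/GPUTargetFanSpeed=$s" \
            -a "[fan:1]/GPUTargetFanSpeed=$s" >/dev/null
        sleep 5   # assumed settle time
        check_fan "$s" "$(nvidia-smi -q | parse_fan_speed | head -n 1)"
    done
}
```

Calling `probe 60 30 20 0` would repeat the sequence from the transcript. On an affected driver, the last two requests would be expected to print a mismatch with 30% reported.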

Gigabyte GTX1060-6GB running Nvidia driver 460.91.03-1 on Debian Bullseye, kernel 5.10.0-9-amd64.
I have my Xorg Coolbits set, and clock speed adjustment still works.

The driver has recently broken with regard to setting manual fan speed.
Previously I could set any arbitrary fan speed and it responded immediately exactly as specified.

nvidia-settings -a [gpu:0]/GPUFanControlState=1
nvidia-settings -a [fan:0]/GPUTargetFanSpeed=80

But now it’s all weird. It doesn’t work properly at all.
Toggling GPUFanControlState between 1 and 0 (nvidia-settings -a [gpu:0]/GPUFanControlState=1) does change its behavior, but setting the fan speed has weird effects and it’s extremely unresponsive.

Manual fan control in the current driver is totally broken.

I did not find any difference with setting numbers above/below 30 as others have commented. I’ve tried 0, 10, 20, 30, 50, 60, 70, 80, 100. It’s just totally broken.

I’m having the same issue with GTX 1060 on the latest Debian release (Bullseye) with latest kernel 5.10.0-9-amd64. I tried upgrading nvidia-driver to 495.44-1 from Debian experimental/non-free (amd64), but the issue persists.
nvidia-smi shows the fan speed is 122%.
I’ve tried shutting down and powering off the system.
As soon as I turn it on the fans are spinning at a high speed.
This happens even though I’m not using manual fan control now: nvidia-settings -q [gpu:0]/GPUFanControlState shows 0.
The whole system is idle, and nvidia-smi shows the 1060 at 25C, 14W power usage, and 123% fan speed.

The sound is really annoying.

Same issue with GTX 1650 SUPER on latest Ubuntu 21.10.
nvidia-smi shows I’m running Driver Version: 470.82.00

GPU fan is spinning ~60% (~1900 RPM) on idle and keeping the temperature ~ 33C. The fan spins faster (~65%, ~2050RPM) when I run a game. Afterwards, the fan level drops to 60%.

I generally do this via a script, and start at a higher number, then pause, then drop down the lower number.
Also, you are right: there are limits to how low you can go. On my cards it is
set “25” # Gigabyte RTX 2080 TI Gaming OC - GV-N208TGAMING OC-11GC
set “39” # Gigabyte RTX 3080 TI Gaming OC - GV-N308TGAMING OC-12GD
Anything lower, and it jumps all over the place, and doesn’t set at the number I tell it to go to.
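
The "start high, pause, then drop" approach described here might look roughly like the following. It is a sketch only, not the actual script: `run`, `ramp_down` and `FAN_DRY_RUN` are hypothetical names, only fan:0 is addressed, and the default 5 second pause is an assumption.

```shell
# Hypothetical wrapper: print the command when FAN_DRY_RUN is set, run it otherwise.
run() {
    if [ -n "${FAN_DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: request HIGH percent, wait PAUSE seconds, then request LOW.
ramp_down() {
    high=$1 low=$2 pause=${3:-5}
    run nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=$high"
    run sleep "$pause"
    run nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=$low"
}
```

With the floors quoted above, `ramp_down 60 39` would be the call for the 3080 Ti and `ramp_down 60 25` for the 2080 Ti. The starting value of 60 is only an example.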

For me, it hovers between 30 & 0. Fans stop & then start continuously. This is a big issue.

We have been able to root-cause the issue; the fix is incorporated in the driver and will be available in a future release.
We shall update once it is publicly available.