Lack of a third fan reported by drivers is really starting to cause issues

So, the Nvidia drivers on Linux only report up to two fans, regardless of how many fans the GPU has. This makes sense for FE/reference cards that only have two fans, but most AIB cards, especially newer/high-end ones, have three fans and are not designed to run with only two.

Well, I can’t get my third fan to come on. No, it’s not hardware. This is the second RTX 3090 I’ve had this issue with, and it works perfectly fine in Windows: I can go in and change the sliders for all three fans, and all three fans respond accordingly. In Linux, when I set manual fan control, I can only control the first two fans because there are only two fans reported. I’ve tried rebooting. I’ve tried reinstalling the drivers. I’ve tried reinstalling my kernel. Nothing has worked.
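For anyone following along, manual fan control on Linux is usually driven through nvidia-settings attribute assignments. Here is a minimal Python sketch that builds such a command line. It assumes GPU index 0, assumes the driver exposes fans as [fan:0], [fan:1], and so on, and the helper names `fan_control_argv` and `apply_fan_speeds` are made up for illustration. It can only address the fan indices the driver actually reports, which is exactly the limitation described above.

```python
# Sketch only: builds an nvidia-settings command line for manual fan control.
# Assumptions: GPU index 0 by default, fans exposed as [fan:0], [fan:1], ...
import subprocess


def fan_control_argv(speeds, gpu=0):
    """Return an nvidia-settings argv that enables manual control on `gpu`
    and sets a target speed (percent) for each fan index in `speeds`."""
    argv = ["nvidia-settings", "-a", f"[gpu:{gpu}]/GPUFanControlState=1"]
    for fan, percent in sorted(speeds.items()):
        if not 0 <= percent <= 100:
            raise ValueError(f"fan {fan}: {percent}% is outside 0-100")
        argv += ["-a", f"[fan:{fan}]/GPUTargetFanSpeed={percent}"]
    return argv


def apply_fan_speeds(speeds, gpu=0):
    """Run the command. Needs a running X session, and manual fan control
    typically has to be allowed through the Coolbits option first."""
    return subprocess.run(fan_control_argv(speeds, gpu), check=True)
```

Calling `fan_control_argv({0: 60, 1: 60})` returns the argument list without running anything, so the command can be inspected first. Adding an entry for fan index 2 would only work if the driver reported a third fan.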

This is really frustrating. I paid 1700 USD for a top-of-the-line GPU and I can’t even have drivers that report all the fans.

I’m happy to provide any information that might be needed, but I would imagine it’s a simple process to just add capabilities for a third fan to the drivers.

It should be there; at least the description of nvidia-settings’ gpufantarget implies as much:
“Returns the objects the fan cools. ‘1’ means the GPU, ‘2’ means video memory, ‘4’ means the power supply, and ‘7’ means all of the above.”
So the question is why the third fan doesn’t get reported. Things that come to mind:

  • bug in nvidia-settings or driver
  • third fan not defined in reference design so vendors use a different interface
  • something completely different

Do you use some vendor-specific application to control the fans in Windows?

It should be there; at least the description of nvidia-settings’ gpufantarget implies as much:

I don’t see how that would imply anything about detecting more than two fans. Both fans 1 and 2 are reported as targeting “GPU, Memory, and Power Supply.” So obviously the third one would be the same, or nvidia-settings isn’t actually reading what they’re supposed to be targeting. Either way, it doesn’t imply anything about more than two fans.
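To make the target values concrete: the quoted description reads like a bitmask, where 1, 2 and 4 are individual flags and 7 is all three combined. A small Python sketch of decoding it, assuming that bitmask reading is correct (the name `decode_fan_target` is just for illustration):

```python
# Flag values are taken from the fan target description quoted above.
# Treating the value as a bitmask is an assumption based on that wording.
FAN_TARGETS = {1: "GPU", 2: "video memory", 4: "power supply"}


def decode_fan_target(value):
    """Return the names of the components a fan cools, given its target value."""
    return [name for bit, name in FAN_TARGETS.items() if value & bit]
```

With this reading, a fan reporting 7 decodes to all three components, which matches what both detected fans show. The value describes what a fan cools, not how many fans exist.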

  • bug in nvidia-settings or driver

I think this is the most obvious explanation. Nvidia has never made a card themselves with more than two fans to my knowledge, and there is so much old legacy crap in the Linux (and probably Windows but idk) driver that it’s REALLY easy to believe that they just never even added an interface for a third fan. And since they don’t use hwmon, we can’t look and see if Linux itself is detecting it (which it is though, because sometimes randomly it will work, but VERY rarely, and I can never force it).

  • third fan not defined in reference design so vendors use a different interface

I mean, if that were the case, fans 1 and 2 wouldn’t be individually controllable either. But they are. And I find it pretty unlikely that they would what, use the defined interface for fans 1 and 2 but use some crazy one for fan 3? And even if that were the case, it wouldn’t prevent Nvidia from adding support to the driver. AMD GPUs are able to detect all three fans no matter who makes the card, and AMD hasn’t added anything vendor-specific to their drivers.

  • Do you use some vendor-specific application to control the fans in Windows?

Well what’s a non-vendor-specific utility I could check to see if all three fans are detected? Obviously I use Precision X1 because it’s an EVGA card. But if there’s something else, I’ll check and see what happens.

But I mean, regardless, the bottom line is that Nvidia needs to add support for a third fan to their drivers. Running a 375 Watt, $1700 USD GPU with 2 fans on a heatsink designed to run 3 is NOT a good idea. Not to mention that the third fan is honestly one of the most effective, since it’s where the passthrough part of the card is, so it’s just blowing straight through the heatsink with no PCB blocking it. So it cools down the whole card. I need this fan.

I’ve even tried booting into my VFIO VM (single-GPU passthrough) and maxing out the fans and then shutting it down in hopes that the fans would stay maxed out when I log into Linux, but essentially the card resets when that happens so it’s as if I rebooted, and nothing gets saved.

If there are three singular targets that can be defined, it’s not too far-fetched that three fan interfaces were initially intended.
There’s a defined RGB lighting interface, yet most vendors build their own proprietary one.
I guess on Windows there’s no official Nvidia tool to control the fans. Does e.g. MSI Afterburner show all fans?

I guess on Windows there’s no official Nvidia tool to control the fans. Does e.g. MSI Afterburner show all fans?

I’m not sure, I haven’t really used MSI Afterburner much (I only ever use Windows in my VFIO VM, and only then to play Apex Legends), but I do have it installed in the VM. So I’ll boot it up this afternoon, turn off the Precision X1 “launch at startup” option so I know its fan settings aren’t contaminating things, and then reboot the VM, launch Afterburner and see what happens.

Nvidia does provide a vendor-independent API for getting the fan speed for each fan:

nvmlDeviceGetFanSpeed_v2

However, it doesn’t work properly even with two-fan cards such as my EVGA GTX 1080 FTW2. Only index 0 is valid, despite each fan being individually addressable in EVGA’s Precision X.
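If anyone wants to check which fan indices NVML accepts on their own card, here is a rough Python sketch. It assumes the pynvml module (from the nvidia-ml-py package) is installed and that it exposes `nvmlDeviceGetFanSpeed_v2(handle, fan)`. The helper names `probe_fans` and `nvml_reader` and the upper bound of 8 indices are arbitrary choices for illustration, not anything from Nvidia.

```python
def probe_fans(read_speed, max_fans=8):
    """Call read_speed(index) for indices 0..max_fans-1.

    Returns {index: speed_percent} for every index that did not raise.
    Failures are skipped rather than stopping the loop, so gaps stay visible.
    """
    found = {}
    for index in range(max_fans):
        try:
            found[index] = read_speed(index)
        except Exception:
            # With NVML this would be an NVMLError for an invalid fan index.
            continue
    return found


def nvml_reader(gpu_index=0):
    """Build a read_speed callable backed by NVML.

    Assumption: pynvml is installed; it is only imported when this is called.
    """
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    return lambda fan: pynvml.nvmlDeviceGetFanSpeed_v2(handle, fan)
```

Running `print(probe_fans(nvml_reader(0)))` on a machine with the driver loaded should show which indices answer. Going by the behaviour described above, a card like this one would presumably list only index 0.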

Some people have also reported GPU fan speeds over 101%. I’m going to guess this is all either a bug in the Nvidia driver or vendors violating spec.

Edit: to clarify, 101% fan speed is perfectly normal if you manually set the fan speed to 100% since there is a +1/-1 variance.

Yeah that’s what nvidia-smi uses, right?

And sure enough, nvidia-smi only shows one fan, fan 0. But nvidia-settings shows 2 fans, as does GWE. I wonder if this is an issue with nvidia-smi and not nvml?
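For comparison, this is roughly how the nvidia-smi value can be read from a script. It is a sketch that assumes the `fan.speed` query field with `--format=csv,noheader,nounits`. The function names are made up, and treating any non-numeric field as "no reading" is my own simplification.

```python
import subprocess


def parse_fan_speed(field):
    """Convert one nvidia-smi fan.speed field to an int percentage,
    or None when the value is not numeric (for example an N/A marker)."""
    text = field.strip().rstrip("%").strip()
    return int(text) if text.isdigit() else None


def query_fan_speeds():
    """Return one entry per GPU; nvidia-smi prints a single fan.speed per GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=fan.speed", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_fan_speed(line) for line in out.splitlines() if line.strip()]
```

Since the query returns one number per GPU rather than one per fan, it can’t tell fans apart at all, which is consistent with nvidia-smi only showing fan 0.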

Honestly though, this is really starting to get to me. Switching to Windows isn’t an option, but I don’t want to lose a fan, and I know Nvidia could fix this. I REALLY don’t wanna buy a Hybrid kit. I guess that would solve MY problem, but not everyone else’s, so that’s not ideal.

Yep, it is.

nvidia-smi might be wired to use the v1 function version, don’t know.

You might be able to force the third fan to run at a constant speed. Go to Precision X, manually set the fan speed, and reboot. I did this for my GPU and the second fan, which normally only spins up after hitting 60C+, spins at that speed until hitting 60C.

We’re tracking investigation of this in bug 200701947.

Any chance NVML/NVCtrl will be fixed so that each fan is addressable?

Is there anywhere to track that? If not that’s okay, honestly I do feel a bit better just knowing you guys are looking into it.

No, the bug tracker is internal but having the number for reference is helpful for future correspondence.

No, that’s fine. Yeah, I was just wondering why you gave the bug number. That makes perfect sense, thanks.