We have a project consisting of 26 PCs, each driving 2x RTX 3080 cards (52 RTX 3080 cards in total). The 3080 cards use a turbo fan design (Gigabyte).
The end user has reported that the rear panels of the PC cases get extremely hot. The PCs are rack-mounted in a proper server room (cooling etc.).
The GPU temperature reported by nvapi is 88°C under standard load (it sometimes reaches 89°C temporarily, but is mostly steady at 88°C). This looks a bit high to me; however, our hardware provider told us 88°C is the normal operating temperature of the 3000-series cards, before throttling occurs.
Hi there @gjaegy and welcome to the NVIDIA developer forums!
Do you know nvidia-smi? Try out this command:
nvidia-smi.exe -q -d TEMPERATURE
==============NVSMI LOG==============
Timestamp : Wed Apr 12 10:00:38 2023
Driver Version : 531.29
CUDA Version : 12.1
Attached GPUs : 1
GPU 00000000:0A:00.0
Temperature
GPU Current Temp : 40 C
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature : 83 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
This is for an RTX 3080 that I am using on Windows 11. nvidia-smi is also installed on Linux systems if you are using NVIDIA drivers.
So yes, 88C is still totally OK, since it is below the max operating temperature of 93 C and the slowdown temperature of 95 C reported above. That holds especially given the non-spec setup you are using: GeForce GPUs are not designed for server setups in rack-mounted cases, regardless of the fan setup. I am not surprised that the rear exhaust area is getting hot, since in a typical rack design the outlet fans or openings are pretty small.
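If you want to run the same comparison on each of your machines, here is a minimal Python sketch that parses the text printed by nvidia-smi -q -d TEMPERATURE and reports how far each GPU is from its slowdown temperature. The function names (parse_temperature_report, slowdown_headroom) and the SAMPLE constant are my own illustrative choices, not part of any NVIDIA tool, and the parser assumes the output layout shown above; other driver versions may print it slightly differently.

```python
import re
import subprocess

# Sample text copied from the nvidia-smi output quoted in this thread.
SAMPLE = """\
Attached GPUs : 1
GPU 00000000:0A:00.0
Temperature
GPU Current Temp : 40 C
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature : 83 C
Memory Current Temp : N/A
"""

# A GPU header line looks like "GPU 00000000:0A:00.0" (PCI bus id).
GPU_HEADER = re.compile(
    r"^GPU\s+([0-9A-Fa-f]{8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-9A-Fa-f])$"
)
# A temperature line looks like "GPU Current Temp : 40 C" or "... : N/A".
TEMP_FIELD = re.compile(r"^(.+?)\s+:\s+(N/A|\d+)(?:\s*C)?$")


def parse_temperature_report(text):
    """Return {bus_id: {label: degrees C as int, or None for N/A}}."""
    gpus = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = GPU_HEADER.match(line)
        if header:
            current = gpus.setdefault(header.group(1), {})
            continue
        field = TEMP_FIELD.match(line)
        if field and current is not None:
            label, value = field.groups()
            current[label] = None if value == "N/A" else int(value)
    return gpus


def slowdown_headroom(temps):
    """Degrees C left before the slowdown temperature, or None if unknown."""
    now = temps.get("GPU Current Temp")
    limit = temps.get("GPU Slowdown Temp")
    if now is None or limit is None:
        return None
    return limit - now


if __name__ == "__main__":
    # Requires nvidia-smi on the PATH; same command as quoted above.
    report = subprocess.run(
        ["nvidia-smi", "-q", "-d", "TEMPERATURE"],
        capture_output=True, text=True, check=True,
    ).stdout
    for bus_id, temps in parse_temperature_report(report).items():
        print(bus_id, temps, "headroom:", slowdown_headroom(temps))
```

With the values from my card, a GPU sitting at 88 C would still have 7 C of headroom before the 95 C slowdown point. Your Gigabyte cards may report different limits, so check the numbers on your own systems.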
I’m not familiar with that tool. The dedicated webpage seems to say it supports professional cards only, not GeForce cards. Is the webpage outdated or wrong?
The tool supports any NVIDIA GPU which is running with supported NVIDIA drivers. The comment about professional cards is mostly with respect to some advanced features that might not be available on GeForce consumer cards.
I think these kinds of questions are better suited to the consumer forums for GeForce GPUs, where there are a lot of discussions on expected temps for different third-party AIB manufacturers.
This forum here is for development purposes focusing on our SDKs and tools.