Question : RTX A6000 Max operating temperature

Hello sir, I want to build a GPU Server. (RTX A6000)

So I want to know the maximum operating temperature of RTX A6000.
And will the GPU be stable at a temperature of 88~92°C?

thank you

Professional GPUs like the RTX A6000 are intended by NVIDIA to be integrated into complete systems by approved system integrators and partners, not installed by end users. See this page for NVIDIA’s list of approved vendors:

For passively cooled GPUs, the cooling requirement takes the form of a minimum amount of airflow across the GPU; the airflow direction may also be specified. I assume that NVIDIA shares the precise requirements with their partners, but not with end users, as that is contrary to the intended use. Looking at a picture of the RTX A6000, it appears to be actively cooled. In that case you still have to ensure free airflow around the card (a common problem is airflow restricted by other PCIe cards plugged in in the immediate vicinity), and you need to make sure the case temperature is kept low.

Generally speaking, running a GPU hot has two negative impacts: (1) It can reduce performance when the thermal management of the GPU lowers clock frequencies to reduce power consumption and heat dissipation in order to protect the hardware. Each type of GPU has multiple thermal limits, applied via VBIOS and driver software, that govern the slowdown and emergency shutdown temperatures. (2) All semiconductors age physically via processes that are accelerated at higher temperatures (Arrhenius law). Running a GPU hot can therefore shorten its life span.
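
To give a feel for point (2), here is a rough sketch of the standard Arrhenius acceleration factor, which compares the rate of a thermally activated aging process at two temperatures. The activation energy of 0.7 eV is purely an illustrative assumption on my part, not a figure for the RTX A6000 or any specific failure mechanism, and the function name is made up for this example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(temp_low_c: float, temp_high_c: float,
                           activation_energy_ev: float = 0.7) -> float:
    """Ratio of aging rates at temp_high_c versus temp_low_c.

    The default activation energy (0.7 eV) is an assumed, illustrative
    value; real failure mechanisms each have their own.
    """
    t_low_k = temp_low_c + 273.15    # convert Celsius to Kelvin
    t_high_k = temp_high_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_low_k - 1.0 / t_high_k)
    return math.exp(exponent)


# Compare running at 90 C against running at 83 C under the assumed 0.7 eV.
factor = arrhenius_acceleration(83.0, 90.0)
print(f"Aging proceeds about {factor:.2f}x faster at 90 C than at 83 C")
```

Under that assumption, a few degrees already make a noticeable difference, but treat the number as an order-of-magnitude illustration only.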

I am not familiar with the RTX A6000, but the operating temperature of many recent NVIDIA GPUs should typically not exceed 83°C or thereabouts. Running one at 88~92°C is like redlining a car’s engine: not advisable. To achieve maximum boost clocks and best performance it is usually beneficial to keep temperature at or below 60°C. That may require water cooling for a 300W monster like the RTX A6000.

Thank you, sir

I would like to inquire further.

When I stress-tested the OEM’s A6000 server with GPU-BURN, the GPU temperature reached about 87.5~90°C, so I am wondering whether the GPU is stable at that temperature.

thank you

If you already have an RTX A6000 running, you can use nvidia-smi to inspect the thermal limits for that particular model: see the section “Temperature” in the output of nvidia-smi -q. Here is an example output from one of my GPUs:

    Temperature
        GPU Current Temp                  : 42 C
        GPU T.Limit Temp                  : N/A
        GPU Shutdown Temp                 : 97 C
        GPU Slowdown Temp                 : 94 C
        GPU Max Operating Temp            : 92 C
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A

The 83°C I mentioned is shown as the “GPU Target Temperature”. Note that the items displayed by nvidia-smi tend to differ between GPU architectures.
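
If you want to compare the current temperature against these limits from a script, a minimal sketch along the following lines may help. It assumes the "Temperature" section is laid out as in the output above (an indented block of "name : value" lines); the function names are my own, and since the fields differ between GPU architectures and driver versions, missing or N/A entries are returned as None. On a multi-GPU machine it only reads the first GPU's section.

```python
import re
import subprocess


def query_nvidia_smi() -> str:
    """Return the full text report of `nvidia-smi -q` (requires the NVIDIA driver)."""
    result = subprocess.run(["nvidia-smi", "-q"], capture_output=True,
                            text=True, check=True)
    return result.stdout


def parse_temperature_section(report: str) -> dict:
    """Map each field of the first "Temperature" section to degrees C or None."""
    values = {}
    in_section = False
    section_indent = 0
    for line in report.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip())
        if not in_section:
            if stripped == "Temperature":
                in_section = True
                section_indent = indent
            continue
        if indent <= section_indent:
            break  # a line at the header's indentation starts the next section
        name, _, value = stripped.partition(":")
        match = re.fullmatch(r"(-?\d+) C", value.strip())
        values[name.strip()] = int(match.group(1)) if match else None
    return values


def headroom_to_slowdown(temps: dict):
    """Degrees C left before the slowdown limit, or None if either value is missing."""
    current = temps.get("GPU Current Temp")
    slowdown = temps.get("GPU Slowdown Temp")
    if current is None or slowdown is None:
        return None
    return slowdown - current
```

On a machine with the driver installed, headroom_to_slowdown(parse_temperature_section(query_nvidia_smi())) gives the margin to the slowdown temperature; running it periodically during a GPU-BURN test shows how close the card gets to throttling.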