Tesla T4 always stays in P0 pstate

Hi,

I’m wondering if anybody has any clues as to why the T4 card always stays in the P0 state. Being in P0 all the time forces the server chassis fans to kick in, which is pretty loud. So far, I have found one extremely weird workaround: running nvidia-smi with the “-l” option to display the status forces the card to P8 after about 10 seconds, and it stays that way until you terminate nvidia-smi. (Running it multiple times without the “-l” option doesn’t have the same effect.)

The card does react to load and switches to P0 if you run any tasks on it in parallel, then switches to P8 once done.

So, I’m looking for a way to NOT run nvidia-smi constantly and still have the card switch to P8 automatically when there is no load.

Driver Version : 460.32.03
CUDA Version : 11.2

Attached GPUs : 1
GPU 00000000:B6:00.0
    Product Name : Tesla T4
    Product Brand : Tesla
    Display Mode : Enabled
    Display Active : Disabled
    Persistence Mode : Disabled
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    GPU UUID : GPU-9b2140ad-1dd6-e639-b276-29f45f18bd2a
    Minor Number : 0
    VBIOS Version : 90.04.38.00.03

Tue Apr 13 17:54:45 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:B6:00.0 Off |                  Off |
| N/A   35C    P0    15W /  70W |      0MiB / 16127MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Please correctly set up nvidia-persistenced.
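
For anyone landing here later: the output above shows "Persistence Mode : Disabled", and the "-l" workaround is consistent with the card only settling into P8 while something keeps the driver attached. Once nvidia-persistenced is running (how it gets started depends on the distribution; a systemd unit named nvidia-persistenced is a common packaging choice, but treat that name as an assumption and check your system), a small script can confirm the result. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes nvidia-smi is on the PATH and accepts the name, pstate and persistence_mode query fields.

```python
import subprocess

# Query fields assumed to be supported by the installed nvidia-smi.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,pstate,persistence_mode",
    "--format=csv,noheader",
]


def parse_gpu_status(csv_text):
    """Parse lines such as 'Tesla T4, P0, Disabled' into dictionaries."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split from the right so a comma inside the GPU name would survive.
        name, pstate, persistence = (f.strip() for f in line.rsplit(",", 2))
        gpus.append(
            {
                "name": name,
                "pstate": pstate,
                "persistence": persistence == "Enabled",
            }
        )
    return gpus


def read_gpu_status():
    """Run nvidia-smi once and return the parsed status of every GPU."""
    result = subprocess.run(QUERY, check=True, capture_output=True, text=True)
    return parse_gpu_status(result.stdout)
```

Calling read_gpu_status() a short while after the daemon has started, with no load on the card, should report persistence as enabled; if the idle pstate is still P0 at that point, the daemon setup is probably not the whole story.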


Omg, thank you very much for the hint