Request: GPU Memory Junction Temperature via nvidia-smi or NVML API

Please implement this in the Linux driver.

This is a much-needed feature for Linux users, and many projects depend on it.

3 Likes

+1

2 Likes

+1 I need this feature. Either add it or fix your thermal pads. I’m tired of your nonsense, Nvidia.

3 Likes

+1 Temp is a MUST!

3 Likes

I'm currently fine-tuning a machine learning model, and this feature would be very useful. However, constantly monitoring the GPU temps is not the correct solution. Nvidia should replace the FE cards it has sold with ones that have adequate cooling.

3 Likes

Let’s be realistic: that’s just not going to happen, and if it did, it would take so long that the next series of GPUs would be out by then, given the chip shortages.

However, having the ability to monitor the temps is a very quick and easy change that will go a long way for all of us.

3 Likes

You probably missed this.

Also, spamming +1’s after this linked message doesn’t make the process any faster.

1 Like

I am unable to edit my OP, but @wpierce, feel free to edit the OP with the current status. It would be unfair to mark your post as a solution.

Spamming +1’s may actually make the process faster implicitly. More accurately, what it does is serve as a good proxy of the number of customers significantly impacted by this problem (after all, they went through the effort of posting). The higher that number is, the higher Nvidia should prioritise this problem and ought to allocate resources accordingly.

4 Likes

Please add this feature on Linux, thank you.

4 Likes

This is a necessary feature for many users and admins. Please add it for Linux as well. Thanks a lot.

3 Likes

+1
To cool the card adequately, the temperature info is essential.

1 Like

+1
This is an absolutely essential feature with the high VRAM temps of the 30 series cards. I’m able to test cards on a Windows bench to determine undervolting settings for safe operating temperatures before setting them up in Linux, but not every work environment allows for this.

1 Like

What settings have you found to keep the temp in check?