Request: GPU Memory Junction Temperature via nvidia-smi or NVML API

It is insane that this feature was not provided at launch… especially given the thermal disaster with 3090 cards, and I am not even talking about mining. Keeping people updated is one of the few things you are supposed to do, or at least be honest about whether it’s ever coming or not.

3 Likes

I have the following on Ubuntu 20.04 LTS:

Fan control: nan0s7/nfancurve on GitHub, a small and lightweight POSIX script for using a custom fan curve in Linux for those with an Nvidia GPU.
Temp monitoring: hardware sensors. “indicator-sensors” from snap.

As you can see over there, I haven’t been able to set it up right for my 3090 without breaking my multi-monitor setup. But it works if you have only one monitor or know how to modify the xorg.conf file.

2 Likes

Does it read memory junction temp? If not, it’s off topic for this feature request. NVIDIA, help us with this, as it has a great influence on keeping our VALUABLE cards healthy.

3 Likes

What else can we do to get the attention of Nvidia? This thread isn’t helping much… unfortunately. The requested feature is extremely important for the cards and their functionality/safety!

3 Likes

+1 Please get this feature in ASAP! The memory temps don’t always correlate very well with core temps, which is a massive pain. I want to prolong the card’s life as much as possible.

6 Likes

We need this! Can’t monitor them on HiveOS!

6 Likes

Any updates on when we will get an update? :D

5 Likes

Can we please get an update on this?

Provide us with an ETA please, much needed functionality.

5 Likes

Following

1 Like

Coming up on 2 months and still no response. What’s the word, Nvidia? Your customers are getting pretty irritated.

2 Likes

Paging Doctor NVIDIA. Nvidia urgently needed in the ER.

Seriously: I do not understand why you provide readings from such an important sensor under Windows, when simply switching operating systems is enough to leave applications unable to react to high temperature in any way. If I hadn’t installed Windows for fun, I would never have found out that my GeForce was running at 110C.

1 Like

Hello, is anyone from Nvidia here? This should not be SO difficult.

1 Like

As a software developer, I can see how this functionality isn’t a top priority for NVIDIA, because it wouldn’t make them money (you already bought the cards) and because it’s not a security flaw. Still, as a software developer, I really can’t see why it would take so long to add the VRAM temps to the NVIDIA Linux APIs. The temps should be there; they’re just not exposed to the user. Or maybe they never implemented VRAM sensor monitoring on Linux at all, and idk how long that would take. I just want to remind people that money comes first (return on investment), but I too desperately need this functionality. @wpierce it would be very nice for us to get another update on the matter, so could you, please? Is it at least in the sprint?
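
For context, here is a minimal sketch of what can be queried from a script today, assuming `nvidia-smi` is on the PATH and accepts the `temperature.gpu` and `temperature.memory` query fields. As far as I know, `temperature.memory` is documented for HBM memory and is not the memory junction sensor this thread is asking for; on cards that do not report it, the value comes back as N/A. The helper names `parse_temps` and `read_temps` are made up for illustration.

```python
import subprocess

# Query fields passed to nvidia-smi; temperature.memory may be N/A on many cards.
QUERY = "index,temperature.gpu,temperature.memory"


def parse_temps(csv_text):
    """Parse nvidia-smi CSV rows into dicts; unreported values become None."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip anything that is not an index,core,memory row
        index, core, mem = fields
        rows.append({
            "index": int(index),
            "core_c": int(core) if core.isdigit() else None,
            "memory_c": int(mem) if mem.isdigit() else None,
        })
    return rows


def read_temps():
    """Run nvidia-smi once and return the parsed temperatures (hypothetical helper)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temps(out)
```

Calling `read_temps()` on a machine with the driver installed returns one dict per GPU. A `memory_c` of `None` means the driver did not report a memory temperature for that card, which is exactly the gap being discussed here.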

2 Likes

A new stable driver was just released. Hopefully this functionality will be included in the next development cycle!

Yes, please implement this ASAP, Nvidia. There are a lot of people who need it pretty much right away due to extremely high VRAM temps. Thanks in advance for a quick implementation!

1 Like

https://www.nvidia.co.uk/Download/driverResults.aspx/177396/en-uk
Still nothing apparently…

1 Like

How can it be impossible to perform this monitoring, which is critical to maintaining the system? Deep learning researchers like me (non-commercial) spend their own money on insanely expensive hardware only to find out that there is no way to safely use that hardware. I would feel a lot better about the situation if NVIDIA were more concrete about when the fix will be coming, or at least replied to this thread more than once every 3 months.

4 Likes