Request: GPU Memory Junction Temperature via nvidia-smi or NVML API

Thanks - I will definitely use this as ammo.
Out of interest, which cable did you replace? Was it the GPU power cable?

Thanks for sharing - and this one use case is firmly on my list.

USB cable (running the card on a riser).

@nadeemm yeah we know how you keep those lists - “things I should find an excuse for, next time someone asks for X or Y”

Don’t worry, folks. Memory temperature isn’t applicable to anything. nvidia-smi told me so.

➜  ~ nvidia-smi -q |grep "Memory Current Temp"
        Memory Current Temp               : N/A
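For anyone who wants to log this field anyway (for example, to notice the day a driver starts reporting it), here is a minimal Python sketch that parses the same `nvidia-smi -q` output. The function names `parse_memory_temps` and `read_memory_temps` are made up for illustration, and the sketch assumes that a reported value looks like `84 C`; on the cards discussed in this thread it will simply yield `None`.

```python
import re
import subprocess
from typing import List, Optional

# Matches lines such as "        Memory Current Temp               : N/A"
_TEMP_LINE = re.compile(r"^\s*Memory Current Temp\s*:\s*(.+?)\s*$", re.MULTILINE)


def parse_memory_temps(report: str) -> List[Optional[int]]:
    """Return one entry per GPU found in an `nvidia-smi -q` report.

    An entry is the temperature in degrees Celsius, or None when the
    driver prints "N/A" (or anything else that is not "<number> C").
    """
    temps: List[Optional[int]] = []
    for value in _TEMP_LINE.findall(report):
        match = re.match(r"(\d+)\s*C$", value)  # assumed format: "84 C"
        temps.append(int(match.group(1)) if match else None)
    return temps


def read_memory_temps() -> List[Optional[int]]:
    """Run `nvidia-smi -q` and parse its output (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=True
    )
    return parse_memory_temps(result.stdout)


if __name__ == "__main__":
    # Sample taken from the output quoted above, so this runs without a GPU.
    sample = "        Memory Current Temp               : N/A\n"
    print(parse_memory_temps(sample))  # prints [None]
```

Calling `read_memory_temps()` in a loop alongside your training job would at least show whether the value ever changes from `None` after a driver update.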

I would also like to request that this be included on Linux. I am working on AI/ML with my 3090s on Linux, and quite often I run into issues that I think are thermal related but can’t prove, because no temperature info is available.

I am also a researcher and use a 3090 for my Linux workflow. I just got a paper accepted in part thanks to my 3090 (preprint: DeepVASP-E: A Flexible Analysis of Electrostatic Isopotentials for Finding and Explaining Mechanisms that Control Binding Specificity | bioRxiv), but I really could have used the memory junction temperature as well, since my card throttles at times under heavy load!


Nvidia, please add this function so we can keep memory temperature under control under heavy load in Linux (Ubuntu).

Aren’t there already hundreds of people loyal to your products demanding better software support for the product they just bought? Do you love being criticized? Do you enjoy upsetting the community?
Give people the peace of mind that they won’t burn out their hardware running heavy loads 24/7 by sharing a proper monitoring tool, not pointless promises about Micron specs and all of that cooking! THAT’S IT!


Please implement memory junction temperature monitoring in Linux at last.


All we are asking is whether we can have the same “GPU Memory Junction Temperature” value in Linux that HWiNFO, GPU-Z, and other programs show in Windows (especially for GPUs with GDDR6X memory). Even if it is not the exact Tj max temp of the chip, it still gives a very close estimate of the hottest point on the memory module.

I understand that this information is not native to the NVML API (on either Windows or Linux), but shouldn’t it be added?


Can you do the same comparison for an RTX 3090?

Can you please get a team to legitimately fix this feature in Linux? The length of time this has taken to come out is almost an embarrassment for Nvidia card owners at this point.