It would be extremely beneficial to be able to monitor the GPU memory junction temperature on Linux through nvidia-smi or the NVML API.
With the latest RTX 3080/3090 series cards using GDDR6X, there are growing concerns about the temperature of the memory junction. It has been observed that performance is generally throttled at around 110 °C. Because hitting that limit costs performance, it would be great if Nvidia could expose this temperature through nvidia-smi and the NVML API. As far as I know, the latest Windows drivers can already read these sensors and expose them through NVAPI. Presumably the Linux drivers can read them too; if not, can we get that capability added, along with query access through the NVML API or the nvidia-smi tool?
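For reference, here is a rough sketch of how I would poll this from a script if the value were available. It assumes your nvidia-smi accepts the `temperature.memory` query field (check `nvidia-smi --help-query-gpu`; as far as I can tell it is documented for HBM memory, so I would expect it to come back as N/A on GDDR6X cards). The function names and dictionary keys are my own, not part of any Nvidia tool.

```python
import csv
import io
import subprocess

# Fields passed to nvidia-smi --query-gpu. "temperature.memory" is an
# assumption: it may be missing or report N/A depending on driver and GPU.
QUERY_FIELDS = ["index", "name", "temperature.gpu", "temperature.memory"]


def _to_int(value):
    """Return the value as an int, or None for N/A-style placeholders."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_temperatures(csv_text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for record in reader:
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        index, name, core, memory = (field.strip() for field in record)
        rows.append({
            "index": int(index),
            "name": name,
            "core_c": _to_int(core),      # GPU core temperature
            "memory_c": _to_int(memory),  # None when not reported
        })
    return rows


def query_temperatures():
    """Run nvidia-smi and return the parsed temperatures."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_temperatures(out)
```

Calling `query_temperatures()` on a machine with the driver installed returns one entry per GPU. A `memory_c` of `None` means the driver did not report a memory temperature, which is exactly the gap this request is about.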
I know there are a bunch of people who want this functionality, so if you’re reading this and are one of them, please post your thoughts or confirm your agreement to convince Nvidia this is a priority.