It’s becoming a real pain to check each card one at a time on a Windows box just to figure out whether a card is going to overheat its memory in the racks under Linux. Our compute load is heavy on memory, and when we are adding 100 GPUs a month to our compute/render farm, not having memory junction/memory temps in Linux is really becoming a problem. nvidia-smi needs this like yesterday. What is taking so long to get this done?