How to read GPU temperature from CLI

I’ve found various ways to probe nvidia-settings for the information, for example:
nvidia-settings -g gpucoretemp
or
nvidia-settings -q=gpu:0
but whatever I do, I get a complaint about the display:

nvidia-settings -g gpucoretemp
ERROR: Cannot open display ''.

nvidia-settings -q=gpu:0
ERROR: The control display is undefined; please run nvidia-settings --help for usage information.

I assume this is somehow related to this being a headless server.
So, any suggestions?

This is easiest to do via nvidia-smi.

nvidia-settings will also work, but as you noticed, you have to be picky with the query flags, and it may not list all your GPUs if they aren’t in the xorg.conf file.

But do it like this:
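A minimal sketch of one way to do this from a script, not necessarily the command this reply originally showed. It assumes a driver recent enough that nvidia-smi supports the --query-gpu and --format options; the function names parse_temps and read_temps are hypothetical helpers made up for the example.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV line per GPU: index, name, core temperature in C.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(csv_text):
    """Turn lines like '0, Some GPU, 72' into [(0, 'Some GPU', 72), ...]."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # The index is the first field and the temperature the last one,
        # so a comma inside the GPU name would not break the split.
        index, rest = line.split(",", 1)
        name, temp = rest.rsplit(",", 1)
        temp = temp.strip()
        # Boards without a sensor report a non-numeric placeholder.
        rows.append((int(index), name.strip(), int(temp) if temp.isdigit() else None))
    return rows


def read_temps():
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_temps(result.stdout)


if __name__ == "__main__":
    for index, name, temp in read_temps():
        print(f"GPU {index} ({name}): {temp} C")
```

No X display is involved, so this should also work over SSH on a headless server. The index printed is the nvidia-smi index, which is not necessarily the CUDA device number.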

Also note that nvidia-smi device numbers are not the same as nvidia-settings device numbers, which are not the same as CUDA device numbers… always confusing.

On the PC I’m running now, I have 3 different cards installed at once. To nvidia-settings, the GT240 is gpu:0. To nvidia-smi, it’s GPU 2. To CUDA, it’s device 3.

I always add a query flag to my own tools that simply lists the CUDA devices, so I can decide which ones to activate manually, since you can’t really query the mapping reliably.

I build this list via simple calls to cudaGetDeviceProperties for each device number.
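The kind of listing described above can be sketched in Python. This is not the poster's tool, and it swaps in a different technique: instead of cudaGetDeviceProperties from the runtime API, it calls the CUDA driver API through ctypes (cuInit, cuDeviceGetCount, cuDeviceGet, cuDeviceGetName, cuDeviceGetPCIBusId), which avoids depending on the layout of the cudaDeviceProp struct. The library file names and the function name list_cuda_devices are assumptions, and cuDeviceGetPCIBusId is, as far as I know, only available in CUDA releases later than the 3.x toolkit discussed in this thread.

```python
import ctypes


def load_cuda_driver():
    """Try the usual driver library names (assumed); return None if none loads."""
    for candidate in ("libcuda.so.1", "libcuda.so", "nvcuda.dll"):
        try:
            return ctypes.CDLL(candidate)
        except OSError:
            continue
    return None


def list_cuda_devices():
    """Return [(cuda_ordinal, name, pci_bus_id), ...], or [] without a usable driver."""
    lib = load_cuda_driver()
    if lib is None:
        return []
    # Every driver API call returns 0 (CUDA_SUCCESS) when it succeeds.
    if lib.cuInit(0) != 0:
        return []
    count = ctypes.c_int(0)
    if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return []

    devices = []
    for ordinal in range(count.value):
        dev = ctypes.c_int(0)  # CUdevice is a plain int handle
        if lib.cuDeviceGet(ctypes.byref(dev), ordinal) != 0:
            continue
        name = ctypes.create_string_buffer(128)
        bus_id = ctypes.create_string_buffer(32)
        lib.cuDeviceGetName(name, len(name), dev)
        lib.cuDeviceGetPCIBusId(bus_id, len(bus_id), dev)
        devices.append((ordinal, name.value.decode(), bus_id.value.decode()))
    return devices


if __name__ == "__main__":
    for ordinal, name, bus_id in list_cuda_devices():
        print(f"CUDA device {ordinal}: {name} [PCI {bus_id}]")
```

The PCI bus ID is the useful part: nvidia-smi can print the same identifier (for example through its pci.bus_id query field), so the two numbering schemes can be matched on it. The formatting of the ID may differ slightly between the two tools, so comparing the trailing bus:device.function part is probably safer than comparing whole strings.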

Yeah, I’m trying to add a device property to give you a good mapping. Hopefully in 3.1.

I call the following command on Windows. Hope it helps.

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi -q | findstr Temp

Temperature
    GPU Current Temp            : 72 C
    GPU Shutdown Temp           : 99 C
    GPU Slowdown Temp           : 96 C
    GPU Max Operating Temp      : N/A
    Memory Current Temp         : N/A
    Memory Max Operating Temp   : N/A
Temperature
    GPU Current Temp            : 73 C
    GPU Shutdown Temp           : 97 C
    GPU Slowdown Temp           : 92 C
    GPU Max Operating Temp      : N/A
    Memory Current Temp         : N/A
    Memory Max Operating Temp   : N/A
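For scripting, the same report can be filtered without findstr or grep. The sketch below is only an illustration: it assumes the "GPU Current Temp" label shown in the output above (other driver versions may label the field differently), and current_temps is a hypothetical helper name.

```python
import re

# Matches lines such as "    GPU Current Temp            : 72 C".
TEMP_LINE = re.compile(r"^\s*GPU Current Temp\s*:\s*(\d+)\s*C\s*$")


def current_temps(report):
    """Return the 'GPU Current Temp' values in the order the GPUs appear."""
    temps = []
    for line in report.splitlines():
        match = TEMP_LINE.match(line)
        if match:
            temps.append(int(match.group(1)))
    return temps


# Two lines copied from the report above, plus one that must be ignored.
SAMPLE = """\
    GPU Current Temp            : 72 C
    Memory Current Temp         : N/A
    GPU Current Temp            : 73 C
"""

if __name__ == "__main__":
    print(current_temps(SAMPLE))  # prints [72, 73]
```

To use it on live data, pass the text of nvidia-smi -q to current_temps, for example by reading it from standard input. Fields reported as N/A are skipped rather than turned into numbers.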