Read temperature without X11: GeForce 295 GTX second GPU fault?

I am having problems with “device 1” of a new 295 used for GPGPU compute only.
I am not sure what the fault is, but device 1 sometimes gives a different answer from device 0.
I thought I would look at how hot it gets while running a calculation.

To avoid the kernel timeout, neither device is connected to X11.
Hence nvidia-settings does not show anything useful.
I am hoping someone has some code to read the equivalent of
NV_CTRL_GPU_CORE_TEMPERATURE that works without X Windows.

Any help or comments would be very welcome.


nvidia-smi can show GPU temperatures, and it can run as a background process, polling at a controlled interval and writing to a log. That should get you what you need. Run it with -h to see the full range of options.
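For anyone who would rather drive the polling from a script, here is a rough Python sketch of the same idea. It assumes a driver recent enough to support the --query-gpu interface of nvidia-smi, which an older install may not have; the function names and the log file layout (timestamp, GPU index, temperature) are made up for illustration.

```python
import subprocess
import time

# Assumption: this nvidia-smi build supports the --query-gpu interface.
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_query_output(text):
    """Turn CSV lines such as '0, 48' into [(gpu_index, temp_c), ...]."""
    readings = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        try:
            readings.append((int(fields[0]), int(fields[1])))
        except ValueError:
            continue  # e.g. a non-numeric reading; skip it
    return readings


def poll(log_path, interval_s=60, samples=None):
    """Append 'unix_time gpu_index temp_c' lines to log_path.

    Runs forever when samples is None; no X server is needed.
    """
    taken = 0
    with open(log_path, "a") as log:
        while samples is None or taken < samples:
            result = subprocess.run(
                QUERY_CMD, capture_output=True, text=True, check=True
            )
            now = int(time.time())
            for index, temp_c in parse_query_output(result.stdout):
                log.write(f"{now} {index} {temp_c}\n")
            log.flush()
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval_s)
```

Calling poll("gpu_temps.log") would then log every 60 seconds until interrupted. The three-column output is plain whitespace-separated text, so it should be easy to feed to a plotting tool.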

nvidia-smi -l -i 60 -lsa
spits out
Temperature : 48 C (the idle half)
Temperature : 65 C (GPU 1: which is in use)
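In case it is useful, a small Python sketch for pulling the numbers out of a log made of lines like the two above. It assumes every reading appears as "Temperature : NN C" and that each poll prints one line per GPU in the same order; the function names are invented for this example.

```python
import re

# Matches lines such as "Temperature : 48 C"; trailing text is ignored.
TEMP_LINE = re.compile(r"Temperature\s*:\s*(-?\d+)\s*C")


def extract_temperatures(log_text):
    """Return every temperature in the log, in order, as integers (deg C)."""
    return [int(match.group(1)) for match in TEMP_LINE.finditer(log_text)]


def group_by_sample(temps, gpu_count=2):
    """Group a flat list into one tuple per poll, dropping a partial tail.

    Assumes each poll printed exactly gpu_count lines, GPU 0 first.
    """
    usable = len(temps) - len(temps) % gpu_count
    return [tuple(temps[i:i + gpu_count]) for i in range(0, usable, gpu_count)]


sample_log = "Temperature : 48 C\nTemperature : 65 C\n"
print(group_by_sample(extract_temperatures(sample_log)))
```

Each tuple holds one reading per GPU for a single poll, so the two halves of the card can be compared side by side over time.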

PS: The top half of the GeForce 295 has been running for more than 4 hours without the error being detected.
So perhaps 150 F is OK?
PPS: The nVidia spec says 105 C maximum.
(Hmm, maybe this is why I am not seeing the error…)
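To check the arithmetic on those numbers, a tiny Python sketch. The 65 C reading and the 105 C figure are the ones quoted above; treating 105 C as a hard limit to measure headroom against is an assumption, and the function names are invented here.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def headroom_c(reading_c, limit_c=105):
    """Degrees Celsius left before the assumed limit is reached."""
    return limit_c - reading_c


print(c_to_f(65))      # 149.0, i.e. roughly the 150 F mentioned above
print(headroom_c(65))  # 40
```

So the busy GPU at 65 C sits about 40 C below the quoted figure.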

A little gnuplot script to display the historical trend of the two 295 GTX temperatures
should be attached.
