We installed some S1070 units and would like to monitor their temperature to make sure we don’t reduce them to expensive copper nuggets. We tried to use nvidia-smi to get the temperature values, but sadly it only seems to support the older S870. Is there a version that will work with our S1070? Is there any other way to get the temperatures?
How are you invoking nvidia-smi?
Are you saying that nvidia-smi.log contains no data?
Which driver are you using?
I am invoking nvidia-smi without any parameters.
I upgraded to 180.06 because 177.80 told me “Failed to get chip temperature for device 0xa55a0000, err 29”. Now 180.06 claims “Could not get Thermal info!”. For some reason I was expecting the tool to write the temperature to the console, so I thought it wasn’t working, and the help message only speaks of the old Tesla.
However, now that you mention the log file, it does seem to contain the temperatures, but only for the cores. The intake and exhaust temperatures and the fan readings are all at 0, and I know the fans are running. With the earlier driver, only the LED state seems to be working. Thanks for the clarification.
Please try this driver:
http://www.nvidia.com/object/linux_display…_177.70.18.html
I’ve filed bug 494135 for the issues with 180.06.
That one seems to work and properly reports the intake and fan status, too.
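In case it helps anyone else watching these units: since the readings end up in nvidia-smi.log rather than on the console, a small script can pull the relevant lines out of the log. This is only a rough sketch. The log location, the keyword filter, and the function name extract_readings are my own assumptions, not anything documented for nvidia-smi, so adjust them to whatever your log actually looks like.

```python
import re
from pathlib import Path

# Assumption: readings sit on lines that mention "temp" or "fan".
# The real layout of nvidia-smi.log is not documented here, so adjust as needed.
KEYWORDS = re.compile(r"temp|fan", re.IGNORECASE)


def extract_readings(log_text):
    """Return the stripped lines of log_text that mention a temperature or fan."""
    return [line.strip() for line in log_text.splitlines() if KEYWORDS.search(line)]


if __name__ == "__main__":
    # Hypothetical location: the log in the current working directory.
    log_path = Path("nvidia-smi.log")
    if log_path.exists():
        for reading in extract_readings(log_path.read_text(errors="replace")):
            print(reading)
    else:
        print(f"{log_path} not found")
```

Run from cron or a watch loop, this at least makes it obvious when the intake, exhaust, or fan values are stuck at 0.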