Power measurements using nvidia-smi

Has anyone measured power on a GTX 1080 using the nvidia-smi tool? How accurate are those numbers? I’m trying to calculate performance/watt for cublas sgemm.

Thanks

I don’t own a GTX 1080, but here are some potentially relevant reminders about GPU power measurements with nvidia-smi:

[1] The built-in sensors queried by nvidia-smi are not highly accurate, individually calibrated, devices. An error margin of +/- 5% should be assumed.

[2] GPU power draw is somewhat dependent on operating temperature, presumably due to Ohmic components in both the GPU and on-board memory power draw. This temperature-dependent component can be in the 10% region. Since the GTX 1080 is an actively cooled card, where the fan(s) contribute to the total power consumption, higher operating temperature usually means the fans are spinning faster and drawing more power. Steady-state is typically achieved after applying the same workload continuously for about five minutes.

[3] These cards have automatic clock boosting features that adjust GPU clocks upwards from the specified base-line operating frequency as power draw and temperature allow. Because of manufacturing tolerances in all the components that go into a GPU, the clock boost that is achieved can vary between individual cards, even if they are all the same SKU from the same vendor.

[4] GPUs have a configurable power limit, and the “Enforced Power Limit” is usually configured lower than the “Maximum Power Limit” by default. You can change the power limit within the interval specified by the minimum and maximum values reported with nvidia-smi.

[5] Don’t be surprised if you find that the observed power draw when running *GEMM is significantly below the specified maximum, and below the power draw from other workloads (e.g. certain Folding@Home kernels).
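To make points [2] and [5] concrete, here is a minimal Python sketch of how one might sample the board power while the SGEMM loop runs, discard the warm-up period, and turn the average into a GFLOP/s-per-watt figure. It assumes nvidia-smi is on the PATH and that the card supports the power.draw query field. The function names, the one-second polling interval, and the 2*m*n*k operation count for SGEMM are my own choices for illustration, not something taken from NVIDIA's tooling.

```python
import subprocess
import time


def parse_power_watts(text):
    """Parse nvidia-smi CSV output (no header, no units) into floats."""
    readings = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            readings.append(float(line))
        except ValueError:
            # Skip non-numeric entries such as "[N/A]" (sensor unavailable).
            continue
    return readings


def read_power_watts(gpu_index=0):
    """Return the last measured board power draw of one GPU, in watts."""
    result = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    readings = parse_power_watts(result.stdout)
    if not readings:
        raise RuntimeError("no power reading available for this GPU")
    return readings[0]


def sample_power(duration_s, interval_s=1.0, gpu_index=0):
    """Poll the sensor for duration_s seconds; return (time, watts) pairs."""
    samples = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        samples.append((time.monotonic(), read_power_watts(gpu_index)))
        time.sleep(interval_s)
    return samples


def steady_state_mean(samples, warmup_s):
    """Average the readings taken at least warmup_s after the first sample."""
    if not samples:
        raise ValueError("no samples")
    t0 = samples[0][0]
    kept = [watts for t, watts in samples if t - t0 >= warmup_s]
    if not kept:
        raise ValueError("warm-up period covers all samples")
    return sum(kept) / len(kept)


def sgemm_gflops_per_watt(m, n, k, seconds_per_call, avg_watts):
    """GFLOP/s per watt, counting 2*m*n*k operations per SGEMM call."""
    gflops = 2.0 * m * n * k / seconds_per_call / 1e9
    return gflops / avg_watts
```

With the SGEMM loop running in another process, something like steady_state_mean(sample_power(600), warmup_s=300) would average only the second five minutes, in line with the warm-up time mentioned in [2]. The result still carries the sensor error from [1], and because of [3] it applies to the one card measured rather than to the SKU in general.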

Thanks for the info. Does the GPU power draw reported by nvidia-smi include DRAM power in addition to the power drawn by the fan?

My understanding is that the sensors queried by nvidia-smi are part of the power supply on the card and therefore report whole-card power consumption. I cannot prove that this is the case. If you cannot find language about this in existing documentation, but need confirmation, consider making an official request for clarification to NVIDIA.

No problem. I found this in the nvidia-smi guide:

Power Draw: The last measured power draw for the entire board, in watts. Only available if power management is supported. This reading is accurate to within +/- 5 watts.

So, it must be whole card power.

Good find (“power draw for the entire board”). I am surprised that they state the tolerance level as an absolute number rather than a relative percentage. Based on my rudimentary knowledge of electronics, I would have thought it would have to be the latter.
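To illustrate why the distinction matters for a performance/watt figure, here is a small Python sketch that converts the documented +/- 5 W absolute tolerance into a relative error at a given reading, and into a range for the resulting GFLOP/s per watt. The function names and the example numbers are hypothetical. The sketch treats the tolerance as a simple worst-case bound on the reading, which is an assumption on my part.

```python
def relative_error_percent(abs_tolerance_w, reading_w):
    """Relative error (in percent) implied by an absolute tolerance."""
    if reading_w <= 0:
        raise ValueError("reading must be positive")
    return 100.0 * abs_tolerance_w / reading_w


def perf_per_watt_bounds(gflops, reading_w, abs_tolerance_w=5.0):
    """Worst-case (low, high) GFLOP/s per watt for a reading of reading_w.

    The true power is assumed to lie within reading_w +/- abs_tolerance_w.
    """
    if reading_w <= abs_tolerance_w:
        raise ValueError("reading must exceed the tolerance")
    low = gflops / (reading_w + abs_tolerance_w)
    high = gflops / (reading_w - abs_tolerance_w)
    return low, high
```

For example, relative_error_percent(5.0, 100.0) gives 5.0, while relative_error_percent(5.0, 50.0) gives 10.0. A fixed +/- 5 W therefore matches a +/- 5% margin only at a 100 W reading: it is looser in relative terms below that and tighter above it.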