[Jetson-TK1] Reducing power consumption

Hi all,

In short my question is: Is it possible to reduce the power consumption of the jetson-tk1?

I am doing experiments with the power consumption of the Tegra K1 SoC and I need to eliminate as much energy consumption as possible from the board. I have disconnected all peripherals except the ethernet port, and I run a stripped-down version of Ubuntu with only openssh-server installed.

I already know that the fan draws about 1.5 watts, and that the board, in idle state, draws around 4 watts. I should also mention that I have connected an INA219 power monitor over the I2C interface and am taking live power readings from the CPU. :) I guess I can get rid of the fan power draw by installing a passive cooling device.

You could set any unused I/O ports to tristate open.

If you are not using HDMI, I think disabling that saves some power, but it may have an effect on CUDA apps.

Could you share some details (schematics) about your INA setup? I’m sure that would be interesting to many.

To reduce power draw, I recommend you do all 4 of these:

  1. Follow the various sections in Controlling the CPU performance to reduce the CPU speed of the 4 cores to be as low as possible for your application.

  2. Follow Controlling GPU performance to reduce the GPU speed as much as is possible for your application.

  3. Also, if you don’t need anything displayed on a screen then make sure nothing is plugged into your HDMI port, and possibly also disable it.

  4. And if you won’t be running intense code for long periods, then replace the fan with a heatsink or heatspreader.
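Before lowering anything for steps 1 and 2, it helps to see what the cores are currently doing. The wiki sections named above are not reproduced in this thread, so the following is only a minimal sketch that assumes the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/`). The function name `read_cpufreq` is made up for illustration, and the script only reads values; it changes nothing.

```python
import glob
import os


def read_cpufreq(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {"governor": str, "cur_khz": int}} for cores exposing cpufreq."""
    info = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq")
    for cpufreq_dir in sorted(glob.glob(pattern)):
        cpu_name = os.path.basename(os.path.dirname(cpufreq_dir))
        try:
            with open(os.path.join(cpufreq_dir, "scaling_governor")) as f:
                governor = f.read().strip()
            with open(os.path.join(cpufreq_dir, "scaling_cur_freq")) as f:
                cur_khz = int(f.read().strip())
        except (OSError, ValueError):
            continue  # core is offline or the attribute is not readable
        info[cpu_name] = {"governor": governor, "cur_khz": cur_khz}
    return info


if __name__ == "__main__":
    for cpu, state in read_cpufreq().items():
        print(f"{cpu}: governor={state['governor']} "
              f"freq={state['cur_khz'] / 1000:.0f} MHz")
```

Cores that are switched off simply do not show up in the output, which is a quick way to check whether hotplugging had any effect.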

@ShervinE,

Could you provide some potential hints as to why I’m having trouble with #1 on your list:

https://devtalk.nvidia.com/default/topic/783330/embedded-systems/disable-4-main-cores-and-run-on-5th-low-power-core/

BTW, disabling HDMI can help a lot! Furthermore, you can save ~1 watt just by lowering the resolution, since a higher resolution stresses the HDMI output even more (this surprised me a bit).

EDIT: Changed #2 --> #1, i.e. I am having trouble turning CPUs on or off

Also don’t forget that USB devices draw power from the host… fewer USB devices and lower-power USB devices mean less power. If you want to cheat, use a powered USB hub, which removes the Jetson as a power source… not exactly lower total power, but it does remove that draw from the Jetson.

Actually, I have found that setting the “powersave” frequency scaling governor increases the total energy consumption of the board. This is because the board has a high idle consumption from other components (about 4 W, of which 1.5 W is the fan).

This is partly why I am asking for other methods to reduce the power of the various board components :) But thank you for the input, I will definitely see if turning off HDMI completely has any effect.

Aside from that, I use no other IO than the ethernet port.

Could you elaborate on this? The idle consumption is more in the range of 160 mA => 1.92 watt… but are you telling me that peripheral components increase their power consumption as the CPUs are downclocked?

Yes, I can explain this from the viewpoint of some tests I’ve been running.

I’m running a set of experiments where I encode videos under various optimisation choices (such as NEON, GPU offloading and different algorithms). I also run the encoding under different frequency scaling governors.

When I use the performance governor, the encoding process can take as little as 6 seconds. However, under powersave, the frequency is scaled down, making the runtime very long (several minutes). If you consider only the CPU’s energy consumption, then yes, you may save energy under the powersave governor compared with others such as “performance”. However, the long runtime makes the whole platform consume more energy in total.

For example, let’s say the board has an idle draw of 4 W. The CPU has an individual, additional power draw of 10 mW at 2 MHz, and 1 W at 2.0 GHz.

At the lowest CPU frequency, a program takes 10 minutes to run. This translates to a total platform energy consumption of

E = P*T = (4 + 0.01) * ((10*60)/3600) = 668 mWh.

The “CPU only” additional energy consumption in powersave is 0.01 * ((10*60)/3600) = 1.67 mWh.

For the highest frequency setting, the runtime is 10 s. Total energy consumption is

E = (4 + 1) * (10/3600) = 13.9 mWh - this is much lower than for powersave (668 mWh)

The “CPU only” additional energy consumption is 1 * (10/3600) = 2.78 mWh. This is a little higher than powersave’s “CPU additional” energy consumption (1.67 mWh).

So, the frequency scaling governor can optimise the consumption of the CPU only, not including additional power from other components. In practice this is true if you have some work you’re waiting on to finish.
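For anyone who wants to redo the arithmetic with their own measurements, here is a small sketch of the same E = P*T calculation. The power and runtime figures are the illustrative ones from this example, not measurements, and the helper name `energy_mwh` is just something I made up.

```python
def energy_mwh(power_w, runtime_s):
    """Energy in mWh for a constant power draw (W) sustained for runtime_s seconds."""
    return power_w * (runtime_s / 3600.0) * 1000.0


IDLE_W = 4.0  # assumed board baseline draw, including the fan

# governor -> (additional CPU power in W, runtime in s), example values only
cases = {
    "powersave": (0.01, 10 * 60),
    "performance": (1.0, 10),
}

results = {}
for name, (cpu_w, runtime_s) in cases.items():
    results[name] = {
        "total_mwh": energy_mwh(IDLE_W + cpu_w, runtime_s),
        "cpu_only_mwh": energy_mwh(cpu_w, runtime_s),
    }
    print(f"{name:12s} total={results[name]['total_mwh']:8.2f} mWh  "
          f"cpu_only={results[name]['cpu_only_mwh']:5.2f} mWh")
```

Swapping in a measured baseline and measured runtimes shows quickly whether the long runtime or the higher CPU power dominates for a given workload.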

In practice these things are more complicated, though, because the CPU power is governed by the infamous “frequency scaling formula”. Core power consists of dynamic power (switching activity in the processor that occurs as a result of program execution) and static power (transistor leakage). These terms in turn depend on core voltage, number of transistors in the core, core frequency, core utilisation and core capacitance. I submitted a poster about this to the NVIDIA GPU Technology Conference this year, hope I get through ;) But this should sum up what I mean. Sorry for the long explanation…
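The formula itself is not written out in this thread. A commonly used first-order CMOS approximation, given here only as a sketch with symbol names that are my own choice, is:

```latex
P_{\text{core}}
  = \underbrace{\alpha \, C \, V^{2} f}_{\text{dynamic}}
  + \underbrace{V \, I_{\text{leak}}}_{\text{static}}
```

Here $\alpha$ is the activity factor (roughly, core utilisation), $C$ the switched capacitance, $V$ the core voltage, $f$ the core frequency, and $I_{\text{leak}}$ the leakage current, which grows with the number of transistors. Since lowering $f$ usually also allows a lower $V$, dynamic power falls faster than linearly with frequency, while the static term keeps accumulating for as long as the task runs.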

This is not my area of expertise, but I was under the impression that so-called “race-to-halt”, “race-to-sleep”, or “race-to-idle” approaches were already recognized as the best way to minimize energy use of computational devices. That is, run the device (be it CPU, GPU, etc) at top speed to complete a computational task as quickly as possible and enter a deep sleep mode as soon as possible, with the overall goal of maximizing the time the device spends in deep sleep mode.

Does computing in a hybrid environment such as the Jetson-TK1 change or modify this conventional wisdom about energy-efficient computing in a significant way? I am aware that some environments require power efficiency rather than energy efficiency, e.g. due to thermal or electrical limitations.

That’s what I’m trying to find out. In the case of the Jetson, race-to-finish is best (in my scenario) because of the high idle power of external peripherals. The story would be different if the idle power was lower (close to zero), which is probably the case for a production mobile device. In that case, my hypothesis is that you would want to keep the core frequency as low as possible while meeting application deadlines.
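To put a rough number on “close to zero”: with the same simple accounting as in the worked example (energy is counted only while the task runs, at baseline plus CPU power), you can solve for the baseline power at which the fast and slow settings cost the same energy. This is only a sketch; the function names are made up and the inputs are the illustrative figures from the example, not measurements.

```python
def task_energy_j(baseline_w, cpu_w, runtime_s):
    """Energy in joules spent while the task runs at (baseline + CPU) power."""
    return (baseline_w + cpu_w) * runtime_s


def break_even_baseline_w(fast, slow):
    """Baseline power (W) at which both settings use equal energy.

    fast and slow are (cpu_w, runtime_s) tuples. Derived from
    (b + p_fast) * t_fast == (b + p_slow) * t_slow, solved for b.
    """
    (p_fast, t_fast), (p_slow, t_slow) = fast, slow
    return (p_fast * t_fast - p_slow * t_slow) / (t_slow - t_fast)


FAST = (1.0, 10)          # 1 W extra for 10 s (example "performance" case)
SLOW = (0.01, 10 * 60)    # 10 mW extra for 10 min (example "powersave" case)

b = break_even_baseline_w(FAST, SLOW)
print(f"break-even baseline: {b * 1000:.2f} mW")
for baseline_w in (0.0, b, 4.0):
    fast_j = task_energy_j(baseline_w, *FAST)
    slow_j = task_energy_j(baseline_w, *SLOW)
    print(f"baseline={baseline_w:6.4f} W  fast={fast_j:8.2f} J  slow={slow_j:8.2f} J")
```

With these example numbers the crossover lies at a baseline of only a few milliwatts, so under this model running slowly only pays off when the rest of the platform draws almost nothing.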

That being said, power and energy efficiency go hand in hand…

Thanks, that makes perfect sense. I don’t think there is any point in downclocking a heavily utilized resource.

In my case the GPU is heavily utilized while the CPU is not, hence I’m attempting to either shut down entire cores or just run on the 5th low-power core.

Re: That being said, power and energy efficiency go hand in hand…

In my thinking, there is the potential for conflict between the methods used to achieve power efficiency and energy efficiency. A race-to-idle methodology would indicate that a GPU should always run at the maximum boost clock whenever it is active. But this could violate power and thermal limits, which instead may force clocks to be lowered for particularly intense workloads (e.g. FurMark) while maintaining power-efficient operation.

It is not just clock frequencies playing into this; code construction or generation can play a role as well, for example trade-offs between re-computation and memory access (whether to internal cache or external memory). When coupled with the further complications of static versus dynamic power draw mentioned above, it is not clear to me how closely power efficiency and energy efficiency go hand in hand.

I think that in practice on a GPU you’ll have to experiment a bit with, for example, the shader and memory clocks to find a sweet spot. It’s definitely a non-linear optimization problem, where one can expect near-linear behaviour only when tuning a single parameter within a small search area.

For example, if your problem is memory bound, you can lower the shader clocks a bit, but only up to the point where you start becoming compute bound instead.
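A crude way to picture that limit is a roofline-style model in which kernel time is the larger of compute time and memory time. This is my own simplification, not something measured on the TK1, and every number below (FLOPs per cycle, bandwidth, kernel size) is a hypothetical placeholder.

```python
def kernel_time_s(flops, bytes_moved, clock_hz, flops_per_cycle, bandwidth_bps):
    """Roofline-style estimate: the slower of compute and memory decides the runtime."""
    compute_s = flops / (flops_per_cycle * clock_hz)
    memory_s = bytes_moved / bandwidth_bps
    return max(compute_s, memory_s)


def crossover_clock_hz(flops, bytes_moved, flops_per_cycle, bandwidth_bps):
    """Shader clock below which the kernel stops being memory bound."""
    return flops * bandwidth_bps / (bytes_moved * flops_per_cycle)


# Hypothetical kernel and hardware figures, for illustration only
FLOPS = 16e9            # floating-point operations in the kernel
BYTES = 1e9             # bytes moved to and from memory
FLOPS_PER_CYCLE = 384   # assumed peak throughput per shader clock cycle
BANDWIDTH = 12e9        # assumed sustained memory bandwidth in bytes/s

limit_hz = crossover_clock_hz(FLOPS, BYTES, FLOPS_PER_CYCLE, BANDWIDTH)
print(f"memory bound down to about {limit_hz / 1e6:.0f} MHz")
for clock_mhz in (852, 600, 400):
    t = kernel_time_s(FLOPS, BYTES, clock_mhz * 1e6, FLOPS_PER_CYCLE, BANDWIDTH)
    print(f"{clock_mhz} MHz -> {t * 1000:.1f} ms")
```

In this model the runtime stays flat while the shader clock is above the crossover and grows once it drops below, which is the sweet spot being described. Real kernels overlap compute and memory imperfectly, so measure rather than trust the model.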

Going back to the earlier discussion of how to switch to the 5th shadow core, it turns out that the Wiki was missing some info. So I’ve updated the Wiki to show how to check which cluster & cores & frequencies you are currently running at, and how to switch purely to the 5th shadow core:

I get this when I try to switch between the HP and LP clusters:

dmesg
[768799.626174] cluster/active -> G
[768799.669087] cluster/active -> LP
[768799.710537] cluster/active -> G
[768799.766379] cluster/active -> LP
[768799.808053] cluster/active -> G
[768799.859534] cluster/active -> LP
[768799.901064] cluster/active -> G
[768799.959857] cluster/active -> LP
[768799.960429] tegra12_cpu_cmplx_clk_set_parent: Failed to switch cpu mode to cpu_lp
[768799.977889] tegra12_cpu_cmplx_clk_set_parent: aborted switch from cpu_g to cpu_lp
[768799.996271] cluster/active: request failed (-16)

Dear Moderator, I need to maximize performance without regard to power. Following the instructions, I have maximized the CPU frequency, but I run into a problem when I try this command:

# echo 852000000 > /sys/kernel/debug/clock/override.gbus/rate
echo: write error: No such file or directory

I am sure I am root, and sudo doesn’t work either. I also tried going to /sys/kernel/debug/clock/override.gbus/ and using Vim (as root) to modify the rate file, but I still got an error. I am sure the rate file is not missing…

"rate" E667: Fsync failed
WARNING: Original file may be lost or damaged
don't quit the editor until the file is successfully written!

The same thing happened with the file “state”.
BTW, my TK1 version is 19.3.
Thanks a lot