Power consumption difference between AGX and NX

I’m trying to benchmark power consumption between the Xavier AGX and Xavier NX when configured almost identically. We are trying to keep the power consumption down to around 19W on each Xavier.

I have a custom nvpmodel, pasted below, which is used for both the AGX and the NX. The clocks are reduced so that we can run within a 19W footprint.

The results show the AGX consumes about 19W with a 700 MHz GPU and 1.95 GHz CPU cores. The NX power consumption is much higher, around 24W. The test is started by hand: first the stress-ng load, then the nbody_opengles load (nbody_opengles is from the CUDA samples).

Why could the NX be so much higher? Or why is the AGX power consumption so much lower? Both NVIDIA devkits are running JetPack 4.5.1, and both are connected to displays so I could run the nbody_opengles commands from a console window on the display.

On the AGX, I am running these command-line tools in separate consoles:

$ stress-ng --cpu 8 --cpu-load 80

$ ./nbody_opengles -numbodies=20000

$ sudo tegrastats

The results for the AGX are below. The horizontal axis is the number of seconds.

[image: AGX power consumption over time]

AGX nvpmodel:

< POWER_MODEL ID=7 NAME=MODE_7 >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 1
CPU_ONLINE CORE_2 1
CPU_ONLINE CORE_3 1
CPU_ONLINE CORE_4 1
CPU_ONLINE CORE_5 1
CPU_ONLINE CORE_6 1
CPU_ONLINE CORE_7 1
TPC_POWER_GATING TPC_PG_MASK 0
GPU_POWER_CONTROL_ENABLE GPU_PWR_CNTL_EN on
CPU_DENVER_0 MIN_FREQ 1000000
CPU_DENVER_0 MAX_FREQ 1950000
CPU_DENVER_1 MIN_FREQ 1000000
CPU_DENVER_1 MAX_FREQ 1950000
CPU_DENVER_2 MIN_FREQ 1000000
CPU_DENVER_2 MAX_FREQ 1950000
CPU_DENVER_3 MIN_FREQ 1000000
CPU_DENVER_3 MAX_FREQ 1950000
GPU MIN_FREQ 318750000
GPU MAX_FREQ 700000000
GPU_POWER_CONTROL_DISABLE GPU_PWR_CNTL_DIS auto
EMC MAX_FREQ 1600000000
DLA_CORE MAX_FREQ 896000000
DLA_FALCON MAX_FREQ 524800000
PVA_VPS MAX_FREQ 550400000
PVA_CORE MAX_FREQ 409600000
CVNAS MAX_FREQ 460800000
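
A side note on units in the file above: the nvpmodel CPU frequencies appear to be in kHz while the GPU/EMC frequencies are in Hz, which matches the 1.95 GHz and 700 MHz figures quoted earlier. A small sketch of that conversion (the function name is my own, not part of any NVIDIA tool):

```python
def nvpmodel_freq_mhz(entry_name, value):
    """Convert an nvpmodel frequency entry to MHz.

    Assumption (inferred from the values quoted in this post): CPU *_FREQ
    entries are expressed in kHz, while GPU and EMC entries are in Hz.
    """
    if entry_name.startswith("CPU"):
        return value / 1e3   # kHz -> MHz
    return value / 1e6       # Hz -> MHz

# Sanity checks against the figures quoted in this post:
print(nvpmodel_freq_mhz("CPU_DENVER_0 MAX_FREQ", 1950000))  # 1950.0 (1.95 GHz)
print(nvpmodel_freq_mhz("GPU MAX_FREQ", 700000000))         # 700.0 MHz
print(nvpmodel_freq_mhz("EMC MAX_FREQ", 1600000000))        # 1600.0 MHz
```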

On the NX, I am running these command-line tools in separate consoles:

$ stress-ng --cpu 6 --cpu-load 80

$ ./nbody_opengles -numbodies=20000

$ sudo tegrastats

The results for the NX are below. The horizontal axis is the number of seconds.

[image: NX power consumption over time]

NX Nvpmodel:

< POWER_MODEL ID=15 NAME=MODE_6CORE_GPU >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 1
CPU_ONLINE CORE_2 1
CPU_ONLINE CORE_3 1
CPU_ONLINE CORE_4 1
CPU_ONLINE CORE_5 1
TPC_POWER_GATING TPC_PG_MASK 1
GPU_POWER_CONTROL_ENABLE GPU_PWR_CNTL_EN on
CPU_DENVER_0 MIN_FREQ 1000000
CPU_DENVER_0 MAX_FREQ 1950000
CPU_DENVER_1 MIN_FREQ 1000000
CPU_DENVER_1 MAX_FREQ 1950000
CPU_DENVER_2 MIN_FREQ 1000000
CPU_DENVER_2 MAX_FREQ 1950000
GPU MIN_FREQ 318750000
GPU MAX_FREQ 700000000
GPU_POWER_CONTROL_DISABLE GPU_PWR_CNTL_DIS auto
EMC MAX_FREQ 1600000000
DLA_CORE MAX_FREQ 896000000
DLA_FALCON MAX_FREQ 524800000
PVA_VPS MAX_FREQ 550400000
PVA_CORE MAX_FREQ 409600000
CVNAS MAX_FREQ 576000000

If I reduce the amount of computation required of the GPU to 50% at 700 MHz (numbodies=10000 instead of 20000), I see the power consumed by the NX drop to about 19W, the same as the AGX. The GPU “load” reported by tegrastats oscillates around 50%.

How should I interpret this result? This seems to mean that the AGX GPU and system are twice as efficient as the GPU system on the NX, within the same power footprint, correct?

I’m not attempting extreme precision here, just trying to understand the behavior of the NX vs AGX, roughly.

[image: NX power consumption with numbodies=10000]

Hi,

What do the GPU and CPU freq/utilization from tegrastats tell you when the power consumption is the same between the NX and Xavier?

What do the GPU and CPU freq/utilization from tegrastats tell you when the power consumption is the same between the NX and Xavier?

I’m assuming you meant “between Xavier NX and Xavier AGX”?

https://docs.nvidia.com/jetson/l4t/#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/AppendixTegraStats.html#

There are 6x power rails for Xavier AGX and 3x power rails for Xavier NX. Summing them together should equal the total power consumption of the respective SOM (Xavier NX or AGX).

I use this ^^ for GPU statistics like frequency and utilization.
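
If it's useful, here is a rough sketch of how the rails could be totaled programmatically from a tegrastats line; the parsing and rail lists are my own, assuming the `RAIL cur/avg` milliwatt format tegrastats prints (the sample values are from the AGX tegrastats output further down in this thread):

```python
import re

# Rough sketch: pull the "RAIL cur/avg" power readings (milliwatts) out of
# one tegrastats line and sum the instantaneous values. The rail names are
# board-specific: the AGX devkit reports 6 rails, the NX devkit reports 3.

def rail_powers_mw(line, rail_names):
    """Return {rail_name: current_mW} for each named rail found in line."""
    powers = {}
    for name in rail_names:
        m = re.search(r"\b%s (\d+)/(\d+)" % re.escape(name), line)
        if m:
            powers[name] = int(m.group(1))  # group(2) is the running average
    return powers

# The six AGX devkit rails as they appear at the end of a tegrastats line:
agx_rails = ["GPU", "CPU", "SOC", "CV", "VDDRQ", "SYS5V"]
agx_tail = "GPU 5185/4748 CPU 7622/7472 SOC 2135/2056 CV 0/0 VDDRQ 305/291 SYS5V 3456/3410"
print(sum(rail_powers_mw(agx_tail, agx_rails).values()))  # 18703
```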

I don't think power consumption can really be “interpreted” as a measure of hardware capability. The AGX Xavier should indeed have better performance than the NX, but there is no guarantee that something like “it can still double the frequency” means the Xavier AGX has twice the performance of the NX. No such guarantee.

but there is no guarantee that something like “it can still double the frequency” means the Xavier AGX has twice the performance of the NX. No such guarantee.

This is confusing to me. Who is saying what, exactly?

I did not say anything about doubling the frequency. The frequency settings are the same on each SoC; please read the nvpmodel entries I pasted above. The entries are the same: the frequency ranges of each GPU should be identical, and the frequency ranges of each CPU should be identical, between the Xavier NX and Xavier AGX.

I double-checked the results that I recorded. The results show the GPU and CPUs stayed within the boundaries I set up in each nvpmodel for both the Xavier NX and the Xavier AGX.

Originally, I said:

How should I interpret this result? This seems to mean that the AGX GPU and system are twice as efficient as the GPU system on the NX, within the same power footprint, correct?

The last graph I shared shows 19W being consumed by the Xavier NX when I run both stress-ng and nbody_opengles on it, using the same nvpmodel settings as on my Xavier AGX.

I assume (perhaps incorrectly) that the Xavier SoCs (NX and AGX) are the same except for the bandwidth of their paths to memory. The bandwidth of the Xavier AGX is substantially higher than that of the Xavier NX. tegrastats shows the bandwidth consumed in this test to be 7% of the Xavier NX's EMC bandwidth.

If I reduce the amount of work sent to the Xavier NX GPU, then the power consumption drops to be the same as the Xavier AGX's.

What do the GPU and CPU freq/utilization from tegrastats tell you when the power consumption is the same between the NX and Xavier?

BTW, the comment here was asking for the results from your tegrastats.

Actually, I'm sorry, but I don't understand what the exact question is here. The NX takes more power than the AGX Xavier? The Xavier NX also has a WiFi module on it, which may cause some extra overhead (but not that much). We can provide the internal power consumption results if you need them.

Apologies. I unplugged the Xavier NX WiFi module from underneath before I ran the power comparisons.

Actually, I'm sorry, but I don't understand what the exact question is here.

Exact original question from above:

Why could the NX be so much higher? Or why is the AGX power consumption so much lower?

Exact original question restated more clearly, I hope:

Why does the Xavier NX consume 24W when running an identical test (see above) to the Xavier AGX, which consumes only 19W?

I am assuming the Xavier NX SoC and the Xavier AGX SoC are nearly identical. Perhaps they are not identical in some way that NVIDIA understands and I do not.

Therefore, if the Xavier NX and Xavier AGX SOCs are identical and:

  • Are running identical Jetpack OS
  • Are using identical clock frequencies in nvpmodel
  • Are running identical tests
  • Are not exercising bandwidth

Then I expect the power consumption to be… identical (or very close). But the power consumption differs by a large amount: 5W.

Let me check the internal test result and report here.

BTW, the comment here was asking for the results from your tegrastats.

From the Xavier NX when running both stress-ng and nbody_opengles:

RAM 2635/7766MB (lfb 982x4MB) SWAP 0/3883MB (cached 0MB) CPU [96%@1905,99%@1905,95%@1902,99%@1905,89%@1905,95%@1905] EMC_FREQ 6%@1600 GR3D_FREQ 99%@701 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 0% AO@53.5C GPU@56C PMIC@100C AUX@54C CPU@55C thermal@54.75C VDD_IN 13597/13614 VDD_CPU_GPU_CV 10542/10536 VDD_SOC 1223/1223

From the Xavier AGX running as configured in the 1st message:
RAM 3213/15817MB (lfb 2488x4MB) SWAP 0/7908MB (cached 0MB) CPU [95%@1958,91%@1958,83%@1958,99%@1958,86%@1958,97%@1958,92%@1958,100%@1958] EMC_FREQ 4%@1600 GR3D_FREQ 99%@675 VIC_FREQ 0%@115 APE 150 MTS fg 0% bg 0% AO@49C GPU@50.5C Tdiode@49.75C PMIC@100C AUX@48.5C CPU@52C thermal@50C Tboard@44C GPU 5185/4748 CPU 7622/7472 SOC 2135/2056 CV 0/0 VDDRQ 305/291 SYS5V 3456/3410

Xavier NX power: 13597mW + 10542mW + 1223mW = 25362mW, or 25.362 watts.
Xavier AGX power: 5185mW + 7622mW + 2135mW + 0mW + 305mW + 3456mW = 18703mW, or 18.703 watts.

The difference between 25.362 watts and 18.703 watts is 6.659 watts.

If I have miscalculated, please show me where.
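
As a quick sanity check of those sums, the same arithmetic in a few lines (values copied from the tegrastats logs above):

```python
# Rail values in mW, copied from the tegrastats lines above.
nx_rails  = [13597, 10542, 1223]              # VDD_IN, VDD_CPU_GPU_CV, VDD_SOC
agx_rails = [5185, 7622, 2135, 0, 305, 3456]  # GPU, CPU, SOC, CV, VDDRQ, SYS5V

nx_total  = sum(nx_rails)    # 25362 mW
agx_total = sum(agx_rails)   # 18703 mW
print(nx_total - agx_total)  # 6659 mW difference
```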

andy

Therefore, if the Xavier NX and Xavier AGX SOCs are identical and:

The Xavier NX SoC and Xavier AGX SoC are not identical in another way, related to the GPU.

The Xavier NX SoC has 384 CUDA cores; the Xavier AGX SoC has 512 CUDA cores. I would not expect such a large power difference of 6.6W just because the Xavier NX has only 75% of the CUDA cores of the AGX. However, I am open to an explanation from NVIDIA that this is the reason.

Please notice the total GPU power consumption on the Xavier AGX is 5.185 watts. The difference of 6.659 watts is larger than the entire power consumption of the AGX GPU during this test.

Is your NX SD-card based or eMMC?

The total module power consumed by the Xavier NX is reported by the VDD_IN rail, whereas on the Xavier AGX it is the sum of all six rails (CPU+GPU+SOC+CV+VDDRQ+SYS5V).

Based on your tegrastats log:
Xavier NX total module power = 13597mW
Xavier AGX total module power = 5185mW + 7622mW + 2135mW + 0mW + 305mW + 3456mW = 18703mW

The Xavier NX consumes (18703mW - 13597mW) = 5106mW less power than the Xavier AGX.

Is your NX SD-card based or eMMC?

My module is from a standard Xavier NX devkit, using an SD card.

Is the power for an eMMC Xavier NX module calculated differently?

The total module power consumed by the Xavier NX is reported by the VDD_IN rail

Interesting. You are probably correct. I was misled by the NVIDIA documentation (below) for the 3 power rails on the Xavier NX, which clearly states that Channel 0 is the “System 5V power rail”, not VDD_IN. I am assuming that tegrastats simply reports the contents of the sysfs nodes exposed by Linux.

https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/power_management_jetson_xavier.html#wwpID0E0MS0HA

However, I see that rail_name_0 (in /sys/bus/i2c/drivers/ina3221x/7-0040/iio:device0/) is labeled VDD_IN, so it seems likely you are correct that VDD_IN is the total module power. The location of the INA3221 in sysfs is also different (7-0040 vs 1-0040) in JetPack 4.5.x.

root@andy-nx:/home/andy# cat /sys/bus/i2c/drivers/ina3221x/7-0040/iio:device0/rail_name_0
VDD_IN
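
For anyone wanting to read those rails without tegrastats, here is a sketch that walks the ina3221x sysfs nodes directly; it assumes the JetPack 4.x layout shown above, where rail_name_<n> holds the label and in_power<n>_input holds the reading in milliwatts:

```python
import glob
import os

def read_rails(device_dir):
    """Map rail name -> instantaneous power (mW) for one INA3221 device.

    Assumes the JetPack 4.x ina3221x sysfs layout: rail_name_<n> holds the
    rail label and in_power<n>_input holds the power reading in milliwatts.
    """
    rails = {}
    for name_path in sorted(glob.glob(os.path.join(device_dir, "rail_name_*"))):
        idx = name_path.rsplit("_", 1)[1]
        power_path = os.path.join(device_dir, "in_power%s_input" % idx)
        if not os.path.exists(power_path):
            continue
        with open(name_path) as f:
            name = f.read().strip()
        with open(power_path) as f:
            rails[name] = int(f.read().strip())
    return rails

# Example on a Xavier NX devkit running JetPack 4.5.x (path from above):
# read_rails("/sys/bus/i2c/drivers/ina3221x/7-0040/iio:device0")
```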

Can the documentation be updated?