Tuning Orin NX Custom Power Profile

Hello, I am trying to make a custom power profile for the Orin NX 8GB to improve CPU/GPU performance.

I have a script running that stresses CPU/GPU, and am monitoring various voltages/current/power/frequencies/temps for a set period of time.

I can see that even in nvpmodel mode 3 (20W) the CPU frequencies throttle.

When I switch to MAXN I can see it throttle even more frequently.

I believe this throttling is occurring because the internal SOM pmc is hitting the hard 20W limit, but I don’t see any transients in temperature or voltage/current that would explain why it’s throttling.

In dmesg, the only relevant message I see is:

[81928.408080] cpufreq: cpu4,cur:307000,set:1984000,set ndiv:155
[81928.530986] cpufreq: cpu0,cur:817000,set:1984000,set ndiv:155
[81928.655073] cpufreq: cpu0,cur:998000,set:1984000,set ndiv:155
[81928.655687] cpufreq: cpu0,cur:988000,set:1984000,set ndiv:155

This indicates the CPU is being scaled, but the governor is in performance mode (jetson_clocks.sh was run), so I suspect it’s because of throttling of some sort.
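
To quantify how far the current frequency falls below the requested one, lines like the ones above can be parsed. This is a minimal Python sketch, not an official tool: the regular expression is inferred only from the four dmesg lines quoted in this post, and the assumption that `cur` and `set` are in kHz (the usual cpufreq unit) is mine.

```python
import re

# Layout inferred from the four dmesg lines quoted above (assumption).
CPUFREQ_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+cpufreq:\s+cpu(?P<cpu>\d+),"
    r"cur:(?P<cur>\d+),set:(?P<target>\d+)"
)

def parse_cpufreq_lines(text):
    """Return one dict per matching line; frequencies assumed to be in kHz."""
    rows = []
    for line in text.splitlines():
        m = CPUFREQ_RE.search(line)
        if m is None:
            continue  # skip unrelated kernel messages
        cur, target = int(m["cur"]), int(m["target"])
        rows.append({
            "ts": float(m["ts"]),
            "cpu": int(m["cpu"]),
            "cur_khz": cur,
            "set_khz": target,
            "ratio": cur / target,  # fraction of the requested frequency
        })
    return rows

SAMPLE = """\
[81928.408080] cpufreq: cpu4,cur:307000,set:1984000,set ndiv:155
[81928.530986] cpufreq: cpu0,cur:817000,set:1984000,set ndiv:155
[81928.655073] cpufreq: cpu0,cur:998000,set:1984000,set ndiv:155
[81928.655687] cpufreq: cpu0,cur:988000,set:1984000,set ndiv:155
"""

if __name__ == "__main__":
    for row in parse_cpufreq_lines(SAMPLE):
        print(f"{row['ts']:.3f}s cpu{row['cpu']}: {row['ratio']:.0%} of requested")
```

Feeding the output of `dmesg` through `parse_cpufreq_lines` gives timestamps that can be lined up against the tegrastats power samples.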

Is there anything I can check to see why the CPU frequencies are throttling?

I have included a zip of two generated plots in HTML form, one in 20W mode and one in MAXN mode. The temperatures, power, and frequencies are from the output of tegrastats, the currents/voltages are from the SOM’s onboard INA3221 sensor, and the input voltage/current is from the carrier board-level input voltage rail.

orin_power_plots.zip (2.7 MB)

Hi vikram.bhat,

Are you using the devkit or a custom board for Orin NX?
Which JetPack version are you using?

Please share the result of the following command on your board when you hit the throttling issue.

# grep "" /sys/class/hwmon/hwmon*/oc*

How did you perform this?
Have you referred to PowerEstimator (nvidia.com) to generate a custom power mode configuration?

Hi KevinFFF,

Thanks for your reply.
I am using a custom board for the Orin NX with JetPack 5.1.2.

I ran your command a few times under stress and can clearly see the OC event counts tick up:

$ grep "" /sys/class/hwmon/hwmon*/oc*
/sys/class/hwmon/hwmon1/oc1_event_cnt:0
/sys/class/hwmon/hwmon1/oc1_throt_en:1
/sys/class/hwmon/hwmon1/oc2_event_cnt:861
/sys/class/hwmon/hwmon1/oc2_throt_en:1
/sys/class/hwmon/hwmon1/oc3_event_cnt:4255
/sys/class/hwmon/hwmon1/oc3_throt_en:1
$ grep "" /sys/class/hwmon/hwmon*/oc*
/sys/class/hwmon/hwmon1/oc1_event_cnt:0
/sys/class/hwmon/hwmon1/oc1_throt_en:1
/sys/class/hwmon/hwmon1/oc2_event_cnt:910
/sys/class/hwmon/hwmon1/oc2_throt_en:1
/sys/class/hwmon/hwmon1/oc3_event_cnt:4439
/sys/class/hwmon/hwmon1/oc3_throt_en:1
$ grep "" /sys/class/hwmon/hwmon*/oc*
/sys/class/hwmon/hwmon1/oc1_event_cnt:0
/sys/class/hwmon/hwmon1/oc1_throt_en:1
/sys/class/hwmon/hwmon1/oc2_event_cnt:976
/sys/class/hwmon/hwmon1/oc2_throt_en:1
/sys/class/hwmon/hwmon1/oc3_event_cnt:4672
/sys/class/hwmon/hwmon1/oc3_throt_en:1
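
The growth between snapshots is easier to read as deltas. Here is a small Python sketch that parses the three outputs above and subtracts consecutive counts; the function names are made up for illustration, and only the `ocN_event_cnt` lines are used.

```python
import re

# Matches e.g. "/sys/class/hwmon/hwmon1/oc2_event_cnt:861"
COUNTER_RE = re.compile(r"/(oc\d+)_event_cnt:(\d+)$")

def parse_snapshot(text):
    """Map 'oc1', 'oc2', ... to the event count found in one grep output."""
    counts = {}
    for line in text.splitlines():
        m = COUNTER_RE.search(line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts

def deltas(snapshots):
    """Per-counter increase between each consecutive pair of snapshots."""
    return [
        {name: cur[name] - prev[name] for name in cur if name in prev}
        for prev, cur in zip(snapshots, snapshots[1:])
    ]

# The three outputs pasted above, reduced to the event-count lines.
RAW = [
    "/sys/class/hwmon/hwmon1/oc1_event_cnt:0\n"
    "/sys/class/hwmon/hwmon1/oc2_event_cnt:861\n"
    "/sys/class/hwmon/hwmon1/oc3_event_cnt:4255\n",
    "/sys/class/hwmon/hwmon1/oc1_event_cnt:0\n"
    "/sys/class/hwmon/hwmon1/oc2_event_cnt:910\n"
    "/sys/class/hwmon/hwmon1/oc3_event_cnt:4439\n",
    "/sys/class/hwmon/hwmon1/oc1_event_cnt:0\n"
    "/sys/class/hwmon/hwmon1/oc2_event_cnt:976\n"
    "/sys/class/hwmon/hwmon1/oc3_event_cnt:4672\n",
]

if __name__ == "__main__":
    for step in deltas([parse_snapshot(r) for r in RAW]):
        print(step)
```

For the outputs above this gives an increase of 49 and then 66 for oc2, and 184 and then 233 for oc3, while oc1 stays at 0.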

I plan to modify nvpmodel.conf but haven’t gotten to that step yet. For now I am just trying to understand which levers to turn, switching between the existing mode 0 (MAXN) and mode 3 (20W).

I have consulted the power estimator, but also want experimentally collected evidence that the power profile I land on is stable.

Are there any more levers I can pull to tune besides just the cpu/gpu frequencies? Like tuning the voltage on certain rails?

It seems you are hitting OC2 and OC3 events.
As the following table shows, these events mean your power usage is higher than the threshold.

                              Orin NX 8GB
OC1 (under voltage)                4.5V
OC2 (VDD_IN average power)         20W
OC3 (VDD_IN instant power)         25W

Please check the power log from Jetson Power GUI or the result from tegrastats.
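
As a rough way to compare a tegrastats log against the table above, the VDD_IN field can be parsed and checked against the two limits. This Python sketch makes several assumptions: that tegrastats prints the rail as `VDD_IN <instant>mW/<average>mW`, that its average is a reasonable stand-in for the window the OC2 hardware limit uses (it may not be), and the sample line is invented for illustration.

```python
import re

# Thresholds taken from the Orin NX 8GB table above.
OC2_AVG_LIMIT_MW = 20_000   # VDD_IN average power
OC3_INST_LIMIT_MW = 25_000  # VDD_IN instant power

# Assumed tegrastats field format: "VDD_IN 6093mW/6093mW" (instant/average).
VDD_IN_RE = re.compile(r"VDD_IN (\d+)mW/(\d+)mW")

def check_vdd_in(line):
    """Return power readings and threshold flags, or None if VDD_IN is absent."""
    m = VDD_IN_RE.search(line)
    if m is None:
        return None
    inst_mw, avg_mw = int(m.group(1)), int(m.group(2))
    return {
        "inst_mw": inst_mw,
        "avg_mw": avg_mw,
        "over_oc2": avg_mw > OC2_AVG_LIMIT_MW,
        "over_oc3": inst_mw > OC3_INST_LIMIT_MW,
    }

if __name__ == "__main__":
    # Hypothetical sample line, not taken from a real log.
    print(check_vdd_in("... VDD_IN 25310mW/19870mW VDD_SOC 1863mW/1863mW"))
```

Because tegrastats samples far more slowly than the hardware monitor, a log that never crosses these limits does not rule out short spikes that still trip OC3.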
You can limit the frequency if the load is not too high.
For example, if your CPU usage is always lower than 50%, you can set the max CPU frequency to half.
Please configure your own power mode configuration based on the power usage in your use case.
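
The "set the max frequency to half" rule of thumb can be sketched as picking a step from the frequency table. In this Python sketch the frequency list is hypothetical (on a real board it would come from the standard cpufreq file `scaling_available_frequencies`, assuming the driver exposes it), and rounding up to the next step rather than down is my own choice so the cap does not undercut the observed load.

```python
def capped_max_freq(available_khz, utilization):
    """Lowest available step at or above utilization * highest frequency.

    available_khz: iterable of supported frequencies in kHz.
    utilization:   peak CPU utilization as a fraction in (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    freqs = sorted(available_khz)
    target = freqs[-1] * utilization
    # The highest step always satisfies f >= target, so this never fails.
    return next(f for f in freqs if f >= target)

if __name__ == "__main__":
    # Hypothetical frequency table; read the real one from sysfs.
    table = [115200, 499200, 998400, 1497600, 1984000]
    print(capped_max_freq(table, 0.5))
```

The result would then go into the CPU max-frequency entry of a custom nvpmodel mode, or into `scaling_max_freq` for a quick non-persistent experiment.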

Thanks KevinFFF.

I have the power log from tegrastats in the plots in my original post.

Currently we are in mode 3 (20W) and are seeing a few of those OC2 and OC3 events. (Not nearly as many as in MAXN but still some).

We were hoping to be able to boost CPU frequency, GPU frequency, or both, but based on this information it seems we are already on the edge of stability (“stable” meaning as few of those OC2 and OC3 events as possible, I think). Is there anything we can tune (voltages or current limits) that will allow us to stay in 20W power mode, or even increase CPU or GPU frequencies, while remaining stable?

Are there any peripheral/hardware blocks we could turn off in order to save some power budget?

This is expected, since the 20W mode configuration limits some frequencies.

Please share the power log from Jetson Power GUI or the output of sudo tegrastats.

We suggest creating a custom power configuration that triggers no OC events; otherwise throttling will reduce performance.

I will work on getting you the direct log output of sudo tegrastats.

Would you say that in the custom power profile I would need to clock down the CPU or GPU to lower total power consumption and so reduce the probability of throttling?

Exactly.

MAXN mode is an unconstrained power mode rather than the best performance mode. You have to create a custom power mode according to your use case/application to optimize power consumption and performance.
