Xavier NX OC Throttling under heavy GPU load

I’m trying to work through the “System throttled due to Over-Current” warning mentioned here as well as in other posts. I’m in 15W 6 core mode running yolov4. I’m watching the instantaneous current draw via sudo watch -n .1 cat /sys/bus/i2c/drivers/ina3221x/7-0040/iio:device0/in_current0_input and it appears the current is being limited to 3.0 amps. in_voltage0_input is steady at 5.1V, which makes sense (3*5 = 15W). As a test I set warn_current_limit0 to 3400, and that allowed the current to go up to about 3200 mA (and it made the warning go away!).
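For anyone who wants to log power rather than just watch the current, here is a minimal Python sketch of the same measurement. The sysfs directory is the one from the command above and may differ on other boards or JetPack releases. The units (mA for in_current0_input, mV for in_voltage0_input) are an assumption based on the readings described in this post, and the function names are made up for illustration. Reading these files may require root.

```python
import time
from pathlib import Path

# Directory taken from the command in the post; adjust for your board/release.
INA_DIR = Path("/sys/bus/i2c/drivers/ina3221x/7-0040/iio:device0")


def read_int(path):
    """Read one integer from a sysfs-style text file."""
    return int(Path(path).read_text().strip())


def rail_power_watts(current_ma, voltage_mv):
    """Convert a current in mA and a voltage in mV (assumed units) to watts."""
    return (current_ma / 1000.0) * (voltage_mv / 1000.0)


def sample_rail(ina_dir=INA_DIR, channel=0):
    """Return (current_ma, voltage_mv, watts) for one INA3221 channel."""
    ina_dir = Path(ina_dir)
    current_ma = read_int(ina_dir / f"in_current{channel}_input")
    voltage_mv = read_int(ina_dir / f"in_voltage{channel}_input")
    return current_ma, voltage_mv, rail_power_watts(current_ma, voltage_mv)


if __name__ == "__main__":
    if not INA_DIR.is_dir():
        print(f"{INA_DIR} not found; adjust INA_DIR for your board")
    else:
        # Take a bounded number of samples at roughly the 0.1 s interval used above.
        for _ in range(50):
            current_ma, voltage_mv, watts = sample_rail()
            print(f"{current_ma} mA  {voltage_mv} mV  {watts:.2f} W")
            time.sleep(0.1)
```

With the readings quoted above (3000 mA at 5100 mV) this works out to roughly 15.3 W, which matches the 15W mode budget.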

On this page there’s a curious note that says:

The VDD_IN rail overcurrent thresholds for average current are 3 A for 10W and 15W power modes, and 4 A for 20W power modes. For instantaneous current they are 5 A for all power modes.

Since 4 amps is OK for average current in 20W mode, why wouldn’t it be OK in 10W or 15W mode? Can I manually set the limit to 4000 mA in 15W mode?
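To make the quoted thresholds concrete, here is a small Python sketch that encodes them and shows the power each average limit implies. The limits are the ones from the note quoted above. The 5.1 V rail voltage is the value I measured, and the mode labels and function names are just illustrative. It does not say anything about whether raising a limit is safe.

```python
# Average-current thresholds from the quoted note, in mA, keyed by power mode.
AVERAGE_LIMIT_MA = {"10W": 3000, "15W": 3000, "20W": 4000}
# Instantaneous threshold from the same note, identical for all power modes.
INSTANTANEOUS_LIMIT_MA = 5000


def implied_power_watts(limit_ma, rail_volts=5.1):
    """Power drawn at the given current limit, assuming a steady rail voltage."""
    return (limit_ma / 1000.0) * rail_volts


def exceeded_limits(mode, average_ma, peak_ma):
    """Return the names of the thresholds a reading exceeds in the given mode."""
    exceeded = []
    if average_ma > AVERAGE_LIMIT_MA[mode]:
        exceeded.append("average")
    if peak_ma > INSTANTANEOUS_LIMIT_MA:
        exceeded.append("instantaneous")
    return exceeded


if __name__ == "__main__":
    for mode, limit_ma in AVERAGE_LIMIT_MA.items():
        watts = implied_power_watts(limit_ma)
        print(f"{mode}: average limit {limit_ma} mA, about {watts:.1f} W at 5.1 V")
    # The 3200 mA reading from the post, checked against two modes.
    print("15W:", exceeded_limits("15W", 3200, 3200))
    print("20W:", exceeded_limits("20W", 3200, 3200))
```

The 3200 mA I saw after raising warn_current_limit0 is over the average threshold for 15W mode but under the one for 20W mode, which is what prompted the question.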

hello StrikeEagleIII,

the OC alarm/throttling warning message is not a bug.
it is there to meet the hardware spec of Xavier NX and to protect the Xavier NX hardware.

you may see the OC event count increase when a heavily loaded application is running. in addition, you should also check whether the cpu/gpu frequency drops after an OC event happens.

please use the power estimator to customize nvpmodel for each use case: https://jetson-tools.nvidia.com/powerestimator
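The frequency check suggested above could be sketched in Python roughly as follows. It only covers the CPU side, through the standard Linux cpufreq files (scaling_cur_freq and scaling_max_freq, both in kHz). The GPU frequency lives in a board-specific devfreq path that is not given in this thread, so it is left out. The function names and the 5% tolerance are illustrative assumptions. A core running below its maximum is not proof of OC throttling, since the governor also lowers clocks when a core is idle, so the result is only meaningful while the load is running.

```python
from pathlib import Path

# Standard Linux cpufreq location; each cpuN/cpufreq directory holds kHz values.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_khz(path):
    """Read one integer frequency (kHz) from a cpufreq file."""
    return int(Path(path).read_text().strip())


def cpu_freqs(root=CPU_ROOT):
    """Return {cpu_name: (current_khz, max_khz)} for CPUs that expose cpufreq."""
    result = {}
    for cpufreq in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        try:
            current = read_khz(cpufreq / "scaling_cur_freq")
            maximum = read_khz(cpufreq / "scaling_max_freq")
        except (OSError, ValueError):
            continue  # offline core or unreadable file
        result[cpufreq.parent.name] = (current, maximum)
    return result


def below_max(freqs, tolerance=0.05):
    """Names of CPUs running more than `tolerance` below their policy maximum."""
    return sorted(
        name
        for name, (current, maximum) in freqs.items()
        if current < maximum * (1 - tolerance)
    )


if __name__ == "__main__":
    freqs = cpu_freqs()
    for name, (current, maximum) in sorted(freqs.items()):
        print(f"{name}: {current} kHz of {maximum} kHz")
    print("below max:", below_max(freqs))
```

Running this once before and once right after an OC warning appears gives a rough idea of whether the CPU clocks dropped.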

I get that. This is something new on the NX and I’m trying to wrap my head around it. On the TX2 I could set max performance mode and load up both the CPU and GPU, no big deal. (I guess it’s not really new: on the TX2 you could also create your own custom power modes; it just happened that the presets were sufficient and appear never to have let you get into an OC condition.)
I’m trying to understand why the current limit is set at 3000 mA in 15W mode and 4000 mA in 20W mode. My guess is that this is how power is limited to 15W or 20W respectively: the 3000 mA limit is only there to cap power at 15W, not to protect the NX. The 20W (4000 mA) limit is the real limit, and we’re free to “spend” that power wherever we want: CPU, GPU, DLA or wherever.
I think what would clear up the confusion is if the power modes, instead of having names like “15W 6 core” that don’t really tell you what the mode is optimized for (CPU vs GPU, etc.), had more descriptive names and per-mode frequency limits set such that you couldn’t get an OC alarm. Or maybe it would be sufficient if you could see what the different modes trade off right in the nvpmodel GUI without having to dig deep into the power management docs.

hello StrikeEagleIII,

previous release versions (such as JetPack-4.4.1) actually reproduce the same OC throttling.
however, the previous release does not contain the OC ALARM notification support, which is why you did not see the warnings.

please check the developer guide for your product design.
note that increasing the OC limit above the default value isn’t a recommended solution for your final products.
