Questions about system throttling due to overcurrent

Hi all,

We are working on a GPU-heavy inference application on an Xavier NX with JetPack 4.6. To get maximum performance we use the predefined 20W 6CORE mode. While the application runs, the “System throttled due to overcurrent” message appears several times.

We’ve seen in other posts such as this and this that one workaround is to raise the maximum current limit to 5A; however, it has been reported that this does not work under high GPU load. That is our case: raising the maximum limit is not enough to solve the problem. So we would like to understand it better, with the following questions:

  1. Is this warning related to temperature or power consumption? If not, could you explain what triggers it?

  2. Is it harmful for the device to just run the application and ignore the warnings?

  3. What are the consequences of the system being throttled? Are the clock frequencies lowered? Will the performance drop?

Thanks!

I think this was already explained in The error message "System throttling due to over-current" appears when running YOLOv4 - #27 by WayneWWW.

  1. It is related to power consumption. The message really does indicate that an over-current event happened.

  2. If you don’t change the maximum limit to another higher value, then it won’t harm the device.

  3. Performance will drop.

This mechanism protects the device from over-current. Seeing the message means your NX has been pushed to its performance limit.

(https://jetson-tools.nvidia.com/powerestimator/)

Thanks for your answers @WayneWWW, that is really helpful. A couple more questions:

  1. You mentioned that it is related to power consumption, so is there a known power consumption limit above which the warning is triggered? Ideally something we could monitor with the tegrastats tool.

Also, to elaborate on how the performance is dropped:

  1. Does the system switch to a lower power mode? I’ve noticed that when the warning shows and we query the current mode with nvpmodel -q, it remains unchanged.
  2. Are the clock frequencies lowered for the CPU or the GPU? And if so, do you know to what values?

Sorry for the late response. Is this still an issue that needs support? Thanks

Hi,

The official solution is to use the power estimator to calculate the power budget of your use case. Then you can create your own nvpmodel for that use case.

When throttling happens, the performance (CPU/GPU/EMC frequency) is dropped for a moment. Thus, it will not be easy to observe with tegrastats unless the throttling keeps happening.
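
For anyone who still wants to try catching these brief drops, here is a minimal Python sketch that scans tegrastats output for samples where a CPU core runs below an expected frequency. It is only an illustration: the line layout it parses (`CPU [load%@MHz,...]` and `VDD_IN current/average`) is an assumption that may differ between JetPack releases, and the helper names `parse_line` and `find_freq_drops` are made up for this example. Note that a lower CPU frequency can also come from normal frequency scaling when the load is light, so it only hints at throttling under sustained load.

```python
import re

# Assumed tegrastats fields (layout may vary by JetPack release):
#   CPU [90%@1420,off,...]   per-core load and frequency in MHz
#   VDD_IN 14500/13000       instantaneous/average input power in mW
CPU_RE = re.compile(r"CPU \[([^\]]*)\]")
VDD_IN_RE = re.compile(r"VDD_IN (\d+)/(\d+)")


def parse_line(line):
    """Return (frequencies of online CPU cores in MHz, instantaneous VDD_IN or None)."""
    freqs = []
    cpu = CPU_RE.search(line)
    if cpu:
        for core in cpu.group(1).split(","):
            # Cores reported as "off" do not match and are skipped.
            match = re.match(r"\d+%@(\d+)", core.strip())
            if match:
                freqs.append(int(match.group(1)))
    vdd = VDD_IN_RE.search(line)
    power = int(vdd.group(1)) if vdd else None
    return freqs, power


def find_freq_drops(lines, expected_mhz):
    """Return indices of samples where any online core is below expected_mhz."""
    drops = []
    for index, line in enumerate(lines):
        freqs, _ = parse_line(line)
        if freqs and min(freqs) < expected_mhz:
            drops.append(index)
    return drops


# Hypothetical sample lines, not captured from a real device.
sample = [
    "RAM 2000/7772MB CPU [90%@1420,85%@1420,off,off,88%@1420,91%@1420] "
    "EMC_FREQ 30%@1600 GR3D_FREQ 99%@1109 VDD_IN 14500/13000",
    "RAM 2000/7772MB CPU [90%@1190,85%@1190,off,off,88%@1190,91%@1190] "
    "EMC_FREQ 30%@1600 GR3D_FREQ 99%@800 VDD_IN 12000/12900",
]

if __name__ == "__main__":
    for index in find_freq_drops(sample, expected_mhz=1400):
        freqs, power = parse_line(sample[index])
        print(f"sample {index}: min CPU freq {min(freqs)} MHz, VDD_IN {power}")
```

Since the drop only lasts a moment, a short sampling interval (for example `tegrastats --interval 100`, in milliseconds) gives a better chance of seeing it, though short events can still fall between samples.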
