Hello,
I’ve read these threads:
- System throttled due to over-current?
- The error message "System throttling due to over-current" appears when running YOLOv4
- AGX Orin System throttled due to Over-current pytorch
So I know that NVIDIA is aware of the overcurrent issue.
I have a Jetson AGX Orin Devkit and it runs fine, but in the “max power” mode, running a GPU-intensive model triggers the warning. Running the same model in the 50W mode does not trigger it, but inference runs at almost half the speed.
I tried different models: the bigger ones trigger the same warning, while the smaller ones run without issue.
I have two questions:
- When this warning is triggered, are the GPU frequency, power, etc. automatically limited (as “throttled” seems to imply)? If not, is there a risk of damaging the hardware if the warning is triggered at every inference?
- In the latest thread I found, AastaLLL from NVIDIA mentioned that a custom power mode could be created by following this guide:
  NVIDIA Jetson Orin - Tuning Power - RidgeRun Developer Connection
  But what exactly should be limited to maximize GPU speed without triggering the overcurrent warning?
  Is the overcurrent global, so that I would need to turn off or slow down some CPU cores, or is it localized to the GPU, so that I need to limit its frequency (or something else)? And in that case, to which value?
  Given the number of parameters and possible combinations, benchmarking everything would take a very long time, and if every warning puts a strain on the hardware, running many experiments would not be a good idea.
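To illustrate the concern about the number of experiments: if the warning turned out to depend on a single setting such as the GPU frequency cap, a binary search would need far fewer runs than a full sweep. The Python sketch below rests on assumptions that are not confirmed anywhere in this post: that the warning is monotonic in the cap (if one cap triggers it, every higher cap does too), and that a helper exists to apply a cap, run the model, and report whether the warning appeared. `triggers_warning` is a hypothetical callback, and the frequency values are made up for illustration.

```python
from typing import Callable, Optional, Sequence


def highest_safe_cap(
    caps_hz: Sequence[int],
    triggers_warning: Callable[[int], bool],
) -> Optional[int]:
    """Return the largest cap in caps_hz that does not trigger the warning.

    caps_hz must be sorted in ascending order.
    Assumption: if a cap triggers the warning, every higher cap does too.
    Returns None if even the lowest cap triggers the warning.
    """
    lo, hi = 0, len(caps_hz) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if triggers_warning(caps_hz[mid]):
            # Warning seen: only lower caps can be safe.
            hi = mid - 1
        else:
            # No warning: remember this cap and try higher ones.
            best = caps_hz[mid]
            lo = mid + 1
    return best


if __name__ == "__main__":
    # Made-up frequency steps in Hz, for illustration only.
    example_caps = [400_000_000, 600_000_000, 800_000_000,
                    1_000_000_000, 1_300_000_000]

    # Hypothetical stand-in for a real test run: pretend the warning
    # appears whenever the cap is above 800 MHz.
    def fake_probe(cap_hz: int) -> bool:
        return cap_hz > 800_000_000

    print(highest_safe_cap(example_caps, fake_probe))  # prints 800000000
```

With N candidate caps this needs roughly log2(N) runs instead of N, so fewer of them trigger the warning. It does not help if the overcurrent depends on several settings at once, which is exactly what I am asking about.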
Thank you very much,
FV