Xavier NX restarts while running AI models

We are using our custom-developed controller board with a Jetson Xavier NX chip. This setup runs fine when no deep learning models are running.
We run deep learning models using Triton server. Whenever we query these ensemble models, we see a spike in current draw, which sometimes causes the entire system to restart. The problem is most likely caused by the power the GPU draws during inference. It does not occur with the Xavier NX development kit. However, I would like to solve this in software, without making any changes to the controller board hardware we have developed.
Can we limit the power drawn by the device? Or is there any other way to solve this?

TensorRT Version :
GPU Type : NVIDIA Volta architecture
Nvidia Driver Version :
CUDA Version : 11.4
CUDNN Version : 8.6
Operating System + Version : JetPack 5.1
Python Version (if applicable) : 3.8
TensorFlow Version (if applicable) : NA
PyTorch Version (if applicable) : NA
Baremetal or Container (if container which image + tag) : Docker


Please try setting a lower power mode with nvpmodel to see if this helps.


$ sudo nvpmodel -m [ID]
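To pick a value for [ID], it may help to check the current mode and list the modes defined on your board first. The following is a minimal sketch, not an official procedure. It assumes the mode definitions live in /etc/nvpmodel.conf as lines of the form `< POWER_MODEL ID=... NAME=... >`; `list_power_modes` is a hypothetical helper written for this example, not part of nvpmodel. The available IDs and their wattages depend on the JetPack release, so check the output on your own system rather than relying on IDs from another setup.

```shell
#!/bin/sh
# Hypothetical helper: print "ID NAME" for each power mode defined in an
# nvpmodel configuration file (assumed default: /etc/nvpmodel.conf).
list_power_modes() {
    conf="${1:-/etc/nvpmodel.conf}"
    # Assumed line format: < POWER_MODEL ID=3 NAME=MODE_10W_2CORE >
    sed -n 's/^< *POWER_MODEL ID=\([0-9]*\) NAME=\([^ ]*\) *>.*/\1 \2/p' "$conf"
}

# Only run the queries when nvpmodel is actually installed (i.e. on the Jetson).
if command -v nvpmodel >/dev/null 2>&1; then
    sudo nvpmodel -q      # show the currently active power mode
    list_power_modes      # show the IDs that can be passed to -m
fi

# Then switch to a lower-wattage mode from the list, for example:
#   sudo nvpmodel -m <ID>
```

A lower-wattage mode limits CPU and GPU clocks, which should reduce the peak current during inference at the cost of some throughput. Whether that is enough to stop the restarts on a custom board has to be verified by testing under the same Triton load.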

