We are using a custom-developed controller board with the Jetson Xavier NX module. The setup runs fine when no deep learning models are in use.
We run our deep learning models as ensembles on Triton Inference Server. Whenever we query these ensemble models, we see a spike in current draw that sometimes causes the entire system to restart. The spike appears to come from the power drawn by the GPU while inference is running. The problem does not occur on the Xavier NX development kit, but I would like to solve it in software rather than modify the controller board we have developed.
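For reference, the ensembles are queried from a standard Triton gRPC client, roughly like the sketch below. The model name, input/output tensor names, and shape are placeholders here, not our actual configuration:

```python
# Rough sketch of how we query the ensemble; names and shapes are placeholders.
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# Dummy FP32 input for a hypothetical ensemble taking a single image tensor.
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = grpcclient.InferInput("INPUT__0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

# The current spike shows up as soon as a request like this reaches the GPU.
result = client.infer(
    model_name="my_ensemble",  # placeholder ensemble name
    inputs=[infer_input],
    outputs=[grpcclient.InferRequestedOutput("OUTPUT__0")],
)
print(result.as_numpy("OUTPUT__0").shape)
```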
Can we limit the power drawn by the device, or is there another way to solve this?
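What I had in mind on the software side is something like the sketch below: switch to a lower nvpmodel power mode and additionally cap the GPU devfreq clock before inference starts. This is only an idea, not something we have validated; the mode ID and the sysfs path are assumptions for Xavier NX and would need to be checked against /etc/nvpmodel.conf and /sys/devices/gpu.0/devfreq/ on the actual unit, and it has to run as root. Is this the right mechanism, or is there a better supported way?

```python
# Hedged sketch of the mitigation I am considering. Mode ID and devfreq path
# are assumptions for Xavier NX; must be run as root.
import subprocess
from pathlib import Path


def set_power_mode(mode_id: int) -> None:
    """Switch to a lower-power nvpmodel mode (mode IDs are defined in /etc/nvpmodel.conf)."""
    subprocess.run(["nvpmodel", "-m", str(mode_id)], check=True)


def cap_gpu_max_freq(max_freq_hz: int,
                     devfreq_dir: str = "/sys/devices/gpu.0/devfreq/17000000.gv11b") -> None:
    """Clamp the GPU's maximum devfreq frequency; the directory name is an assumption."""
    node = Path(devfreq_dir)
    available = (node / "available_frequencies").read_text().split()
    # Pick the highest available frequency that does not exceed the requested cap.
    chosen = max(int(f) for f in available if int(f) <= max_freq_hz)
    (node / "max_freq").write_text(str(chosen))


if __name__ == "__main__":
    set_power_mode(1)               # e.g. a 10 W mode instead of 20 W (assumed mode ID)
    cap_gpu_max_freq(510_000_000)   # cap the GPU at ~510 MHz (assumed value)
```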
TensorRT Version: 8.5.2.2
GPU Type: NVIDIA Volta architecture (Jetson Xavier NX integrated GPU)
NVIDIA Driver Version:
CUDA Version: 11.4
cuDNN Version: 8.6
Operating System + Version: JetPack 5.1
Python Version (if applicable): 3.8
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable): NA
Baremetal or Container (if container, which image + tag): Docker