Inconsistent TensorRT Inference Time on Jetson Xavier NX

I am seeking assistance with an issue I am experiencing with TensorRT inference times on my NVIDIA Jetson Xavier NX board. When running a TensorRT engine, the inference time occasionally increases dramatically: a run sometimes completes in about 25 ms, but subsequent runs can take 45 ms or even 65 ms. (A minimal sketch of how I time each run follows the list below.)

Observations and Details:

  • Device: NVIDIA Jetson Xavier NX
  • Issue Description:
    • Initial inference time: ~25 ms
    • Subsequent inference times: ~45 ms to ~65 ms
  • Model: YOLOv5 object detection model (yolov5.engine)
  • Power Supply: 5 V, 4 A
  • JetPack Version: 5.1.2
  • Temperature: maximum observed temperature is 40-42°C
  • Cooling Measures: the board is equipped with a heat sink and a fan
  • Clocks: locked to maximum with sudo jetson_clocks; all 6 CPU cores are active
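Roughly how I time each run (simplified; infer_once() is a stand-in for my real TensorRT call, not the actual pipeline):

```python
import statistics
import time

def infer_once():
    # Stand-in for one synchronous TensorRT forward pass; in the real
    # script this wraps context.execute_v2(bindings) plus the stream sync.
    time.sleep(0.025)

def time_inference(runs=50, warmup=10):
    # Warm-up runs so allocation and clock ramp-up do not skew the numbers.
    for _ in range(warmup):
        infer_once()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        infer_once()  # must block until the GPU work has finished
        samples.append((time.perf_counter() - t0) * 1000.0)  # milliseconds
    print(f"min {min(samples):.1f} ms | "
          f"median {statistics.median(samples):.1f} ms | "
          f"max {max(samples):.1f} ms")

time_inference()
```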

Does the time increase when you wait longer between inferences?

I did not get you. Could you please elaborate on your statement a little bit?

Is the latency shorter when you run inferences one after another quickly, and longer if you wait some time before making the next inference?

It is longer when I wait some time before doing another inference.
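A rough way to quantify that pattern, reusing the same hypothetical infer_once() stand-in from the first sketch (the gap values here are arbitrary):

```python
import time

def infer_once():
    # Stand-in for one synchronous TensorRT forward pass; replace with
    # the real context.execute_v2(bindings) call plus stream sync.
    time.sleep(0.025)

def median_latency_ms(gap_s, runs=20):
    # Idle for gap_s before each timed run, so anything that happens
    # during the pause (e.g. clocks ramping down) shows up in the numbers.
    samples = []
    for _ in range(runs):
        time.sleep(gap_s)
        t0 = time.perf_counter()
        infer_once()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]

print("back-to-back:", median_latency_ms(0.0), "ms")
print("2 s idle gap:", median_latency_ms(2.0), "ms")
```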


It occurs with TensorRT; I am not sure about the reason.