Inference time increases in for loop


Hi, I am using this repo to convert a YOLOv5 network to a .engine model.

I have an issue regarding inference time. If I follow this script: wang-xinyu/tensorrtx/blob/master/yolov5/

I can see a slow but steady increase in inference time across iterations. Is running inference inside a for loop the best approach?
Is a different procedure recommended for running predictions on a large number of images?
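
To put numbers on the increase described above, one option is to time every call separately and compare the start of the run with the end. This is only a minimal sketch: `infer` is a hypothetical stand-in for whatever function runs one image through the engine (it is not a name from the tensorrtx script), and it is assumed to block until the result is ready; `time_loop` and `drift_ratio` are illustrative helper names.

```python
import statistics
import time


def time_loop(infer, inputs, warmup=10):
    """Return per-call latencies in milliseconds, skipping warm-up calls."""
    latencies = []
    for i, x in enumerate(inputs):
        start = time.perf_counter()
        infer(x)  # hypothetical callable; assumed to block until done
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= warmup:  # the first calls often include one-time setup cost
            latencies.append(elapsed_ms)
    return latencies


def drift_ratio(latencies):
    """Mean latency of the last quarter divided by that of the first quarter."""
    if not latencies:
        raise ValueError("no latencies recorded; lower warmup or add inputs")
    q = max(1, len(latencies) // 4)
    return statistics.mean(latencies[-q:]) / statistics.mean(latencies[:q])


if __name__ == "__main__":
    # Dummy workload standing in for a real inference call.
    def fake_infer(x):
        time.sleep(0.001)

    lat = time_loop(fake_infer, range(50), warmup=10)
    print(f"calls timed: {len(lat)}, drift ratio: {drift_ratio(lat):.2f}")
```

A ratio close to 1 would suggest stable latency, while a ratio that keeps growing with longer runs would point to something accumulating inside the loop (for example, results appended to a list or buffers allocated on every iteration) or to the device changing clocks under sustained load. Both are guesses to check, not confirmed causes.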

I am very much a beginner when it comes to optimising models for inference time.

Thanks in advance


TensorRT Version:
GPU Type: Jetson Orin
Nvidia Driver Version:
CUDA Version: 8.7
CUDNN Version:
Operating System + Version: Ubuntu 20.04.5 LTS (Focal Fossa)
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Steps To Reproduce

Explained in tensorrtx/yolov5 at master · wang-xinyu/tensorrtx · GitHub


Please try the official NVIDIA sample and let us know if the problem persists.

We also recommend using the latest TensorRT version, 8.5.3.

Thank you.

Thanks for the reply. I am using YOLOv5, not YOLOv3. Is there an official tutorial for converting this model (in .pt format) to TensorRT?
Thanks in advance