TensorRT model inference is slower than the original model


I converted my YOLOv4 model to a TensorRT (trt) model to run on a Jetson Xavier NX. When I ran inference with the trt model, it took around 2-3 seconds to detect objects in a single image, which is slower than the original model, which takes only about a second.
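For anyone comparing numbers like these, a minimal timing sketch is below. It assumes the inference step can be wrapped in a zero-argument callable (called `infer` here, a hypothetical name that is not from the tensorrt_demos repo) and that the callable blocks until the GPU results are back on the host. The first few calls are treated as warm-up and excluded, since one-time setup costs can inflate a single-image measurement.

```python
import statistics
import time


def benchmark(infer, warmup=5, runs=20):
    """Time a zero-argument callable, excluding warm-up calls.

    `infer` is a placeholder for whatever runs one inference; it must
    block until the result is available, otherwise only the launch
    overhead is measured.
    """
    # Warm-up calls absorb one-time costs (lazy initialisation, caches).
    for _ in range(warmup):
        infer()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        timings.append(time.perf_counter() - start)

    return {
        "runs": len(timings),
        "mean_s": statistics.mean(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU.
    def fake_infer():
        time.sleep(0.01)

    print(benchmark(fake_infer))
```

Running the same helper on both the trt model and the original model, with the same image and the same pre/post-processing inside the callable, gives a like-for-like comparison.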

I came across a WARNING while running the trt inference.

[TensorRT] WARNING: TensorRT was linked against CuDNN 7.6.5 but loaded against CuDNN 7.5.1
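To see which cuDNN library is actually picked up at runtime, a rough check is sketched below. It assumes the cuDNN shared library is discoverable through the system loader and calls `cudnnGetVersion()` via `ctypes`. The decoding assumes the major*1000 + minor*100 + patch encoding used by cuDNN 7.x (so 7605 means 7.6.5); other major versions may encode the number differently. The helper names are made up for this example.

```python
import ctypes
import ctypes.util


def decode_cudnn_version(version):
    """Split a cuDNN 7.x style version integer into (major, minor, patch).

    Assumes the major*1000 + minor*100 + patch encoding, e.g. 7605 -> 7.6.5.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version():
    """Return the version of the cuDNN library the loader finds, or None."""
    path = ctypes.util.find_library("cudnn")
    if path is None:
        return None  # cuDNN is not on the loader search path
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None  # library found but could not be loaded
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    print("cuDNN found by the loader:", loaded_cudnn_version())
```

If this reports 7.5.1 while TensorRT was built against 7.6.5, it matches the mismatch the warning describes.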


TensorRT Version:
GPU Type: P8 (Aws Deep learning 30.0 ubuntu 18.04 instance)
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version: 7.5.1
Operating System + Version: ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): NO pytorch used
Baremetal or Container (if container which image + tag):

CUDA and cuDNN are pre-installed in the AWS Deep Learning 30.0 Ubuntu 18.04 instance.

Hoping to hear back soon.

Hi @saikrishnadas666,
Could you please share your model and script so that I can check it on my end?

I used the YOLOv4 COCO model.
The script and the method I followed are from https://github.com/jkjung-avt/tensorrt_demos.git (conversion of YOLOv4 to a trt model).

Sorry for the late response. Are you testing this script on an AWS P2 instance or running it on the Jetson Xavier NX?
If you are using a P2 instance, it has a Tesla K80 with a compute capability of 3.7. I would recommend trying a G4 or P3 instance instead.
Please refer to the link below.

If you are running on the Jetson Xavier, could you please share the JetPack version?


I run the inference script on the Jetson Xavier NX.
JetPack 4.4

Please raise an issue at https://github.com/jkjung-avt/tensorrt_demos/issues