TensorRT model inference is slower than the original model


I converted my YOLOv4 model to a TensorRT (trt) model to run on a Jetson Xavier NX. However, when I ran inference with the trt model, it took around 2-3 seconds to run detection on a single image, which is slower than the original model, which took only about a second.
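For reference, here is a minimal sketch of how per-image latency could be measured so that one-time setup cost is not counted. The helper name `benchmark` and the `infer` callable are hypothetical and not part of the scripts I used; the sketch assumes `infer()` runs one complete detection and only returns once the results are available on the host.

```python
import time


def benchmark(infer, warmup=5, iters=20):
    """Return the mean seconds per call of infer(), excluding warm-up calls.

    `infer` is a hypothetical zero-argument callable that runs one
    complete inference and blocks until the result is ready.
    """
    # Warm-up: the first calls often include one-time costs such as
    # context creation and memory allocation, so they are not timed.
    for _ in range(warmup):
        infer()

    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start

    return elapsed / iters
```

Timing both the original model and the trt model through the same helper, with the same input image, would make the two numbers directly comparable.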

I also came across a warning while running the trt inference:

[TensorRT] WARNING: TensorRT was linked against CuDNN 7.6.5 but loaded against CuDNN 7.5.1
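The warning reports that the cuDNN version found at runtime (7.5.1) differs from the one TensorRT was linked against (7.6.5). As a small sketch, the two versions can be pulled out of such a log line and compared; the function name `parse_cudnn_mismatch` is hypothetical, and the pattern assumes the message wording shown above.

```python
import re

# Matches the wording of the warning quoted above (an assumption about
# the log format, based only on that single line).
_PATTERN = re.compile(
    r"linked against CuDNN (\d+(?:\.\d+)*) but loaded against CuDNN (\d+(?:\.\d+)*)"
)


def parse_cudnn_mismatch(line):
    """Return (linked, loaded) version tuples, or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    linked = tuple(int(part) for part in match.group(1).split("."))
    loaded = tuple(int(part) for part in match.group(2).split("."))
    return linked, loaded


warning = ("[TensorRT] WARNING: TensorRT was linked against CuDNN 7.6.5 "
           "but loaded against CuDNN 7.5.1")
linked, loaded = parse_cudnn_mismatch(warning)
print(loaded < linked)  # True: the loaded cuDNN is older than the linked one
```

Whether this mismatch is related to the slowdown is not something the sketch can tell; it only makes the version difference explicit.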


TensorRT Version:
GPU Type: P8 (Aws Deep learning 30.0 ubuntu 18.04 instance)
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version: 7.5.1
Operating System + Version: ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): NO pytorch used
Baremetal or Container (if container which image + tag):

CUDA and cuDNN came pre-installed on the AWS Deep Learning 30.0 Ubuntu 18.04 instance.

Hoping to hear back soon.

Hi @saikrishnadas666,
Could you please share your model and script so that I can check it on my end?

I used the YOLOv4 COCO model.
The script and method I followed are from https://github.com/jkjung-avt/tensorrt_demos.git (conversion of YOLOv4 to a trt model).