I converted my custom-trained Detectron2 model to a TensorRT engine and ran inference on the same 314 images with both models, in the same GPU environment. The original Detectron2 model takes 58 seconds, but the TensorRT-converted model takes 258 seconds for the same images.
As far as I understand, TensorRT should take less time, not more.
Please help me find the cause of this slowness.
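For reference, here is roughly how the inference loop is timed. This is a simplified sketch, not the exact script; the function and variable names are illustrative. One thing worth checking with any such measurement is whether one-time startup costs (engine deserialization, CUDA context creation) and asynchronous GPU work are handled, so the sketch includes warm-up iterations and an optional synchronization hook:

```python
import time

def benchmark(run, n_images, warmup=5, sync=None):
    """Time `run()` over n_images iterations.

    warmup: untimed runs that absorb one-time costs such as engine
            deserialization and CUDA context/kernel autotuning.
    sync:   optional callable (e.g. torch.cuda.synchronize) that
            flushes queued asynchronous GPU work before the clock
            is read, so we measure compute time, not launch time.
    """
    for _ in range(warmup):
        run()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(n_images):
        run()
    if sync:
        sync()
    total = time.perf_counter() - start
    return total, total / n_images

# Illustrative usage (model and image loading not shown):
#   total_s, per_image_s = benchmark(lambda: model(img), 314,
#                                    sync=torch.cuda.synchronize)
```

If the 258 seconds for the TensorRT engine were measured without warm-up, or include building/deserializing the engine inside the timed region, that alone could explain much of the gap.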
TensorRT Version: 188.8.131.52
GPU Type: Tesla T4
Nvidia Driver Version: 510.47.03
CUDA Version: 11.6
Operating System + Version: Red Hat Enterprise Linux 8.5 (Ootpa)
PyTorch Version (if applicable): 1.8
I followed the GitHub repo above for converting the model into a TensorRT engine.