I converted my custom-trained Detectron2 model to a TensorRT engine and ran inference on the same 314 images with both models, in the same GPU environment. The original Detectron2 model takes 58 seconds, but the TensorRT-converted model takes 258 seconds for the same images.
As far as I understand, TensorRT should take less time, not more.
Please help me find the cause of this slowness.
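For reference, here is roughly how the inference loop is timed. This is a simplified sketch, not the exact script; the function and variable names are illustrative. One thing worth checking with any such measurement is whether one-time startup costs (engine deserialization, CUDA context creation) and asynchronous GPU work are handled, so the sketch includes warm-up iterations and an optional synchronization hook:

```python
import time

def benchmark(run, n_images, warmup=5, sync=None):
    """Time `run()` over n_images iterations.

    warmup: untimed runs that absorb one-time costs such as engine
            deserialization and CUDA context/kernel autotuning.
    sync:   optional callable (e.g. torch.cuda.synchronize) that
            flushes queued asynchronous GPU work before the clock
            is read, so we measure compute time, not launch time.
    """
    for _ in range(warmup):
        run()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(n_images):
        run()
    if sync:
        sync()
    total = time.perf_counter() - start
    return total, total / n_images

# Illustrative usage (model and image loading not shown):
#   total_s, per_image_s = benchmark(lambda: model(img), 314,
#                                    sync=torch.cuda.synchronize)
```

If the 258 seconds for the TensorRT engine were measured without warm-up, or include building/deserializing the engine inside the timed region, that alone could explain much of the gap.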
TensorRT Version: 188.8.131.52
GPU Type: Tesla T4
Nvidia Driver Version: 510.47.03
CUDA Version: 11.6
Operating System + Version: Red Hat Enterprise Linux 8.5 (Ootpa)
PyTorch Version (if applicable): 1.8
I followed the GitHub repo above for converting the model into a TensorRT engine.