This could be due to several factors, including system setup and storage configuration. Also make sure the TensorRT version used to optimize the model is the same as the TensorRT version used for inference.
Can you provide details on the platform you are using?
Linux distro and version
GPU type
NVIDIA driver version
CUDA version
cuDNN version
Python version [if using Python]
TensorFlow version
TensorRT version
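If it helps, most of these details can be collected with a short script. This is only a sketch: it assumes a Linux system, and the TensorFlow build-info keys (`cuda_version`, `cudnn_version`) are only present in TensorFlow 2.x builds with GPU support.

```python
import subprocess

def run(cmd):
    """Run a shell command, returning its output or a 'not found' note."""
    try:
        return subprocess.check_output(
            cmd, shell=True, stderr=subprocess.DEVNULL, text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "(not found)"

# Linux distro and version
print("OS:", run("head -n 2 /etc/os-release"))
# GPU type and NVIDIA driver version
print("GPU/driver:", run(
    "nvidia-smi --query-gpu=name,driver_version --format=csv,noheader"))
# CUDA toolkit version
print("CUDA:", run("nvcc --version | tail -n 1"))
# Python version
print("Python:", run("python3 --version"))

# TensorFlow / cuDNN / TensorRT versions via their Python packages,
# if they are installed in this environment.
try:
    import tensorflow as tf
    info = tf.sysconfig.get_build_info()  # build-time CUDA/cuDNN versions
    print("TensorFlow:", tf.__version__,
          "| CUDA:", info.get("cuda_version"),
          "| cuDNN:", info.get("cudnn_version"))
except ImportError:
    print("TensorFlow: (not installed)")
try:
    import tensorrt
    print("TensorRT:", tensorrt.__version__)
except ImportError:
    print("TensorRT: (not installed)")
```

Paste the output of that into your reply and we can rule out version mismatches quickly.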
To help us debug, can you share a small repro containing the model, inference code, and sample input data that demonstrates the performance difference?
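For the repro, a simple timing harness like the one below is usually enough. This is a generic sketch, not TensorRT-specific: `benchmark` and the `np.tanh` stand-in are placeholders for your actual TensorRT and TensorFlow inference calls, so we can compare the two paths on the same input.

```python
import time
import numpy as np

def benchmark(infer_fn, sample_input, warmup=10, iters=100):
    """Return mean latency of infer_fn(sample_input) in milliseconds."""
    for _ in range(warmup):            # warm-up runs, excluded from timing
        infer_fn(sample_input)
    start = time.perf_counter()
    for _ in range(iters):
        infer_fn(sample_input)
    return (time.perf_counter() - start) / iters * 1e3

# Stand-in input and inference function; substitute your model's
# input shape and your TensorRT / TensorFlow inference calls here.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
print(f"mean latency: {benchmark(np.tanh, x):.3f} ms")
```

Running the same harness against both engines makes the comparison reproducible on our side.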