TensorRT inference is slower than the TensorFlow model

I am trying to deploy a Keras model on a Jetson TX2 and want to reduce the inference time using TensorRT. So I first converted the Keras model into a .pb model using

https://github.com/amir-abdi/keras_to_tensorflow
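
That script essentially freezes the Keras session graph into a single .pb file. For reference, a minimal TF 1.x sketch of that freezing step looks roughly like the following (the model path and output file name here are placeholders, not from my actual setup):

import tensorflow as tf
from tensorflow.python.framework import graph_util

# Put Keras into inference mode before grabbing the graph
tf.keras.backend.set_learning_phase(0)
model = tf.keras.models.load_model('model.h5')  # placeholder path

sess = tf.keras.backend.get_session()
output_names = [out.op.name for out in model.outputs]

# Convert variables to constants so the graph is self-contained
frozen_graph = graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_names)
tf.train.write_graph(frozen_graph, '.', 'frozen_model.pb', as_text=False)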

Now, I run the following code to optimize the model using TensorRT:

# TF-TRT (TF 1.x contrib API)
from tensorflow.contrib import tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,      # frozen .pb graph loaded as a GraphDef
    outputs=output_names,              # list of output node names
    max_batch_size=1,
    max_workspace_size_bytes=1 << 31,  # 2 GB TensorRT workspace
    precision_mode='FP16',
    # output_saved_model_dir='/home/nvidia/Desktop/alexnet/alexnet/'
)
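
One quick way to see how much of the graph was actually converted is to count the TRTEngineOp nodes in the returned graph; if the count is zero, everything is still running as plain TensorFlow ops. A minimal sketch of that check (not part of my script above):

# Count how many subgraphs TF-TRT replaced with TensorRT engines
trt_engine_nodes = [n for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp nodes: %d / %d total nodes'
      % (len(trt_engine_nodes), len(trt_graph.node)))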

The inference time of the optimized graph is slower than that of the original Keras model. What should I do to improve the inference time?
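
For reference, a minimal sketch of the kind of timing loop that can be used for this comparison (the input/output tensor names, input shape, and iteration count are placeholders, not taken from my actual setup):

import time
import numpy as np
import tensorflow as tf

with tf.Graph().as_default():
    tf.import_graph_def(trt_graph, name='')
    graph = tf.get_default_graph()
    inp = graph.get_tensor_by_name('input_1:0')             # placeholder name
    out = graph.get_tensor_by_name(output_names[0] + ':0')

    with tf.Session() as sess:
        dummy = np.random.rand(1, 224, 224, 3).astype(np.float32)  # placeholder shape
        sess.run(out, feed_dict={inp: dummy})  # warm-up run (excluded from timing)
        start = time.time()
        for _ in range(100):
            sess.run(out, feed_dict={inp: dummy})
        print('avg inference time: %.4f s' % ((time.time() - start) / 100))

The same loop can be pointed at the original frozen graph to get a like-for-like comparison.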

I have a similar issue. I am running two SSD networks on TensorFlow, and here are my results:

TensorFlow model: 20.81 FPS (i7, NVIDIA GTX 1050 Ti), 5.3 FPS (Jetson Nano)
TensorRT-optimized FP16 model: 20.22 FPS (i7, NVIDIA GTX 1050 Ti), 4.2 FPS (Jetson Nano)

I have also noticed that the SSD network sometimes produces a lot of false detections on the Jetson Nano, while showing no such issue on a laptop.

Were you successful in solving the problem?