TensorRT inference is slower than TensorFlow model

I am trying to deploy a Keras model on a Jetson TX2 and wanted to reduce the inference time using TensorRT. So I first converted the Keras model into a .pb model using:
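The conversion snippet didn't survive in the post; a minimal freeze sketch along these lines is typical (this is an assumption on my part, using TF 1.x-style APIs through tf.compat.v1 — the tiny Dense model and the output file name are placeholders, not the original code):

```python
import tensorflow as tf

# Freezing requires graph mode (TF 1.x-style execution)
tf.compat.v1.disable_eager_execution()

# Placeholder model; substitute your own Keras model here
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='softmax', input_shape=(4,))
])

# Grab the session Keras is using and the output node names
sess = tf.compat.v1.keras.backend.get_session()
output_names = [t.op.name for t in model.outputs]

# Convert variables to constants so the graph is self-contained,
# then serialize it to a .pb file
frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_names)
with tf.io.gfile.GFile('frozen_model.pb', 'wb') as f:
    f.write(frozen_graph.SerializeToString())
```

The frozen GraphDef and the output node names are what the TensorRT conversion step below consumes.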


Now I run the following code to optimize the model with TensorRT:

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,  # frozen GraphDef loaded from the .pb
    outputs=output_names,          # names of the model's output nodes
    precision_mode='FP16',
    #output_saved_model_dir = '/home/nvidia/Desktop/alexnet/alexnet/'
)

The inference time is slower than with the original Keras model. What should I do to improve it?

I have a similar issue. I'm running two SSD networks on TensorFlow, and here are my results:

Tensorflow Model: 20.81 FPS (i7, NVIDIA GTX 1050Ti), 5.3 FPS (Jetson Nano)
TensorRT Optimized FP16 Model: 20.22 FPS (i7, NVIDIA GTX 1050Ti), 4.2 FPS (Jetson Nano)
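For numbers like these to be comparable, the timing loop matters: the first few TensorRT iterations include one-time engine setup, so they should be excluded. A minimal FPS harness along these lines (a sketch — `run_inference` is a stand-in for one forward pass of your model, not part of the original post) is:

```python
import time

def measure_fps(run_inference, n_warmup=10, n_runs=100):
    """Time repeated calls to run_inference and return frames per second."""
    # Warm-up iterations exclude one-time costs (engine build, autotuning)
    for _ in range(n_warmup):
        run_inference()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_inference()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed

# Example with a dummy workload in place of a real forward pass
fps = measure_fps(lambda: sum(i * i for i in range(1000)))
print(f"{fps:.2f} FPS")
```

Measuring this way on both the plain TensorFlow graph and the TRT-optimized one makes it easier to tell whether the slowdown is real or a warm-up artifact.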

I have also noticed that the SSD network sometimes shows a lot of false detections on the Jetson Nano, while showing no such issue on a laptop.

Were you successful in solving the problem?