I am trying to deploy a Keras model on a Jetson TX2 and want to reduce its inference time using TensorRT. I first converted the Keras model into a frozen .pb graph.
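The freezing step was essentially the standard TF 1.x recipe, roughly like the sketch below (the model path 'model.h5' and the output file name 'frozen_model.pb' are placeholders, not my exact script):

```python
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.python.framework import graph_io
from tensorflow.python.framework.graph_util import convert_variables_to_constants

K.set_learning_phase(0)  # build the graph in inference mode
model = tf.keras.models.load_model('model.h5')  # placeholder path

session = K.get_session()
output_names = [out.op.name for out in model.outputs]

# Replace variables with constants so the graph is self-contained.
frozen_graph = convert_variables_to_constants(
    session, session.graph.as_graph_def(), output_names)
graph_io.write_graph(frozen_graph, '.', 'frozen_model.pb', as_text=False)
```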
Next, I run the following code to optimize the frozen graph with TensorRT:
```python
import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 31,
    precision_mode='FP16',
    # output_saved_model_dir='/home/nvidia/Desktop/alexnet/alexnet/'
)
```
The resulting inference time is actually slower than with the original Keras model. For reference, the sketch after this paragraph shows roughly how I measure it. What should I do to improve the inference time?
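This timing harness is a minimal sketch rather than my exact benchmark script: the tensor name 'input_1:0', the input shape, and the run count are assumptions, while trt_graph and output_names come from the snippet above. The first run is excluded from timing because it includes one-time initialization overhead:

```python
import time
import numpy as np
import tensorflow as tf

# Import the TensorRT-optimized graph into a fresh TF graph.
with tf.Graph().as_default() as graph:
    tf.import_graph_def(trt_graph, name='')

# Placeholder tensor names; adjust to the actual model's input/output ops.
input_tensor = graph.get_tensor_by_name('input_1:0')
output_tensor = graph.get_tensor_by_name(output_names[0] + ':0')

with tf.Session(graph=graph) as sess:
    dummy = np.random.rand(1, 224, 224, 3).astype(np.float32)  # assumed input shape
    # Warm-up run: excluded from timing (one-time initialization cost).
    sess.run(output_tensor, feed_dict={input_tensor: dummy})
    start = time.time()
    for _ in range(100):
        sess.run(output_tensor, feed_dict={input_tensor: dummy})
    print('avg inference time: %.4f s' % ((time.time() - start) / 100))
```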