No speed up in inference on jetson agx with tensorflow 2 and tensorRT


I converted a TensorFlow 2 model (in SavedModel format) to a TensorRT model, but there was no speed gain. I tried models with both SSD and FasterRCNN architectures, with the same result: no improvement in speed. The model input is (1, 256, 1280, 3), and the inference speed I get is 3 FPS.

I tried two conversion scripts, one from the Nvidia tutorials:

import numpy as np
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

output_saved_model_dir = ''
input_saved_model_dir = ''

conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
conversion_params = conversion_params._replace(max_workspace_size_bytes=(30000000000))
conversion_params = conversion_params._replace(precision_mode="FP16")
conversion_params = conversion_params._replace(maximum_cached_engines=100)
conversion_params = conversion_params._replace(is_dynamic_op=True)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir)

And one from the TensorFlow tutorials:

import tensorflow as tf

output_saved_model_dir = ''
input_saved_model_dir = ''

params = tf.experimental.tensorrt.ConversionParams(
    precision_mode='FP16',
    maximum_cached_engines=100,
    is_dynamic_op=True,
    max_workspace_size_bytes=30000000000)

converter = tf.experimental.tensorrt.Converter(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=params)
converter.convert()
converter.save(output_saved_model_dir)

Both produced a model in SavedModel format, but there was no gain in speed.

The Jetson AGX is flashed with JetPack 4.4, and the TensorFlow version is 2.3.1.

The inference code is from a TensorFlow sample.
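For reference, this is roughly how I measure the FPS (a sketch, not the exact sample code; the model directory is a placeholder, and the first call is excluded from timing since TF-TRT may still be building engines on it):

```python
import time

def fps_from_times(times):
    """Average frames per second over a list of per-frame latencies (seconds)."""
    return len(times) / sum(times)

def benchmark(saved_model_dir, n_runs=50):
    """Load a (TF-TRT converted) SavedModel and measure steady-state FPS."""
    import numpy as np
    import tensorflow as tf

    model = tf.saved_model.load(saved_model_dir)
    infer = model.signatures['serving_default']

    # Dummy frame matching the model input (1, 256, 1280, 3)
    dummy = tf.constant(np.zeros((1, 256, 1280, 3), dtype=np.uint8))
    infer(dummy)  # warm-up: the first call may still trigger TensorRT engine builds

    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer(dummy)
        times.append(time.perf_counter() - start)
    return fps_from_times(times)
```

On the Jetson this would be called as e.g. `print(benchmark('converted_model_dir'))`.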

Maybe there is something I need to do to the model before conversion, or some parameters of the conversion script I should change?

Also, is anyone running TensorFlow 2 with TensorRT on their Jetson AGX who can share which model they use and how many FPS they get?

Thanks for the replies.


Please note that not all TensorFlow operations can be converted to TensorRT.
Here is the list of supported operations:

It's recommended to first check the ratio of non-supported layers in your model.
You can find the details in the following document:
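As a rough check, something like the following can tally the op types in the SavedModel's serving graph, so you can compare them against the supported-op list (a sketch; the SavedModel path is a placeholder, and `convert_variables_to_constants_v2` is used here only to get a plain GraphDef to walk):

```python
from collections import Counter

def count_op_types(op_names):
    """Tally occurrences of each op type from an iterable of op-type names."""
    return Counter(op_names)

def saved_model_op_counts(saved_model_dir):
    """Count op types in a SavedModel's serving graph (requires TensorFlow)."""
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2)

    model = tf.saved_model.load(saved_model_dir)
    func = model.signatures['serving_default']
    frozen = convert_variables_to_constants_v2(func)
    return count_op_types(node.op for node in frozen.graph.as_graph_def().node)
```

Usage, for example: `counts = saved_model_op_counts('converted_model_dir')`, then `counts.most_common(10)` to see the dominant op types and `counts.get('TRTEngineOp', 0)` to see how many TensorRT engine nodes the conversion actually produced.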

Based on our latest benchmark data, SSD Mobilenet-V1 should reach 1919 fps with the standalone TensorRT API on Xavier.


Thanks for the reply,

Do I understand correctly that TF2 supports only 5 operations?


Also, I checked the nodes in my model.
The unconverted model has 70 nodes; the TensorRT-converted model has the same 70 nodes plus 1 TRTEngineOp node, so 71 nodes in total.

So does this mean it's not possible to get a speed-up with TensorFlow 2 and TensorRT?


Based on this, it's recommended to use the standalone TensorRT API.
TensorRT supports both SSD and FasterRCNN, but TF-TRT may not.

You can find some examples of the conversion in our GitHub:
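For example, one common route (the paths and opset number below are illustrative, not taken from the linked sample) is to export the SavedModel to ONNX with tf2onnx and then build and profile an FP16 engine with trtexec:

```shell
# Export the TF2 SavedModel to ONNX (requires the tf2onnx package)
python3 -m tf2onnx.convert --saved-model ./saved_model --output model.onnx --opset 11

# Build an FP16 TensorRT engine and report inference timings
# (trtexec ships with JetPack under /usr/src/tensorrt/bin)
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --saveEngine=model.plan
```

trtexec prints per-inference latency and throughput, which gives you the pure-TensorRT numbers to compare against the 3 FPS you see through TF-TRT.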