No performance improvement for TensorFlow TensorRT (TF-TRT) model converted on Jetson Xavier NX

Hi,

We are trying to convert a TensorFlow Faster R-CNN model from the SavedModel format to TensorRT FP32/FP16.

We were able to convert the models successfully; however, inference times did not show any improvement. We used the following code to convert:

import tensorflow
from tensorflow.python.compiler.tensorrt import trt_convert as trt

tf_saved_model_dir = "./infer_model_512Imsize/saved_model"

converter = trt.TrtGraphConverter(input_saved_model_dir=tf_saved_model_dir,
                                  max_batch_size=1,
                                  max_workspace_size_bytes=4294965097,   # 4GB
                                  precision_mode='FP16',
                                  minimum_segment_size=3,
                                  is_dynamic_op=False,
                                  maximum_cached_engines=1,
                                  use_calibration=False)

# Build the TF-TRT optimized graph, then save it as a SavedModel.
converter.convert()
converter.save("./trt_converted_models/FRCNN_FP16_512Imsize_100Proposals")

The inference times are:

  1. TensorFlow SavedModel - 136 ms
  2. TF-TRT FP16 model - 143 ms
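
For reference, below is a minimal sketch of one way such timings can be measured, with warm-up runs so one-time initialization is excluded from the average. It assumes TF 1.15 and Object Detection API style tensor names ("image_tensor:0", "detection_boxes:0") plus a 512x512 uint8 input; those names and shapes are assumptions and may differ for your export.

import time
import numpy as np
import tensorflow as tf

saved_model_dir = "./trt_converted_models/FRCNN_FP16_512Imsize_100Proposals"
image = np.random.randint(0, 255, size=(1, 512, 512, 3), dtype=np.uint8)  # assumed input shape/dtype

with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    tf.compat.v1.saved_model.loader.load(sess, ["serve"], saved_model_dir)
    # Warm-up runs so one-time setup (CUDA context, engine initialization) is not timed.
    for _ in range(5):
        sess.run("detection_boxes:0", feed_dict={"image_tensor:0": image})
    runs = 50
    start = time.time()
    for _ in range(runs):
        sess.run("detection_boxes:0", feed_dict={"image_tensor:0": image})
    print("Mean inference time: %.1f ms" % ((time.time() - start) / runs * 1000.0))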

Thanks and regards,

Krishna

Hi,

In TF-TRT, any layer that TensorRT does not support falls back to its TensorFlow implementation.
If the model frequently switches between TensorFlow and TensorRT, the data transfer overhead will be high.
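
As a quick check, you can count how many TRTEngineOp nodes TF-TRT actually created in your converted SavedModel. This is a minimal sketch, assuming TF 1.15 and the output directory used above; only a few TRTEngineOp nodes alongside many remaining native TensorFlow nodes usually indicates frequent switching between the two runtimes.

import tensorflow as tf

trt_saved_model_dir = "./trt_converted_models/FRCNN_FP16_512Imsize_100Proposals"

with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    # Load the converted SavedModel and inspect its GraphDef.
    meta_graph = tf.compat.v1.saved_model.loader.load(sess, ["serve"], trt_saved_model_dir)
    graph_def = meta_graph.graph_def
    trt_engine_nodes = [n for n in graph_def.node if n.op == "TRTEngineOp"]
    print("TRTEngineOp nodes:", len(trt_engine_nodes))
    print("Total graph nodes:", len(graph_def.node))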

You can find the support matrix of TF-TRT below:

https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#tf-115-20

This information is also printed to the console during conversion.

Thanks.