TF-TRT vs TensorRT

LoveNvidia · January 17, 2020, 5:44pm

Hi,
I found that a TensorFlow model can be optimized in several ways. Please correct me if I am mistaken.

1- Using https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html (TF-TRT). This API ships with TensorFlow and integrates TensorRT into TensorFlow. It is imported as:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

This API can be applied to any TensorFlow model (new or old) without conversion errors, because layers it does not support are not converted into TensorRT engines; they stay in the TensorFlow graph and run on TensorFlow. Right?
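The fallback behaviour described in point 1 can be pictured with a toy partitioner. This is only a sketch under simplifying assumptions: the model is treated as a linear list of op names, and `SUPPORTED_OPS` and `partition` are hypothetical names, not part of TF-TRT. The `min_segment_size` argument mirrors the idea behind TF-TRT's `minimum_segment_size` option, where very short supported runs are left in TensorFlow.

```python
from typing import List, Set, Tuple

# Hypothetical set of ops the TensorRT side can handle. The real list lives in
# NVIDIA's support matrix and depends on the TensorRT version.
SUPPORTED_OPS: Set[str] = {"Conv2D", "BiasAdd", "Relu", "MaxPool"}


def partition(ops: List[str],
              supported: Set[str] = SUPPORTED_OPS,
              min_segment_size: int = 3) -> List[Tuple[str, List[str]]]:
    """Split a linear op sequence into ("TRT", [...]) and ("TF", [...]) runs."""
    # Step 1: group consecutive ops by the backend that could run them.
    segments: List[Tuple[str, List[str]]] = []
    for op in ops:
        backend = "TRT" if op in supported else "TF"
        if segments and segments[-1][0] == backend:
            segments[-1][1].append(op)
        else:
            segments.append((backend, [op]))

    # Step 2: supported runs shorter than min_segment_size stay in TensorFlow.
    demoted = [
        ("TF", run) if backend == "TRT" and len(run) < min_segment_size else (backend, run)
        for backend, run in segments
    ]

    # Step 3: merge neighbouring TF runs created by the demotion step.
    merged: List[Tuple[str, List[str]]] = []
    for backend, run in demoted:
        if merged and merged[-1][0] == backend:
            merged[-1][1].extend(run)
        else:
            merged.append((backend, list(run)))
    return merged
```

In this toy model, an unsupported op never causes an error; it just ends the current TensorRT segment and the surrounding ops keep running in TensorFlow. The real converter works on a graph rather than a list, so its segmentation is more involved.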

2- Using https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#overview (TensorRT). This API is developed by NVIDIA and is independent of the TensorFlow library (not integrated into TensorFlow). It is imported as:

import tensorrt as trt

To use this API, we must first convert the TensorFlow graph to an intermediate format such as UFF or ONNX (for example with the UFF converter) and then parse that graph with this API.
In this case, if the TensorFlow graph has unsupported layers, we must use plugins or custom code for them, right?

3- When working with TensorFlow models, why would we use the UFF/ONNX converter and then parse the result with TensorRT, when we can use the TF-TRT API directly? Have you tested both methods on a TensorFlow model to see whether they give the same performance? What is the advantage of the UFF/ONNX converter method?

I have some questions about the two cases above:
4- I converted ssd_mobilenet_v2 both ways. In case 1 I got a slight speedup, but in case 2 I got a larger one. Why?
My guess is that in case 1 the API only changes the precision (FP32 to FP16) and fuses layers where possible, whereas in case 2 the UFF conversion also cleans the graph, for example by removing redundant nodes such as Assert and Identity, before it is converted to a TensorRT graph. Right?
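The kind of graph cleanup guessed at in point 4 can be illustrated on a toy graph. This is a sketch only: the dictionary-based graph format and the `strip_inference_noise` helper are hypothetical and are not the UFF converter's actual implementation. It drops Assert nodes and rewires consumers of Identity nodes to the Identity's producer.

```python
from typing import Dict, List

# Toy graph format (an assumption for illustration):
# {node_name: {"op": str, "inputs": [node_name, ...]}}
Graph = Dict[str, dict]


def strip_inference_noise(graph: Graph) -> Graph:
    """Return a copy of the graph without Assert and Identity nodes."""

    def resolve(name: str) -> str:
        # Follow chains of Identity nodes back to the node that produced the data.
        while graph[name]["op"] == "Identity":
            name = graph[name]["inputs"][0]
        return name

    cleaned: Graph = {}
    for name, node in graph.items():
        if node["op"] in ("Assert", "Identity"):
            continue  # these nodes do no useful work at inference time
        inputs: List[str] = [
            resolve(i) for i in node["inputs"] if graph[i]["op"] != "Assert"
        ]
        cleaned[name] = {"op": node["op"], "inputs": inputs}
    return cleaned
```

The sketch does not prune nodes that only fed an Assert, so a real cleanup pass would need an extra reachability step from the output nodes.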

5- When we convert trained model files such as .ckpt and .meta to a frozen inference graph (.pb file), are these nodes not removed from the graph? Are only things like loss and optimizer state removed?

AastaLLL · January 20, 2020, 12:59pm

Hi,

1. Yes. TF-TRT converts the supported layers into TensorRT.
For unsupported layers, it falls back to the original TensorFlow implementation.

2. The pipeline should look like .pb → .uff → TensorRT engine.
We recommend checking our support matrix first:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html
For unsupported layers, yes, you will need to implement them with our plugin API.

3. TensorFlow usually performs poorly on Jetson, in particular because of the large amount of memory it requires.
With TF-TRT, although some of the layers are accelerated by TensorRT, the overall interface is still TensorFlow (data input/output, …).
Pure TensorRT is expected to give you much better performance.

4. Please see point 3.

5. Yes.
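To check the performance difference from points 3 and 4 on your own model, a small timing harness is enough. The sketch below is an assumption about how one might measure it, not an official benchmark: `mean_latency_ms` is a hypothetical helper, and `infer` stands for any zero-argument callable that runs one inference, for example a wrapper around a TF-TRT session run or a TensorRT execution call.

```python
import time
from typing import Callable


def mean_latency_ms(infer: Callable[[], object],
                    warmup: int = 10,
                    iters: int = 100) -> float:
    """Return the mean wall-clock time of one infer() call in milliseconds."""
    # Warm-up runs are excluded so one-time costs (engine build, memory
    # allocation, caches) do not distort the average.
    for _ in range(warmup):
        infer()

    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters
```

Run it once per backend with the same input and batch size, and make sure the callable waits for the GPU to finish before returning, otherwise only the launch time is measured.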

Thanks.
