Export to frozen inference graph

Is it possible (or will it be) to export a custom-trained .tlt (or .etlt) model to a conventional TensorFlow frozen inference graph (.pb) in order to make inferences with traditional TF tools? How can I do that?

Hi ignacio,
Sorry, but currently our workflow is only compatible with DeepStream or https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps.
See https://devtalk.nvidia.com/default/topic/1065558/transfer-learning-toolkit/trt-engine-deployment/ for more info.

Hi, thanks for your answer, I understand.

If my research is correct, there are mainly two ways to run inference with an image-classification model on the Jetson platform:

On one hand, I can run inference on an *.etlt model over what you call ‘Native TRT’, i.e. the TensorRT engine (C++) optimised to run on the GPU and available through the DS4 SDK. On the other hand, I can run inference over a proper backend, for example TensorFlow with the TF frozen graph *.pb.

It is also possible to optimise the TF *.pb frozen model with the recent TF-TRT integration, but then it will not run over the native TRT engine, and I do not need DS4, only a TF Python app implementation. Am I right?
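
For context, this is roughly how I am doing that TF-TRT optimisation (a minimal sketch under TF 1.x; the frozen-graph path and the output node name are placeholders, not the actual names from my model):

# Minimal TF-TRT sketch (TF 1.x): optimise a frozen graph with TensorRT.
# 'frozen_model.pb' and 'predictions/Softmax' are placeholder names.
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF-TRT integration (TF 1.x)

with tf.gfile.GFile('frozen_model.pb', 'rb') as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Replace TRT-compatible subgraphs with TRTEngineOp nodes.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=['predictions/Softmax'],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 26,
    precision_mode='FP16')  # FP16 seems a reasonable choice for the Nano

with tf.gfile.GFile('trt_graph.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())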

If this conclusion is correct, can I convert a TensorFlow frozen graph *.pb to a native TRT model *.etlt using the Transfer Learning Toolkit (TLT) or another available tool, so that it can run over the native TRT engine? How could that be done?

I ask these questions because I am dealing with two issues right now on a Jetson Nano:

The first one is the time spent in

tf.import_graph_def(trt_graph, name='')

where the *.pb graph is imported. I read that the Python implementation of protobuf can be inefficient (https://devtalk.nvidia.com/default/topic/1046492/tensorrt/extremely-long-time-to-load-trt-optimized-frozen-tf-graphs/2), so instead of the Python protobuf implementation they suggest using the C++ implementation to accelerate the start-up process.

The second one is the time spent running inference on a [240, 240, 3] image. Since I am inferring over live video, I need at least 4 fps, but right now I cannot reach that on this architecture (each inference takes ~800 ms). So I would like to know how to fully convert my *.pb TensorFlow frozen graph (which I have previously optimised with the TF-TRT integration) to native TRT, to get rid of the TensorFlow backend. If this is not possible, I suspect I will need to start transfer learning again over your available pretrained classification models using TLT, to get an *.etlt that can run over the native TRT engine.
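
To make the numbers concrete, I measure the per-frame latency roughly like this (a sketch; the tensor names are placeholders for my actual pipeline):

# Rough latency measurement for a single [240, 240, 3] frame (TF 1.x).
# 'input:0' and 'predictions/Softmax:0' are placeholder tensor names.
import time
import numpy as np
import tensorflow as tf

with tf.gfile.GFile('trt_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    x = graph.get_tensor_by_name('input:0')
    y = graph.get_tensor_by_name('predictions/Softmax:0')
    frame = np.random.rand(1, 240, 240, 3).astype(np.float32)

    sess.run(y, feed_dict={x: frame})            # warm-up run
    start = time.time()
    sess.run(y, feed_dict={x: frame})
    print('inference: %.1f ms' % ((time.time() - start) * 1000.0))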

I appreciate in advance any comment, verification or correction you could make on my previous conclusions and questions.

BR.

Hi Ignacio,
In TLT, it is not supported for end users to convert a TensorFlow pb file to an etlt model.
In order to get an etlt model, it is necessary to follow the TLT process and train on your own data.

Hi Morganh, thanks for your answer.

Ok, but *.etlt isn’t the same thing as stand-alone TensorRT, right? I found in the TF-TRT user guide https://docs.nvidia.com/deeplearning/frameworks/pdf/TensorFlow-TensorRT-User-Guide.pdf that it is possible to generate a stand-alone TensorRT plan (section 2.10, “How To Generate A Stand-Alone TensorRT Plan”). This would run over the TensorRT C++ engine with no need to have TensorFlow installed, right? Would this improve inference performance versus TensorFlow optimised with the TF-TRT integration?
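
If I read section 2.10 correctly, the idea is roughly the following (a sketch based on my reading of that guide; the node/attribute and file names are my understanding, not something I have verified on my model):

# Sketch: extract the serialized TensorRT engine(s) embedded in a
# TF-TRT optimised graph, so they can be used as stand-alone .plan files.
# Assumes the graph was converted with TF-TRT and contains TRTEngineOp nodes.
import tensorflow as tf

with tf.gfile.GFile('trt_graph.pb', 'rb') as f:
    trt_graph = tf.GraphDef()
    trt_graph.ParseFromString(f.read())

for i, node in enumerate(trt_graph.node):
    if node.op == 'TRTEngineOp':
        # The serialized engine lives in the node's 'serialized_segment' attr.
        with open('engine_%d.plan' % i, 'wb') as plan:
            plan.write(node.attr['serialized_segment'].s)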

Thank you.

BR.

Hi Ignacio,
TLT provides a tool, tlt-converter, to generate a TRT engine from an etlt model.
Please see the user guide or the TLT docker container’s Jupyter notebook for more details.
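
As a rough illustration only (not taken from the user guide; the engine path, input data and single input/output binding layout are assumptions), an engine produced by tlt-converter can be deserialized and run with the TensorRT Python API, without TensorFlow:

# Sketch: run a serialized TRT engine with the TensorRT Python API + PyCUDA.
# 'model.engine' and the binding layout are assumptions for illustration;
# check your own engine's bindings and preprocessing.
import numpy as np
import pycuda.autoinit            # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

with open('model.engine', 'rb') as f:
    runtime = trt.Runtime(TRT_LOGGER)
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Allocate host/device buffers for each binding (batch size 1 assumed).
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = np.zeros(trt.volume(shape), dtype=dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# Copy a (preprocessed) image into the input buffer, run, copy back.
host_bufs[0][:] = np.random.rand(host_bufs[0].size).astype(host_bufs[0].dtype)
cuda.memcpy_htod(dev_bufs[0], host_bufs[0])
context.execute(batch_size=1, bindings=bindings)
cuda.memcpy_dtoh(host_bufs[1], dev_bufs[1])
print('output:', host_bufs[1][:5])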