Great. So there are several approaches to running inference on a TensorFlow graph in TensorRT.
- Using Python and a format called UFF, following instructions like the ones in the GitHub repo you linked earlier: https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
The section under “detection” shows how to run the model for inference. There are further instructions for this technique at
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#tensorflowworkflow and a sample at https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#mnist_uff_sample.
A common feature of all of these is that you use the TensorFlow Python environment to drive the model, but the model is fundamentally being run in TensorRT. Once you’ve exported to UFF, the only part of the TensorFlow environment you are still using is the Python interpreter. This means that you can also use the same (or very similar) commands in a Python interpreter that doesn’t have the TensorFlow module loaded.
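For what it's worth, here is a rough sketch of the UFF route in Python. It assumes the legacy TensorRT Python API (the `uff` module, `tensorrt.parsers.uffparser`, and `trt.utils.uff_to_trt_engine`) that shipped around the same time as the docs linked above; newer TensorRT releases changed this API. The function name and its parameters are mine, purely for illustration, and you would need to supply your own frozen graph and node names.

```python
def build_engine_from_frozen_graph(frozen_pb_path, input_name, input_chw,
                                   output_name, max_batch_size=1,
                                   max_workspace_bytes=1 << 20):
    """Convert a frozen TensorFlow graph to UFF and build a TensorRT engine.

    Hypothetical helper: assumes the legacy TensorRT Python API.
    """
    # Imported lazily so this file still loads on machines without TensorRT.
    import uff
    import tensorrt as trt
    from tensorrt.parsers import uffparser

    # Step 1: frozen TensorFlow graph -> serialized UFF model.
    uff_model = uff.from_tensorflow_frozen_model(frozen_pb_path, [output_name])

    # Step 2: tell the UFF parser which nodes are the network input and output.
    parser = uffparser.create_uff_parser()
    parser.register_input(input_name, input_chw, 0)  # 0: channels-first order
    parser.register_output(output_name)

    # Step 3: let TensorRT build an optimized engine from the UFF model.
    logger = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)
    return trt.utils.uff_to_trt_engine(logger, uff_model, parser,
                                       max_batch_size, max_workspace_bytes)
```

You would call it along the lines of `build_engine_from_frozen_graph("model.pb", "Placeholder", (1, 28, 28), "fc2/Relu")`, where the file name and node names are placeholders for whatever your own graph uses.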
Once you’ve used the tools above to get the model running in TensorRT, you can also save the engine to a file using trt.utils.write_engine_to_file(). That file can be re-read using just the C++ interface (no Python required). Basically, you open the file using standard C/C++, deserialize the engine, and then call the enqueue() method. This can be done in a really lightweight standalone C++ application or, of course, integrated into a larger C++ application.
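As a minimal sketch of the save step on the Python side, again assuming the legacy TensorRT Python API: `trt.utils.write_engine_to_file()` takes a path and the serialized engine, and `trt.utils.load_engine()` reads it back, which is a quick way to check the file before handing it to a C++ application. The two wrapper functions are hypothetical names of mine.

```python
def save_engine(engine, path):
    """Serialize a built TensorRT engine and write it to `path`."""
    # Imported lazily so this file still loads on machines without TensorRT.
    import tensorrt as trt
    trt.utils.write_engine_to_file(path, engine.serialize())


def load_engine(path):
    """Read a serialized engine back from `path` (a Python-side sanity check)."""
    import tensorrt as trt
    logger = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)
    return trt.utils.load_engine(logger, path)
```

The file written this way is the same one the C++ side would open and deserialize, so the Python reload is only a convenience.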
All of the above techniques rely on TensorRT to perform all of the inference. This works when all the model layers are supported in TensorRT. If you want to mix custom TensorFlow operators with TensorRT graph execution, there is a new technique available in TensorFlow 1.7.
This gives you the best of both worlds: the flexibility of TensorFlow and much of the performance of TensorRT. To learn more, check out these two blogs
With this approach, running inference is just the same as running inference in TensorFlow: TensorFlow takes care of running TensorRT on the appropriate sub-graph for you.
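Here is a rough sketch of what that looks like, assuming the `tensorflow.contrib.tensorrt` module from TensorFlow 1.7 and its `create_inference_graph()` function. The helper names, default sizes, and tensor names are my own placeholders, so treat this as an outline under those assumptions.

```python
def optimize_with_tftrt(frozen_graph_def, output_names, max_batch_size=1,
                        max_workspace_bytes=1 << 30, precision_mode="FP32"):
    """Return a GraphDef in which supported sub-graphs are replaced by TensorRT ops."""
    # Imported lazily so this file still loads without TensorFlow installed.
    import tensorflow.contrib.tensorrt as tftrt
    return tftrt.create_inference_graph(
        input_graph_def=frozen_graph_def,
        outputs=output_names,
        max_batch_size=max_batch_size,
        max_workspace_size_bytes=max_workspace_bytes,
        precision_mode=precision_mode,
    )


def run_inference(graph_def, input_name, output_name, batch):
    """Run the optimized graph like any other TensorFlow 1.x graph."""
    import tensorflow as tf
    # Assumption: importing the contrib module registers the TensorRT op
    # that the optimized graph refers to.
    import tensorflow.contrib.tensorrt  # noqa: F401

    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name="")
    with tf.Session(graph=graph) as sess:
        return sess.run(output_name + ":0",
                        feed_dict={input_name + ":0": batch})
```

Unsupported operators stay as ordinary TensorFlow nodes, so the calling code does not change whether or not any sub-graph was actually converted.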
Hope this helps!