Speeding up Deep Learning Inference Using TensorFlow, ONNX, and TensorRT

Originally published at: https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorflow-onnx-and-tensorrt/

Starting with TensorRT 7.0, the Universal Framework Format (UFF) is being deprecated. In this post, you learn how to deploy TensorFlow trained deep learning models using the new TensorFlow-ONNX-TensorRT workflow. Figure 1 shows the high-level workflow of TensorRT. Figure 1. TensorRT is an inference accelerator. First, a network is trained using any framework. After a…

I am not able to run:
import engine as eng

I am getting:
ModuleNotFoundError: No module named 'engine'

What should I install for this?

Can you check this? Download the code examples in this post.

In the section “Creating the TensorRT engine from ONNX”, please copy the code into a file named engine.py; then you can import that file.
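
Python resolves `import engine` by searching the directories on `sys.path`, which normally includes the directory of the script being run, so nothing needs to be installed once engine.py sits next to the calling script. Below is a minimal sketch for checking whether a module is visible before importing it. The helper name `module_available` is hypothetical and is not part of the post's code.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `import name` would find a top-level module on sys.path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # 'engine' is only found when engine.py is in a directory on sys.path,
    # for example the directory of the script being run.
    print("engine importable:", module_available("engine"))
```

If this prints False, engine.py is most likely saved in a different directory from the script that imports it.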

Can the code for doing inference using TensorRT not work with a Flask API?

An error is raised at stream = cuda.Stream()
with the error message:
pycuda._driver.LogicError: explicit_context_dependent failed: invalid device context - no currently active context?

When I tried to add:
cfx = cuda.Device(0).make_context()

do inference


new errors show up.

Any idea how to solve it?
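
The "no currently active context" message typically means that the thread calling cuda.Stream() has no current CUDA context, and a Flask request handler can run on a different thread from the one that created the context. One common pattern is to create the context once and then push it on the handler thread before any CUDA call and pop it afterwards. The sketch below assumes that `ctx` is the object returned by `cuda.Device(0).make_context()`, which exposes `push()` and `pop()`. The helper name `active_context` is hypothetical, and since the new errors are not shown, this may not address them.

```python
from contextlib import contextmanager


@contextmanager
def active_context(ctx):
    """Make `ctx` current on the calling thread for the duration of the block."""
    ctx.push()
    try:
        yield ctx
    finally:
        # Always release the context, even if inference raises.
        ctx.pop()


# Hypothetical usage with pycuda (assumption, not executed here):
#   import pycuda.driver as cuda
#   cuda.init()
#   cfx = cuda.Device(0).make_context()  # also makes the context current here
#   cfx.pop()                            # release it once setup is done
#
#   # inside the Flask request handler:
#   with active_context(cfx):
#       stream = cuda.Stream()
#       # do inference
```

Popping the context right after creation matters because make_context() leaves it current on the creating thread; without the matching pop, the push and pop calls become unbalanced.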


I already have the .onnx files for these models: InceptionV1, V3 and V4.

How much would the scripts engine.py, buildEngine.py, and inference.py have to be changed?

Have you done any example with those models?

Thank you

If you already have .onnx files, you can follow the same workflow, but you need to adapt the scripts to your own models.
Alternatively, for latency benchmarking, you can try the trtexec tool; see https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/trtexec/README.md#example-4-running-an-onnx-model-with-full-dimensions-and-dynamic-shapes
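
As a rough sketch of the trtexec route, the helper below assembles a command line from an .onnx path and optional input shapes. The function name `build_trtexec_cmd`, the file name, the input tensor name `input`, and the 1x3x299x299 shape are all hypothetical placeholders; check the actual input names and shapes of your model. The `--explicitBatch` flag is included on the assumption that TensorRT 7.x is used, as in the linked README example.

```python
import shlex


def build_trtexec_cmd(onnx_path, shapes=None, explicit_batch=True, save_engine=None):
    """Build a trtexec argument list for benchmarking an ONNX model.

    shapes: optional dict mapping input tensor name -> list of dimensions.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if explicit_batch:
        cmd.append("--explicitBatch")
    if shapes:
        specs = []
        for name, dims in shapes.items():
            dim_text = "x".join(str(d) for d in dims)
            specs.append(f"{name}:{dim_text}")
        cmd.append("--shapes=" + ",".join(specs))
    if save_engine:
        cmd.append(f"--saveEngine={save_engine}")
    return cmd


if __name__ == "__main__":
    # Hypothetical model file and input name; adjust to your own model.
    cmd = build_trtexec_cmd("inception_v3.onnx", shapes={"input": [1, 3, 299, 299]})
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run on a machine where trtexec is installed and on the PATH.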

Will your code work with TensorRT 7.1?

I am looking at the onnx-tensorrt Git repository, and that is where I found this image.

Have you done the same in your code? Or is it better to install a version lower than 7.1 to reuse your code?

Additional info: in this post, we used onnx 1.6.0 (opset 11).

Thanks. What about the TensorFlow version?

For reference, this is the Dockerfile I used:

FROM nvcr.io/nvidia/tensorflow:20.03-tf1-py3
WORKDIR /workspace
ADD requirements.txt .
RUN pip install -r requirements.txt
# docker build -t tf:20.03-tf1-py3 .
# docker run -it -u $(id -u):$(id -g) -v $(pwd):/workspace --rm tf:20.03-tf1-py3 bash



Keras ==2.3.1