Speeding up Deep Learning Inference Using TensorFlow, ONNX, and TensorRT

Originally published at: https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorflow-onnx-and-tensorrt/

Starting with TensorRT 7.0, the Universal Framework Format (UFF) is being deprecated. In this post, you learn how to deploy TensorFlow trained deep learning models using the new TensorFlow-ONNX-TensorRT workflow. Figure 1 shows the high-level workflow of TensorRT. Figure 1. TensorRT is an inference accelerator. First, a network is trained using any framework. After a…

I am not able to run:
import engine as eng

I get:
ModuleNotFoundError: No module named 'engine'

What should I install for this?

Can you check this? Download the code examples in this post.

In the section “Creating the TensorRT engine from ONNX”, copy the code into a file named engine.py, saved next to your script; you can then import that file.
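
To illustrate the suggestion above: "import engine" succeeds only if a file named engine.py is on Python's module search path, which always includes the directory of the script being run. Here is a minimal sketch that writes a stand-in engine.py into a temporary directory and imports it. The stub build_engine body is hypothetical and only takes the place of the code copied from the post.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Placeholder module contents. In practice, engine.py holds the code copied
# from the post; this stub is a hypothetical stand-in for it.
STUB_SOURCE = '''
def build_engine(onnx_path, shape):
    """Hypothetical stand-in for the build_engine function from the post."""
    return ("engine-for", onnx_path, tuple(shape))
'''


def import_local_engine(directory):
    """Write engine.py into `directory` and import it as a module."""
    directory = Path(directory)
    (directory / "engine.py").write_text(STUB_SOURCE)
    # The running script's own directory is on sys.path automatically;
    # any other directory has to be added explicitly.
    if str(directory) not in sys.path:
        sys.path.insert(0, str(directory))
    importlib.invalidate_caches()
    sys.modules.pop("engine", None)  # drop any stale copy before importing
    return importlib.import_module("engine")


workdir = tempfile.mkdtemp()
eng = import_local_engine(workdir)
print(eng.__file__)
print(eng.build_engine("model.onnx", [1, 224, 224, 3]))
```

Nothing needs to be installed with pip for this import; the file only has to sit where Python can find it.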

Does the TensorRT inference code not work inside a Flask API?

An error is raised at stream = cuda.Stream()
with this message:
pycuda._driver.LogicError: explicit_context_dependent failed: invalid device context - no currently active context?

When I tried adding:
cfx = cuda.Device(0).make_context()

do inference


new errors appear.

Any idea how to solve it?
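
One possible approach, offered as a sketch rather than a confirmed fix: a CUDA context is current only on the thread where it was made current, while a web server typically handles requests on other threads. Routing every inference call through a single worker thread that creates and owns the context avoids the mismatch. In the sketch below, context_factory and infer_fn are hypothetical callables supplied by you; with PyCUDA, context_factory could be lambda: cuda.Device(0).make_context(), and the engine, stream, and buffers would also be created on the worker thread.

```python
import queue
import threading


class InferenceWorker:
    """Runs every inference call on one thread that owns the device context."""

    def __init__(self, context_factory, infer_fn):
        # context_factory: hypothetical callable that creates the context,
        #   e.g. lambda: cuda.Device(0).make_context() with PyCUDA.
        # infer_fn: hypothetical callable that performs the actual inference.
        self._context_factory = context_factory
        self._infer_fn = infer_fn
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        ctx = self._context_factory()  # the context is created on this thread
        try:
            while True:
                job = self._jobs.get()
                if job is None:  # shutdown signal
                    break
                data, reply = job
                try:
                    reply.put((True, self._infer_fn(data)))
                except Exception as exc:  # hand errors back to the caller
                    reply.put((False, exc))
        finally:
            ctx.pop()  # release the context on the thread that created it

    def infer(self, data):
        """Callable from any request thread; blocks until the result is ready."""
        reply = queue.Queue(maxsize=1)
        self._jobs.put((data, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        """Stop the worker thread and wait for it to release the context."""
        self._jobs.put(None)
        self._thread.join()
```

A Flask view would then call worker.infer(batch) instead of touching CUDA directly. The trade-off is that requests are served one at a time by the worker.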


I already have the .onnx files for these models: InceptionV1, V3 and V4.

How much will the scripts engine.py, buildEngine.py, and inference.py have to be changed?

Have you done any example with those models?

Thank you

If you already have .onnx files, you can follow the same workflow, but you will need to adapt the scripts to your models yourself.
Alternatively, for latency benchmarking, you can try the ‘trtexec’ tool; see https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/trtexec/README.md#example-4-running-an-onnx-model-with-full-dimensions-and-dynamic-shapes

Will your code work with TensorRT 7.1?

I am looking at the onnx-tensorrt Git repository, and that is where I found this image.

Have you done the same in your code? Or is it better to install a version lower than 7.1 to reuse your code?

Additional info: in this post, we used onnx 1.6.0 (opset 11).

Thanks. What about the TensorFlow version?

For reference, this is the Dockerfile I used:

FROM nvcr.io/nvidia/tensorflow:20.03-tf1-py3
WORKDIR /workspace
ADD requirements.txt .
RUN pip install -r requirements.txt
# docker build -t tf:20.03-tf1-py3 .
# docker run -it -u $(id -u):$(id -g) -v $(pwd):/workspace --rm tf:20.03-tf1-py3 bash



Keras ==2.3.1


These code snippets do not work with TF2.0. Could you please fix them, especially loadResNet.py?


I am using the TX2, which has shared memory.
Does your code use Unified Memory? If not, can you give me some pointers on how to implement it using PyCUDA?

Thank you

Hi! We have added the TF2 code example in the post.

Has anyone run this? The code has errors:

def load_engine(trt_runtime, plan_path):
   with open(engine_path, 'rb') as f:
       engine_data = f.read()
   engine = trt_runtime.deserialize_cuda_engine(engine_data)
   return engine

engine_path does not exist. Is this supposed to be plan_path?

@loophole64 – Good catch! I’ve made that fix.
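
For reference, a sketch of the function with the name mismatch resolved, assuming plan_path was the intended name as suggested above. The runtime is passed in by the caller; with TensorRT it would presumably be created as trt.Runtime(trt.Logger(trt.Logger.WARNING)). The stand-in runtime class below is hypothetical and exists only so the sketch runs without a GPU.

```python
import os
import tempfile


def load_engine(trt_runtime, plan_path):
    """Read a serialized engine file and deserialize it with the runtime."""
    # plan_path is the parameter name, so it is also the name opened here.
    with open(plan_path, "rb") as f:
        engine_data = f.read()
    return trt_runtime.deserialize_cuda_engine(engine_data)


class StandInRuntime:
    """Hypothetical stand-in for a TensorRT runtime, for illustration only."""

    def deserialize_cuda_engine(self, data):
        return ("deserialized", len(data))


# Write a few placeholder bytes to a temporary .plan file and load them back.
fd, demo_path = tempfile.mkstemp(suffix=".plan")
with os.fdopen(fd, "wb") as fh:
    fh.write(b"not-a-real-engine")
result = load_engine(StandInRuntime(), demo_path)
os.remove(demo_path)
print(result)
```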

Unsupported ONNX data type: UINT8 (2)
[TensorRT] ERROR: [network.cpp::getInput::1589] Error Code 3: Internal Error (Parameter check failed at: optimizer/api/network.cpp::getInput::1589, condition: index < getNbInputs()

I am getting this error while building the model.

Hi sumeshthkr1,
TensorRT does not natively support the UINT8 type. It supports INT8 for calibration, but not UINT8.

I have a .pb file, an .onnx file, and an engine.py file saved with my ONNX model shape.
But when I try to create the engine with the following code, I get an error:
import engine as eng
#from engine import build_engine
import argparse
from onnx import ModelProto
import tensorrt as trt

engine_name = "engine.plan"
onnx_path = "/home/hipe/Documents/code/training/model/model.onnx"
batch_size = 1

model = ModelProto()
with open(onnx_path, "rb") as f:
    model.ParseFromString(f.read())

d0 = model.graph.input[0].type.tensor_type.shape.dim[1].dim_value
d1 = model.graph.input[0].type.tensor_type.shape.dim[2].dim_value
#d2 = model.graph.input[0].type.tensor_type.shape.dim[3].dim_value
shape = [batch_size , d0, d1]# d2]
engine = eng.build_engine(onnx_path, shape= [110,16])
eng.save_engine(engine, engine_name)

AttributeError: module 'engine' has no attribute 'build_engine'

I have tried "import engine" and "from engine import build_engine", as well as other variants, but I am not able to proceed.

Specifically, I am getting this error at this line: engine = eng.build_engine(onnx_path, shape= [110,16])

I would appreciate any help resolving this error.
Thanks in advance!
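
One common cause of this kind of AttributeError, offered as a guess and not a diagnosis: Python may be importing a different module named engine (another file or installed package with that name), or the local engine.py may not define build_engine at the top level. The sketch below reports where a module is loaded from and which public names it defines. The helper name inspect_module is hypothetical.

```python
import importlib
import importlib.util


def inspect_module(module_name, attr_name):
    """Report where a module is loaded from and whether it defines attr_name."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return {"found": False, "origin": None, "has_attr": False, "public_names": []}
    module = importlib.import_module(module_name)
    names = [n for n in dir(module) if not n.startswith("_")]
    return {
        "found": True,
        "origin": spec.origin,  # path of the file that Python actually imports
        "has_attr": hasattr(module, attr_name),
        "public_names": names,
    }


# Demonstrated on a standard-library module so the sketch runs anywhere;
# for the case above, the call would be inspect_module("engine", "build_engine").
report = inspect_module("json", "dumps")
print(report["origin"], report["has_attr"])
```

If the reported origin is not the engine.py you wrote, or build_engine is missing from the public names, that points to the file to fix or rename.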