Speeding up Deep Learning Inference Using TensorFlow, ONNX, and TensorRT

Originally published at: https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorflow-onnx-and-tensorrt/

Starting with TensorRT 7.0, the Universal Framework Format (UFF) is being deprecated. In this post, you learn how to deploy TensorFlow-trained deep learning models using the new TensorFlow-ONNX-TensorRT workflow. Figure 1 shows the high-level workflow of TensorRT. Figure 1. TensorRT is an inference accelerator. First, a network is trained using any framework. After a…

I am not able to do:

import engine as eng

I am getting:

ModuleNotFoundError: No module named 'engine'

What should I install for this?

Can you check this? Download the code examples in this post.

In the section “Creating the TensorRT engine from ONNX”, copy the code into a file named engine.py; then you can import that file.
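For reference, a minimal sketch of what that engine.py ends up containing, assuming the TensorRT 7.x Python API (check the post section for the exact code; input shape and paths here are placeholders):

```python
# engine.py -- sketch of building a TensorRT engine from an ONNX file (TRT 7.x API).
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, shape=(1, 224, 224, 3)):
    """Parse an ONNX model and build a TensorRT engine with explicit batch."""
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(explicit_batch) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30  # 1 GB of scratch space
        with open(onnx_path, 'rb') as f:
            if not parser.parse(f.read()):
                raise RuntimeError(parser.get_error(0))
        # Fix the input shape (placeholder; use your model's real shape)
        network.get_input(0).shape = shape
        return builder.build_cuda_engine(network)

def save_engine(engine, plan_path):
    """Serialize the engine to a .plan file for later deserialization."""
    with open(plan_path, 'wb') as f:
        f.write(engine.serialize())
```

This requires a machine with an NVIDIA GPU and TensorRT installed; it will not run without them.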


Does the code for doing inference using TensorRT not work with a Flask API?

An error is raised at stream = cuda.Stream() with the message:

pycuda._driver.LogicError: explicit_context_dependent failed: invalid device context - no currently active context?

When I tried to add:

cfx = cuda.Device(0).make_context()

do inference

cfx.pop()

new errors show up. Any idea how to solve this?

Hi

I already have the .onnx files for these models: InceptionV1, V3 and V4.

How much will the scripts engine.py, buildEngine.py, and inference.py have to be changed?

Have you done any example with those models?

Thank you

Hi
If you already have .onnx files, you can follow the same workflow; you just need to adapt the scripts to your own models.
Or, for a latency benchmark, you can try the ‘trtexec’ tool; see https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/trtexec/README.md#example-4-running-an-onnx-model-with-full-dimensions-and-dynamic-shapes
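As a sketch of what that README example looks like, an invocation along these lines builds and times an ONNX model with dynamic shapes (the input name "input" and the shapes are placeholders; use the names and dimensions from your own model, and check the README for the flags in your TensorRT version):

```shell
# Benchmark an ONNX model with a dynamic-shape profile (placeholder names/shapes).
trtexec --onnx=model.onnx \
        --minShapes=input:1x3x224x224 \
        --optShapes=input:8x3x224x224 \
        --maxShapes=input:32x3x224x224 \
        --shapes=input:8x3x224x224
```

This requires the trtexec binary that ships with TensorRT.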


Will your code work with TensorRT 7.1?

I am checking the onnx-tensorrt GitHub repo, and that is where I found this image.

Have you done the same in your code? Or is it better to install a version lower than 7.1 to reuse your code?

Adding info: in this post, we used ONNX 1.6.0 (opset 11).

Thanks, what about the TensorFlow version?


For reference, this is the Dockerfile I used:

FROM nvcr.io/nvidia/tensorflow:20.03-tf1-py3
WORKDIR /workspace
ADD requirements.txt .
RUN pip install -r requirements.txt
# docker build -t tf:20.03-tf1-py3 .
# docker run -it -u $(id -u):$(id -g) -v $(pwd):/workspace --rm tf:20.03-tf1-py3 bash

requirements.txt:


keras
keras2onnx
onnx==1.6.0
pycuda
tf2onnx
tensorrt

Keras==2.3.1
keras2onnx==1.6.0
onnx==1.6.0
pycuda==2019.1.2
tf2onnx==1.6.0

tensorrt 7.0.0.11

These code snippets do not work with TF 2.0. Can you please rectify them? Especially loadResNet.py.


Hello

I am using the TX2, which has memory shared between CPU and GPU.
Is your code using Unified Memory? If not, can you give me some clues on how to implement it using PyCUDA?

Thank you

Hi! We have added the TF2 code example in the post.
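Not the snippet from the post, but as a sketch: with TF2, a SavedModel is typically converted to ONNX via the tf2onnx command-line tool before building the TensorRT engine (the paths here are placeholders; opset 11 matches the ONNX version used in this thread):

```shell
# Convert a TF2 SavedModel directory to an ONNX file (placeholder paths).
python -m tf2onnx.convert --saved-model ./saved_model_dir \
    --opset 11 --output model.onnx
```

This requires the tf2onnx package (pip install tf2onnx) and a SavedModel on disk.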

Has anyone run this? The code has errors:

def load_engine(trt_runtime, plan_path):
   with open(engine_path, 'rb') as f:
       engine_data = f.read()
   engine = trt_runtime.deserialize_cuda_engine(engine_data)
   return engine

engine_path does not exist. Is this supposed to be plan_path?

@loophole64 – Good catch! I’ve made that fix.
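For anyone landing here later, the corrected function simply reads from the parameter it is given:

```python
def load_engine(trt_runtime, plan_path):
    # Read the serialized engine (.plan) from the path passed in,
    # rather than the undefined name engine_path.
    with open(plan_path, 'rb') as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    return engine
```

Here trt_runtime is a tensorrt.Runtime instance created by the caller.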
