How to serialize a model and do inference in Python

I’ve been following the Deep Learning SDK documentation to learn how to use TensorRT on the TX2.

What I want to do is load a model from ONNX (converted from MXNet), build a TensorRT engine from it, save the engine, and then run inference. The following is my current code (inference excluded):

import tensorrt as trt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

model_path = './model.onnx'

# create the builder, network, and ONNX parser
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

# read the ONNX model and populate the network
with open(model_path, 'rb') as model:
    parser.parse(model.read())

builder.max_batch_size = 5
builder.max_workspace_size = 1 << 20  # 1 MiB of workspace

# build the engine and serialize it to disk
engine = builder.build_cuda_engine(network)

with open('sample.engine', 'wb') as f:
    f.write(engine.serialize())

However, the engine turns out to be None, and I don’t know what is wrong or what to do next. The code in the Deep Learning SDK documentation is not very clear to me.

Many thanks!
Wenbin Xu

Hello,

Which specific Deep Learning SDK documentation are you referencing? And what do you mean by “engine seems to be None”? To help us debug, can you provide details on the platform you are using?

Linux distro and version
GPU type
NVIDIA driver version
CUDA version
cuDNN version
Python version [if using Python]
TensorFlow version
TensorRT version

Any usage or source files you can provide will also help us debug.

Hi NVES,

This is the documentation I referred to:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#python_topics

When execution reaches engine.serialize() (the last line), an error is raised saying that engine is of NoneType. That means builder.build_cuda_engine(network) returned None, so I’m not able to proceed.

details on the platforms:

Linux distro and version: Ubuntu 18.04 (from JetPack 4.2)
GPU type: Tegra TX2
NVIDIA driver version: cannot find (installed via JetPack 4.2)
CUDA version: 10.0
cuDNN version: 7.3.1
Python version [if using Python]: Python 3.6
TensorFlow version: not used
TensorRT version: 5.0.6-1

Thanks

I had the same issue. The problem is that the parser may fail silently; see this post: https://devtalk.nvidia.com/default/topic/1048404/parser-error-output/?offset=3#5340034
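
For reference, here is a minimal sketch of how the parse and build steps can be checked, assuming the TensorRT 5 Python API (parse() returns False on failure, and the parser records the errors it hit):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

with open('./model.onnx', 'rb') as model:
    # parse() returns False if the model could not be parsed
    if not parser.parse(model.read()):
        # print every error the parser recorded before giving up
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError('ONNX parsing failed')

builder.max_batch_size = 5
builder.max_workspace_size = 1 << 20

engine = builder.build_cuda_engine(network)
if engine is None:
    # build_cuda_engine() returns None on failure rather than raising
    raise RuntimeError('engine build failed; check the logger output')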

BTW, you should try out onnx-tensorrt (https://github.com/onnx/onnx-tensorrt), which takes care of some of the busywork around TensorRT.
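
With onnx-tensorrt, inference can be as short as the following sketch (based on the project’s README; the input shape (1, 3, 224, 224) is just a placeholder, substitute your model’s actual input shape):

import numpy as np
import onnx
import onnx_tensorrt.backend as backend

model = onnx.load('./model.onnx')
# prepare() builds a TensorRT engine from the ONNX graph under the hood
engine = backend.prepare(model, device='CUDA:0')
# placeholder input shape; use your model's real input dimensions
input_data = np.random.random(size=(1, 3, 224, 224)).astype(np.float32)
output = engine.run(input_data)[0]
print(output)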

Hi daniel:

Thank you for your suggestion; I’ll give it a try.

My model does not contain a slice op, just conventional CNN layers. I assumed the code in the tutorial would work for my case, but it somehow fails.

Hi!

Did you manage to solve this issue?

Thank you

Svetlana