How to serialize a model and do inference in Python

I’ve been following the Deep Learning SDK documentation to learn how to use TensorRT on the TX2.

What I want to do is load a model from ONNX (converted from MXNet), build a TensorRT engine from it, save the engine, and then run inference. The following is my current code (inference excluded):

import tensorrt as trt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

model_path = './model.onnx'

# create the builder, network, and ONNX parser
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

# read the ONNX model and populate the network
with open(model_path, 'rb') as model:
    parser.parse(model.read())

builder.max_batch_size = 5
builder.max_workspace_size = 1 << 20  # 1 MiB of workspace

# build the engine and serialize it to disk
engine = builder.build_cuda_engine(network)

with open('sample.engine', 'wb') as f:
    f.write(engine.serialize())

However, the engine turns out to be None, and I don’t know what is wrong or what to do next. The code in the Deep Learning SDK documentation is not very clear to me.

Many thanks!
Wenbin Xu

Hello,

Which specific Deep Learning SDK documentation are you referencing? And what do you mean by “engine seems to be None”? To help us debug, can you provide details on the platform you are using?

Linux distro and version
GPU type
NVIDIA driver version
CUDA version
cuDNN version
Python version [if using Python]
TensorFlow version
TensorRT version

Any usage or source files you can provide will also help us debug.

Hi NVES,

This is the documentation I referred to:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#python_topics

When execution reaches engine.serialize() (the last line), an error is raised saying that engine is of NoneType. That means builder.build_cuda_engine(network) returned None, so I’m not able to proceed.

details on the platforms:

Linux distro and version: Ubuntu 18.04 (from JetPack 4.2)
GPU type: Tegra TX2
NVIDIA driver version: cannot find (installed via JetPack 4.2)
CUDA version: 10.0
cuDNN version: 7.3.1
Python version [if using Python]: Python 3.6
TensorFlow version: not used
TensorRT version: 5.0.6-1

Thanks

I had the same issue. The problem is that the parser may fail silently; see this post: https://devtalk.nvidia.com/default/topic/1048404/parser-error-output/?offset=3#5340034
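
For reference, here is a minimal sketch of how the parse and build steps can be checked, assuming the TensorRT 5 Python API (parse() returns False on failure, and the parser records the errors it hit):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

with open('./model.onnx', 'rb') as model:
    # parse() returns False if the model could not be parsed
    if not parser.parse(model.read()):
        # print every error the parser recorded before giving up
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError('ONNX parsing failed')

builder.max_batch_size = 5
builder.max_workspace_size = 1 << 20

engine = builder.build_cuda_engine(network)
if engine is None:
    # build_cuda_engine() returns None on failure rather than raising
    raise RuntimeError('engine build failed; check the logger output')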

BTW, you should try out onnx-tensorrt (https://github.com/onnx/onnx-tensorrt), which takes care of some of the busywork around TensorRT.
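
With onnx-tensorrt, inference can be as short as the following sketch (based on the project’s README; the input shape (1, 3, 224, 224) is just a placeholder, substitute your model’s actual input shape):

import numpy as np
import onnx
import onnx_tensorrt.backend as backend

model = onnx.load('./model.onnx')
# prepare() builds a TensorRT engine from the ONNX graph under the hood
engine = backend.prepare(model, device='CUDA:0')
# placeholder input shape; use your model's real input dimensions
input_data = np.random.random(size=(1, 3, 224, 224)).astype(np.float32)
output = engine.run(input_data)[0]
print(output)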

Hi daniel:

Thank you for your suggestion; I’ll give it a try.

My model does not contain a slice op, just conventional CNN layers. I assumed the code in the tutorial would work for my case, but it somehow fails.

Hi!

Did you manage to solve this issue?

Thank you

Svetlana