Importing an ONNX model for performing inference using TensorRT


I have written a Python program that builds an inference engine from an ONNX model on a “Jetson Nano 2GB” board.

Previously, I created a notebook for training a neural network with TensorFlow and Keras in order to make predictions on the MNIST dataset (handwritten digits).

After the model was trained, I exported it using the ONNX format.

However, when the Python program tries to create the inference engine, an error message related to the specification of the input shape appears.

I am studying the TensorRT Developer Guide, but I have not been able to find out how to solve the error.

I attach the Python program (4.5 KB)

If necessary, I can also share the notebook used for creating and saving the ONNX model from the MNIST Keras model.



Do you see an error like " Unsupported ONNX data type: UINT8 "?

This is a known issue: the exported ONNX model uses UINT8 as its input data type, while TensorRT expects FP32.
You can fix this issue by modifying the input data type with our ONNX GraphSurgeon API.

Please check the comment below for the detailed information:


Thank you for your answer.

In fact, my ONNX model is generated with INT64 weights, and the ONNX parser tries to cast them down to INT32. A warning message related to this conversion is shown when I run the program “”.

However, I’m afraid that the errors have more to do with the input shape.

With this in mind, I have modified the source code of the program that makes the prediction, but it still does not work.

I attach the modified program, and the Google Colaboratory notebook I wrote for generating the ONNX file model (as a PDF file). This notebook trains the MNIST model and exports it to ONNX format.

In the Colab notebook, the statement that performs the conversion of the saved model to ONNX format is:

proc = ('python -m tf2onnx.convert --saved-model MNIST_Keras '
        '--output MNIST_Keras.onnx --opset 12').split(),

I don’t know whether the statement should be given more parameters in order to generate the ONNX model with INT32 weights and to solve the errors related to the input shape.
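For what it's worth, tf2onnx can override an input shape at conversion time with the --inputs name[shape] option, which would pin the batch dimension so the exported network is not dynamic. A sketch in the same style as the notebook (the tensor name input:0 is an assumption; the actual name is printed in the converter's log):

```python
# Sketch: pin the input shape during conversion so the exported ONNX
# model is static. "input:0" is a placeholder tensor name.
proc = ('python -m tf2onnx.convert --saved-model MNIST_Keras '
        '--inputs input:0[1,28,28,1] '
        '--output MNIST_Keras.onnx --opset 12').split()
```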

Finally, I also show the error messages:

josemari@nano:~/Documents/Py/MNIST_ONNX$ python3 -d data
WARNING: data/mnist does not exist. Trying data instead.
Parsing ONNX file and building engine…
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.
[TensorRT] ERROR: Network validation failed.
Engine creation failed. Exiting…

That’s all. I look forward to your answer. Thank you. (4.6 KB)
tf2onnx_MNIST.ipynb.pdf (95.8 KB)


ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

This error indicates that the model is defined with dynamic shapes, but you didn’t provide the input dimensions when building the engine.
To solve this, please run trtexec with --explicitBatch and also specify the input dimensions.
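For example, a trtexec invocation along those lines, assembled in the same Python style as the notebook (the tensor name "input" and the file name are placeholders; check the real input name in the model):

```python
# Sketch: build a trtexec command line; --shapes supplies the concrete
# dimensions for the dynamic input tensor.
proc = ('trtexec --onnx=MNIST_Keras.onnx --explicitBatch '
        '--shapes=input:1x28x28x1').split()
```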

For more details, please check below comment:



Thanks for your response. Searching the web for the term --explicitBatch, I found a solution to my issue:

After the network is imported from the ONNX model, its input shape must be defined using a specific statement (see below).

def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(common.EXPLICIT_BATCH) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        with open(model_file, 'rb') as model:
            success = parser.parse(model.read())
        if not success:
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            return None
        # Define the static input shape after parsing the ONNX model
        network.get_input(0).shape = [28, 28, 1]
        return builder.build_cuda_engine(network)

I have executed my Python program with this modification, and now it works.
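One note on using the engine: with the input shape fixed to [28, 28, 1], each MNIST image must be cast from uint8 to float32 (and typically scaled) before it is copied to the device. A minimal sketch with NumPy (the normalization to [0, 1] is an assumption about how the network was trained):

```python
import numpy as np

def prepare_input(image):
    # image: a 28x28 uint8 MNIST digit
    arr = image.astype(np.float32) / 255.0   # cast and scale to [0, 1]
    return arr.reshape(28, 28, 1)            # match the engine's input shape

# Example with a dummy digit
digit = np.zeros((28, 28), dtype=np.uint8)
inp = prepare_input(digit)
```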