Importing an ONNX model for performing inference using TensorRT


I have written a Python program that builds an inference engine from an ONNX model on a “Jetson Nano 2GB” board.

Previously, I created a notebook for training a neural network with TensorFlow and Keras in order to make predictions on the MNIST dataset (handwritten digits).

After the model was trained, I exported it using the ONNX format.

However, when the Python program tries to create the inference engine, an error message related to the specification of the input shape appears.

I am studying the TensorRT Developer Guide, but I have not been able to find out how to solve the error.

I attach the Python program (4.5 KB)

If necessary, I can also share the notebook used for creating and saving the ONNX model from the MNIST Keras model.



Do you see an error like " Unsupported ONNX data type: UINT8 "?

This is a known issue: the exported ONNX model uses UINT8 as its input data type, while TensorRT expects FP32.
You can fix this issue by modifying the input data type with our ONNX GraphSurgeon API.

Please check the comment below for the detailed information:


Thank you for your answer.

In fact, my ONNX model is generated with INT64 weights, and the ONNX parser tries to cast them down to INT32. A warning message related to this conversion is shown when I run the program “”.

However, I’m afraid that the errors have more to do with the input shape.

With this in mind, I have modified the source code of the program that makes the prediction, but it still does not work.

I attach the modified program, and the Google Colaboratory notebook I wrote for generating the ONNX file model (as a PDF file). This notebook trains the MNIST model and exports it to ONNX format.

In the Colab notebook, the statement that performs the conversion of the saved model to ONNX format is:

proc = ('python -m tf2onnx.convert --saved-model MNIST_Keras '
        '--output MNIST_Keras.onnx --opset 12').split(),

I don’t know whether the statement should be given more parameters in order to generate the ONNX model with INT32 weights and to solve the errors related to the input shape.
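For what it's worth, tf2onnx can override an input shape at conversion time with the --inputs name[shape] option, which would pin the batch dimension so the exported network is not dynamic. A sketch in the same style as the notebook (the tensor name input:0 is an assumption; the actual name is printed in the converter's log):

```python
# Sketch: pin the input shape during conversion so the exported ONNX
# model is static. "input:0" is a placeholder tensor name.
proc = ('python -m tf2onnx.convert --saved-model MNIST_Keras '
        '--inputs input:0[1,28,28,1] '
        '--output MNIST_Keras.onnx --opset 12').split()
```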

Finally, I also show the error messages:

josemari@nano:~/Documents/Py/MNIST_ONNX$ python3 -d data
WARNING: data/mnist does not exist. Trying data instead.
Parsing ONNX file and building engine…
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.
[TensorRT] ERROR: Network validation failed.
Engine creation failed. Exiting…

That’s all. I look forward to your answer. Thank you. (4.6 KB)
tf2onnx_MNIST.ipynb.pdf (95.8 KB)


ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

This error indicates that the model is defined with dynamic shapes, but you didn’t provide the input dimensions when building the engine.
To solve this, please run trtexec with --explicitBatch and also specify the input dimensions.
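For example, a trtexec invocation along those lines, assembled in the same Python style as the notebook (the tensor name "input" and the file name are placeholders; check the real input name in the model):

```python
# Sketch: build a trtexec command line; --shapes supplies the concrete
# dimensions for the dynamic input tensor.
proc = ('trtexec --onnx=MNIST_Keras.onnx --explicitBatch '
        '--shapes=input:1x28x28x1').split()
```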

For more details, please check below comment:



Thanks for your response. Searching the web for the term --explicitBatch, I found a solution to my issue:

After the network is imported from the ONNX model, its input shape must be defined using a specific statement (see below).

def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(common.EXPLICIT_BATCH) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        with open(model_file, 'rb') as model:
            success = parser.parse(model.read())
        if not success:
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            return None
        # Define the static input shape after parsing the ONNX model
        network.get_input(0).shape = [28, 28, 1]
        return builder.build_cuda_engine(network)

I have executed my Python program with this modification, and now it works.
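One note on using the engine: with the input shape fixed to [28, 28, 1], each MNIST image must be cast from uint8 to float32 (and typically scaled) before it is copied to the device. A minimal sketch with NumPy (the normalization to [0, 1] is an assumption about how the network was trained):

```python
import numpy as np

def prepare_input(image):
    # image: a 28x28 uint8 MNIST digit
    arr = image.astype(np.float32) / 255.0   # cast and scale to [0, 1]
    return arr.reshape(28, 28, 1)            # match the engine's input shape

# Example with a dummy digit
digit = np.zeros((28, 28), dtype=np.uint8)
inp = prepare_input(digit)
```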