I have a question concerning an 8-bit quantization flow. Currently I use the PyTorch quantization toolkit to quantize the network and PyTorch to export it to ONNX. Finally, I import the ONNX file into TensorRT using the C++ API and build an inference engine.
However, this approach leads to input shape constraints, since the ONNX file holds a graph for the specific input shape that was used while exporting it.
I’m wondering whether the input shape can be changed after importing the graph into TensorRT. Or, alternatively, could I save the weights and quantization scales in HDF5 format, load them into C++, and then set the input shape before inference?