I want to convert my model to a lightweight (quantized) ONNX model


I would like to use your software (TensorRT).
I have a problem I'm trying to solve, but I haven't been able to make it work, so I have a question.

My understanding is that TensorRT provides both model lightweighting and compilation.

Right now I do not need the compilation step; I only need the model-lightweighting technique. I will need the compilation step later.

That is why I want to use only the model-lightweighting technology.

I wonder if there is a way to lighten model.onnx using your software and get a model_quant.onnx model back.

So far, no matter how much I search, I cannot find such a feature.

Thanks in advance for your reply.


TensorRT Version: 8.4
GPU Type: A6000
Nvidia Driver Version: 510
CUDA Version: 11.4
CUDNN Version: 8
Operating System + Version:
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Please share the ONNX model and the script, if not already shared, so that we can assist you better.
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below.


import onnx

filename = "model.onnx"  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging (an example invocation is shown below).
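For reference, a typical trtexec invocation for this check might look like the following; the model file name is an assumption, and trtexec ships with the TensorRT installation:

trtexec --onnx=model.onnx --verbose

Adding --saveEngine=model.trt (and optionally --fp16) to the same command would also build and save an engine, which is the compilation step discussed below.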

Thank you for the answer.

But I don't think the script I would send you is needed here.

Here’s what I was talking about:

  1. model.onnx
  2. TensorRT quantization + optimizer = model.trt
  3. model_quant_optimizer.onnx

Starting from an ONNX file, I want to produce an ONNX file with quantization and optimization applied, using only the quantization and optimizer functions of TensorRT.
I don't want the .trt file.
The workflow that uses the .trt file will be used in my next project.
So my current question is whether it is possible to create such an ONNX file using only the quantization and optimizer functions of TensorRT.

If the above is not possible, I also wonder whether a .trt file can be converted back to an ONNX file.

Thank you.


I think neither of the above is possible; you ultimately need to port the ONNX model to TensorRT (a sketch of that workflow is shown below).
In case you're looking to train models at reduced precision,
please refer to the TensorFlow-Quantization Toolkit and the PyTorch Quantization Toolkit, as mentioned here.
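For context, "porting" here means parsing the ONNX file with TensorRT and building a serialized engine. Below is a minimal sketch using the TensorRT 8.x Python API; the file names (model.onnx, model.trt) and the FP16 flag are assumptions for illustration:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

# Parse the ONNX model (file name assumed).
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

# Build a reduced-precision engine; quantization/optimization happen during this step.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # requires a GPU with FP16 support

serialized_engine = builder.build_serialized_network(network, config)
with open("model.trt", "wb") as f:
    f.write(serialized_engine)

The output is the .trt engine that TensorRT runs; as noted above, it cannot be converted back into an ONNX file.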

Thank you.