I want to get a lightweight (quantized) model converted to ONNX

Description

Hello.
I would like to use your software (TensorRT).
I have a problem I am trying to solve, but I have not been able to get it working, so I have a question.

As I understand it, your software has functions for both reducing model weight and compiling.

Currently I do not need the compile function; only a technique for reducing the weight of the model is required. The compile function will be needed later.

That is why I want to use only the model lightweighting technology.

I wonder if there is a way to lighten the model.onnx model using your software and get a model_quant.onnx model back.

So far, no matter how much I search, I cannot find that feature.

Thanks in advance for your reply.

Environment

TensorRT Version: 8.4
GPU Type: A6000
Nvidia Driver Version: 510
CUDA Version: 11.4
CUDNN Version: 8
Operating System + Version:
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Hi,
Could you share the ONNX model and the script, if not shared already, so that we can assist you better?
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx

# Usage: python check_model.py your_model.onnx
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command; a sample invocation is shown below.
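
For example, a minimal invocation might look like this (model.onnx is a placeholder for your model's path):

trtexec --onnx=model.onnx --verbose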

In case you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
Thanks!

Thank you for the answer.

However, I don't think you need the script I was going to share.

Here is what I was talking about:

  1. Start with model.onnx.
  2. Apply TensorRT quantization + optimizer (which normally produces model.trt).
  3. Get model_quant_optimizer.onnx back instead.

Starting from an ONNX file, I want to produce an ONNX file with quantization and optimization applied, using only those functions of TensorRT.
I don't want the .trt file.
The method of using the .trt file will be used in the next project.
I am currently wondering whether it is possible to create an ONNX file using only the quantization and optimizer functions of TensorRT.

If the above method does not work, I wonder whether it is possible to convert a .trt file back into an ONNX file.

Thank you.

Hi,

I don't think either of those is possible; you ultimately need to port the ONNX model to TensorRT.
In case you're looking to train models at reduced precision, please refer to the TensorFlow-Quantization toolkit and the PyTorch Quantization Toolkit, as mentioned here.
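
For reference, here is a minimal sketch of how the PyTorch Quantization Toolkit (pytorch-quantization) can export a quantized ONNX model. The resnet18 model, input shape, and opset version below are illustrative assumptions only, not part of your setup:

import torch
import torchvision
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn

# Replace supported torch layers with quantized equivalents;
# must run before the model is instantiated
quant_modules.initialize()

# resnet18 is only a stand-in for your own model
model = torchvision.models.resnet18(pretrained=True).eval()

# ... calibration / quantization-aware fine-tuning would go here ...

# Switch the quantizers to fake-quantize ops the ONNX exporter understands
quant_nn.TensorQuantizer.use_fb_fake_quant = True

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model_quant.onnx", opset_version=13)

The resulting model_quant.onnx carries explicit Q/DQ nodes, which TensorRT can consume later when you do build the .trt engine.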

Thank you.