Hello,
I would like to use your TensorRT software. I have a problem I'm trying to solve, and a question about it.
As I understand it, TensorRT provides both model lightweighting (quantization) and compilation. At the moment I do not need the compilation step, only the model lightweighting; compilation will be needed later. That is why I want to use only the model lightweighting technology.
Is there a way to use your software to lighten a model.onnx model and get a model_quant.onnx model back? No matter how much I search, I cannot find that feature.
Thanks in advance for your reply.
Environment
TensorRT Version: 8.4
GPU Type: A6000
Nvidia Driver Version: 510
CUDA Version: 11.4
CUDNN Version: 8
Operating System + Version:
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Hi,
Could you share the ONNX model and the script, if you have not already, so that we can assist you better?
In the meantime, you can try a few things:
1) Validate your model with the snippet below:
check_model.py
import onnx

# Replace with the path to your ONNX model.
filename = "model.onnx"
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command.
If you are still facing the issue, please share the trtexec --verbose log for further debugging.
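A minimal sketch of the trtexec invocation described above, assuming trtexec is on your PATH and model.onnx is a placeholder for your model file:

```shell
# Parse the ONNX model, build a TensorRT engine, and capture a verbose log.
trtexec --onnx=model.onnx --saveEngine=model.trt --verbose > trtexec_verbose.log 2>&1
```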
Thanks!
I don't think you need the script, though. Here is what I was talking about:
What TensorRT does: model.onnx + quantization + optimizer = model.trt
What I want: model.onnx + quantization + optimizer = model_quant_optimizer.onnx
I want to produce an ONNX file with quantization and optimization applied, using only the quantization and optimizer functions of TensorRT on an ONNX file.
I don't want the .trt file; the .trt workflow will be used in the next project.
So I am currently wondering whether it is possible to create an ONNX file using only the quantization and optimizer functions of TensorRT.
If that is not possible, is it possible to convert a .trt file back to an ONNX file?
I'm afraid neither of the above is possible. You ultimately need to port the ONNX model to TensorRT.
In case you're looking to train models at reduced precision, please refer to the TensorFlow-Quantization toolkit and the PyTorch Quantization Toolkit as mentioned here.