I would like to get the TOP1 accuracy by quantizing an ONNX model with INT8 calibration, using JPEG validation images and the TensorRT C++ API
Environment
TensorRT Version: 8.4 EA
GPU Type: Jetson AGX Orin
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version: 8.3.2.49
Operating System + Version: Ubuntu 20.04 LTS
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Baremetal on Jetson AGX Orin
Hello, I would like a very general sample of how to calibrate an ONNX model in INT8 and run inference to get the TOP1 accuracy. I have already looked at the samples below, but none of them is clear and each has a different implementation:
SampleINT8
SampleINT8API
SampleOnnxMNIST
But my images are in JPEG format, not in the MNIST format, so I can’t use the MNISTBatchStream class as the samples do.
So could you please provide me with an example, or if you don’t have one, point me to something clear that I can use to solve my problem?
Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:
1) Validate your model with the snippet below:
check_model.py
import onnx

filename = "yourONNXmodel.onnx"  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid
2) Try running your model with the trtexec command.
In case you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
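For reference, a typical invocation might look like the following (the model path is a placeholder):
trtexec --onnx=yourONNXmodel.onnx --int8 --verbose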
Thanks!
Thank you for your reply, but I have already seen these. As for the ONNX model, it is not that important because I need something general, but I will share a model with you anyway; please have a look below: D_resnet18-v1-7.onnx (44.7 MB)
As for trtexec, I already use it, but it does not report the TOP1 and TOP5 accuracy, so it is not a solution for me.
NOTE: I am using JetPack 5.0.1 DP, which ships TensorRT 8.4 EA.
No, there’s no “built-in” way to do this. You can do it yourself in multiple ways: a CPU-side top-K calculation on the outputs of the engine, modifying the parsed network and adding a TRT TopK layer through the TRT API, editing the ONNX graph itself using ONNX-GraphSurgeon to add a TopK node, etc. The first two options are sketched below.
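For the CPU-side option, here is a minimal sketch of a top-1 counter, assuming the engine outputs raw class scores that have been copied back to host memory; the function name countTop1 and its parameters are hypothetical:

#include <algorithm>
#include <vector>

// Count correct top-1 predictions in one batch.
// scores: flattened engine output (batchSize * numClasses floats).
// labels: ground-truth class index for each image in the batch.
int countTop1(const std::vector<float>& scores,
              const std::vector<int>& labels,
              int batchSize, int numClasses)
{
    int correct = 0;
    for (int i = 0; i < batchSize; ++i)
    {
        const float* first = scores.data() + i * numClasses;
        // The index of the highest score is the predicted class.
        int pred = static_cast<int>(std::max_element(first, first + numClasses) - first);
        if (pred == labels[i])
            ++correct;
    }
    return correct;
}

Accumulate the returned count over the whole validation set and divide by the number of images to get the TOP1 accuracy.

For the TRT API option, the relevant call is INetworkDefinition::addTopK. A sketch, assuming the parsed network has a single [N, numClasses] score output at index 0:

// Replace the raw score output with a top-1 index output.
nvinfer1::ITensor* scores = network->getOutput(0);
auto* topk = network->addTopK(*scores, nvinfer1::TopKOperation::kMAX,
                              1 /*k*/, 1u << 1 /*reduce over the class axis*/);
network->unmarkOutput(*scores);
network->markOutput(*topk->getOutput(1)); // output 1 holds the class indices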