I want to convert my ONNX model to a TRT engine using int8/"best" precision. My input format is fp16. So far I have been able to run trtexec with --inputIOFormats=fp16:chw and --fp16 and get correct predictions. I want to speed up inference using the "best" mode, but then the predictions are wrong. I read that int8 requires post-training calibration, and I found Python examples for that, but how can I also specify that the input is fp16? This is required for my application. Is there a tensorrt.BuilderFlag I can use?
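For reference, here is roughly what I am trying on the Python side. It is only a sketch (assuming a TensorRT 8.x-style Python API with pycuda; `EntropyCalibrator`, `model.onnx`, `calib.cache` and the calibration batches are placeholders for my own code). From what I understand, trtexec's "best" mode essentially enables FP16 and INT8 together, which is what I try to reproduce with the two builder flags:

```python
# Rough sketch: build an INT8+FP16 engine from ONNX while keeping the input
# binding in fp16, similar to:
#   trtexec --onnx=model.onnx --fp16 --int8 --inputIOFormats=fp16:chw
# EntropyCalibrator, model.onnx and calib.cache are placeholder names.
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = iter(batches)   # iterable of preprocessed, contiguous np arrays
        self.cache_file = cache_file
        self.device_input = None

    def get_batch_size(self):
        return 1  # must match the batch dimension of the arrays fed in

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches))
        except StopIteration:
            return None  # no more data -> calibration finishes
        if self.device_input is None:
            self.device_input = cuda.mem_alloc(batch.nbytes)
        cuda.memcpy_htod(self.device_input, batch)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


def build_engine(onnx_path, calibrator):
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # allow fp16 kernels
    config.set_flag(trt.BuilderFlag.INT8)  # allow int8 kernels (needs calibration)
    config.int8_calibrator = calibrator

    # Pin the input tensor to fp16 with a linear (CHW) layout; this is what I
    # believe corresponds to --inputIOFormats=fp16:chw on the API side.
    inp = network.get_input(0)
    inp.dtype = trt.DataType.HALF
    inp.allowed_formats = 1 << int(trt.TensorFormat.LINEAR)

    return builder.build_serialized_network(network, config)
```

The part I am unsure about is whether setting the input tensor's dtype and allowed_formats like this is the right replacement for --inputIOFormats=fp16:chw, or whether a dedicated builder flag is needed for the fp16 input.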
Hi there, I want to convert YOLOv8's pretrained weights to int8. Is that possible? If it is, can you guide me through the process?
P.S. I hope your issue is resolved soon.