I want to convert my ONNX model to a TRT engine using int8/"best" precision. My input format is fp16. So far I have been able to run trtexec with --inputIOFormats=fp16:chw and --fp16 and get correct predictions. I want to speed up inference using the "best" mode, but then the predictions are wrong. I read that int8 requires post-training calibration, and I found Python examples for that, but how can I also specify that the input is fp16? This is required for my application. Is there a tensorrt.BuilderFlag I can use?
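For reference, here is roughly what I am trying on the Python side. It is only a sketch (assuming a TensorRT 8.x-style Python API with pycuda; `EntropyCalibrator`, `model.onnx`, `calib.cache` and the calibration batches are placeholders for my own code). From what I understand, trtexec's "best" mode essentially enables FP16 and INT8 together, which is what I try to reproduce with the two builder flags:

```python
# Rough sketch: build an INT8+FP16 engine from ONNX while keeping the input
# binding in fp16, similar to:
#   trtexec --onnx=model.onnx --fp16 --int8 --inputIOFormats=fp16:chw
# EntropyCalibrator, model.onnx and calib.cache are placeholder names.
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = iter(batches)   # iterable of preprocessed, contiguous np arrays
        self.cache_file = cache_file
        self.device_input = None

    def get_batch_size(self):
        return 1  # must match the batch dimension of the arrays fed in

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches))
        except StopIteration:
            return None  # no more data -> calibration finishes
        if self.device_input is None:
            self.device_input = cuda.mem_alloc(batch.nbytes)
        cuda.memcpy_htod(self.device_input, batch)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


def build_engine(onnx_path, calibrator):
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # allow fp16 kernels
    config.set_flag(trt.BuilderFlag.INT8)  # allow int8 kernels (needs calibration)
    config.int8_calibrator = calibrator

    # Pin the input tensor to fp16 with a linear (CHW) layout; this is what I
    # believe corresponds to --inputIOFormats=fp16:chw on the API side.
    inp = network.get_input(0)
    inp.dtype = trt.DataType.HALF
    inp.allowed_formats = 1 << int(trt.TensorFormat.LINEAR)

    return builder.build_serialized_network(network, config)
```

The part I am unsure about is whether setting the input tensor's dtype and allowed_formats like this is the right replacement for --inputIOFormats=fp16:chw, or whether a dedicated builder flag is needed for the fp16 input.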
Hi there, I want to convert YOLOv8's pretrained weights to int8. Is that possible? If it is, can you guide me through the process?
P.S. I hope your issue is resolved soon.