YOLOv3 INT8 on TensorRT 7.1.0.16

Hi, I have converted the darknet YOLOv3 model to both an FP16 TensorRT engine and an INT8 TensorRT engine, using the COCO dataset as calibration data. However, the sizes of the two generated TRT models (yolov3-fp16.trt & yolov3-int8.trt) are very similar, and the inference speed is also about the same. I am on a Jetson NX2 with TensorRT 7.1.0.16 and CUDA 10.2.

Please find my code in the attached files.
calibrator.rtf (6.3 KB) onnx_to_tensorrt_int8.rtf (13.0 KB)
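
For reference, this is roughly the shape an entropy calibrator and INT8 build step take with the TensorRT 7.x Python API. It is a minimal sketch, not the attached code: the class name, cache file name, batch handling, and `build_int8_engine` helper are illustrative.

```python
import os
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

class COCOCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed COCO batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file="yolov3_calibration.cache"):
        super().__init__()
        self.batches = batches                      # list of np.float32 NCHW arrays
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                             # no more data -> calibration ends
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # A stale cache silently skips calibration, so delete it when the data changes.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

def build_int8_engine(onnx_path, calibrator, workspace=1 << 30):
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(explicit_batch) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser, \
         builder.create_builder_config() as config:
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                raise RuntimeError("failed to parse " + onnx_path)
        config.max_workspace_size = workspace
        config.set_flag(trt.BuilderFlag.INT8)       # without this flag the engine stays FP32/FP16
        config.int8_calibrator = calibrator
        return builder.build_engine(network, config)
```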

Hi,

Have you checked the performance of the FP16 and INT8 models?
Would you mind sharing the profiling results with us first?

Thanks.
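
For a quick wall-clock comparison, something along these lines can time both serialized engines (a sketch assuming static input shapes and the file names mentioned above; the `benchmark` helper is illustrative). `trtexec --loadEngine=<engine>.trt` reports similar timings from the command line.

```python
import time
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def benchmark(engine_path, iterations=200):
    runtime = trt.Runtime(TRT_LOGGER)
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Allocate one device buffer per binding (inputs and outputs).
    bindings = []
    for i in range(engine.num_bindings):
        dtype = trt.nptype(engine.get_binding_dtype(i))
        size = trt.volume(engine.get_binding_shape(i)) * np.dtype(dtype).itemsize
        bindings.append(int(cuda.mem_alloc(size)))

    # Warm up, then time synchronous executions.
    for _ in range(10):
        context.execute_v2(bindings)
    start = time.perf_counter()
    for _ in range(iterations):
        context.execute_v2(bindings)
    elapsed = (time.perf_counter() - start) / iterations
    print(f"{engine_path}: {elapsed * 1000:.2f} ms / inference")

benchmark("yolov3-fp16.trt")
benchmark("yolov3-int8.trt")
```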

Attached are the inference results of the FP16 and INT8 models. They both look reasonable to me.


Hi,

Please note that the serialized engine file contains TensorRT implementation details and metadata, so INT8 is not guaranteed to produce a smaller file than FP16.
However, you should see much lower memory usage and much better performance at runtime when using INT8.

Thanks.
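
One way to check whether INT8 kernels are actually being run is to attach a per-layer profiler to the execution context and compare layer timings between the two engines. This is a sketch using the `trt.IProfiler` interface; the `LayerTimer` class name is illustrative.

```python
import tensorrt as trt

class LayerTimer(trt.IProfiler):
    """Collects per-layer execution times reported by TensorRT."""

    def __init__(self):
        super().__init__()
        self.times = {}

    def report_layer_time(self, layer_name, ms):
        self.times[layer_name] = self.times.get(layer_name, 0.0) + ms

# Usage: attach before a synchronous execute_v2() call, e.g. inside the
# benchmark() loop above, then inspect the collected timings afterwards.
# profiler = LayerTimer()
# context.profiler = profiler
# context.execute_v2(bindings)
# for name, ms in sorted(profiler.times.items(), key=lambda kv: -kv[1])[:10]:
#     print(f"{ms:7.3f} ms  {name}")
```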