Why is INT8 not as fast as FP16?


Hey, I ran some tests on PyTorch and TensorRT with YOLOv4. Why is INT8 not as fast as FP16? These are my results:

| Model  | Precision | TensorRT speed (ms) | PyTorch speed (ms) | Max error    |
|--------|-----------|---------------------|--------------------|--------------|
| YOLOv4 | FP32      | 1.7709794           | 34.288859          | 0.0000018239 |
| YOLOv4 | FP16      | 1.5252711           | 34.288859          | 0.0045694825 |
| YOLOv4 | INT8      | 1.6355145           | 34.288859          | 0.6155709    |
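The original benchmarking script was not shared, so here is a minimal timing sketch of how per-inference latency in milliseconds is typically measured: warm-up iterations first, then an averaged timed loop. The `benchmark` helper and the stand-in workload are hypothetical, not the author's actual code; for GPU inference the callable should synchronize the device (e.g. `torch.cuda.synchronize()`) so asynchronous kernel launches are included in the measurement.

```python
import time


def benchmark(run_once, warmup=10, iters=100):
    """Return the average latency of run_once() in milliseconds.

    Warm-up runs are excluded so one-time costs (JIT, allocator,
    cache warm-up) do not skew the average. For GPU workloads,
    run_once should block until the device finishes.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters


# Hypothetical usage with a CPU stand-in for model inference:
latency_ms = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"{latency_ms:.4f} ms")
```

With a harness like this, the same input tensor can be fed to both the PyTorch model and each TensorRT engine, and the outputs compared elementwise to produce the max-error column above.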


TensorRT Version:
GPU Type: GTX1060ti
Nvidia Driver Version: 460.32.03
CUDA Version: 10.2
CUDNN Version: 8.0.2
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.5
PyTorch Version (if applicable): 1.4.0

Hi @997911043,

Could you please share more details and a script/model to reproduce this issue, so we can assist you better?

Thank you.