Why is INT8 not as fast as FP16?

997911043 · January 30, 2021, 1:43am

Description

Hey, I ran some tests on PyTorch and TensorRT with YOLOv4. Why is INT8 not as fast as FP16? These are my results.

model	precision	TensorRT time (ms)	PyTorch time (ms)	max error
YOLOv4	fp32	1.7709794	34.288859	0.0000018239
YOLOv4	fp16	1.5252711	34.288859	0.0045694825
YOLOv4	int8	1.6355145	34.288859	0.6155709
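
To put the numbers in relative terms, here is a minimal sketch that computes speedup ratios from the latencies reported above. The helper name `speedup` and the dictionary layout are only illustrative and are not part of the original test script.

```python
# Latencies in milliseconds, copied from the results table above.
tensorrt_ms = {"fp32": 1.7709794, "fp16": 1.5252711, "int8": 1.6355145}
pytorch_ms = 34.288859  # the same PyTorch timing is reported for every row


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Speedup of each TensorRT precision relative to TensorRT FP32.
speedup_vs_fp32 = {
    precision: speedup(tensorrt_ms["fp32"], ms)
    for precision, ms in tensorrt_ms.items()
}

# A value below 1.0 means INT8 is slower than FP16.
int8_vs_fp16 = speedup(tensorrt_ms["fp16"], tensorrt_ms["int8"])

for precision, ratio in speedup_vs_fp32.items():
    vs_pytorch = speedup(pytorch_ms, tensorrt_ms[precision])
    print(f"{precision}: {ratio:.3f}x vs TensorRT fp32, {vs_pytorch:.1f}x vs PyTorch")
print(f"int8 vs fp16: {int8_vs_fp16:.3f}x")
```

With these numbers, FP16 is roughly 16% faster than FP32, while INT8 is only about 8% faster, so INT8 ends up about 7% slower than FP16.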

Environment

TensorRT Version: 7.2.1.6
GPU Type: GTX1060ti
Nvidia Driver Version: 460.32.03
CUDA Version: 10.2
CUDNN Version: 8.0.2
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.5
PyTorch Version (if applicable): 1.4.0

spolisetty · February 1, 2021, 5:13am

Hi @997911043,

Could you please share more information, along with the scripts and model needed to reproduce the issue, so we can assist better?

Thank you.