YOLOv3 INT8 performance same as FP16

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): T4
• DeepStream Version: 5.0
• JetPack Version (valid for Jetson only)
• TensorRT Version : 7.1
• NVIDIA GPU Driver Version (valid for GPU only): 440.64.00

I ran YOLOv3 (416x416) + 3x ResNet18 (224x224) with DeepStream 5.0 on a T4.

I use the config files in /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo

The input source is sample_1080p_h264.mp4 from DeepStream.
I use it as the source for all 16 streams (MultiURI).

I found that the performance is almost the same with INT8 and FP16: 16 channels @ 10 fps each.

This surprised me, because I assumed INT8 would give significantly higher performance than FP16.

The config file is attached. 1080p_dec_yolov3_int8_3xresnet18_int8_tracker.txt (5.1 KB)

Could anyone let me know if I am missing something?

It's not a fair comparison to measure INT8 vs. FP16 performance inside a DeepStream pipeline. The pipeline contains many other plugins, and they also affect the overall throughput. You can use the TensorRT sample tool trtexec to profile INT8 and FP16 inference performance on its own.
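As a rough sketch of that trtexec comparison, the Python snippet below builds one command line per precision mode and prints it. It assumes the model is available as an ONNX file; the name yolov3.onnx is hypothetical and is not part of the DeepStream sample, which builds YOLOv3 from cfg/weights files. The flags used (--onnx, --fp16, --int8, --workspace) are standard trtexec options in TensorRT 7.x, and the workspace size is an arbitrary choice. Without a calibration cache, trtexec runs INT8 with placeholder dynamic ranges, which is fine for timing but says nothing about accuracy.

```python
import os
import shlex
import shutil
import subprocess


def trtexec_command(model_path, precision, workspace_mb=1024):
    """Build a trtexec command line for one precision mode ("fp16" or "int8")."""
    if precision not in ("fp16", "int8"):
        raise ValueError("precision must be 'fp16' or 'int8'")
    return [
        "trtexec",
        f"--onnx={model_path}",         # assumed: model exported to ONNX
        f"--{precision}",               # enables FP16 or INT8 kernels
        f"--workspace={workspace_mb}",  # builder workspace in MB (arbitrary value)
    ]


def main():
    model = "yolov3.onnx"  # hypothetical file name, adjust to your model
    for precision in ("fp16", "int8"):
        cmd = trtexec_command(model, precision)
        print(" ".join(shlex.quote(arg) for arg in cmd))
        # Only run when trtexec and the model are actually present.
        if shutil.which("trtexec") and os.path.exists(model):
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    main()
```

Comparing the throughput and latency that trtexec reports for the two runs shows the inference-only difference between the precisions, without decode, stream muxing, tracking, or the secondary classifiers in the way.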

You can also refer to this for T4 performance when running the DeepStream sample,