Running DeepStream Python app test1, performance is very similar when using INT8 and FP32

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) 1080Ti
• DeepStream Version 5.0
• JetPack Version (valid for Jetson only) No
• TensorRT Version 7.0
• NVIDIA GPU Driver Version (valid for GPU only) 450
• Issue Type( questions, new requirements, bugs) question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

I’m trying the DeepStream Python apps, specifically the deepstream_test1 demo.
When I set network-mode to 0 and to 1 as follows:
network-mode=0 ##0=FP32, 1=INT8, 2=FP16 mode
network-mode=1 ##0=FP32, 1=INT8, 2=FP16 mode
The performance is very similar in terms of GPU usage/GPU load, and the FPS is almost the same when running the same video through the pipeline (FPS is counted as total_frames / total_time; a sketch of how I count it is below).
when running with network-mode=0:

when running with network-mode=1:


The FPS is the same in both cases.
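For reference, this is roughly how I count the FPS. It is only a sketch for this post (the probe and variable names are my own, not part of the sample): a buffer probe on the nvdsosd sink pad counts frames, and FPS is total_frames / total_time when the pipeline finishes.

import time
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

frame_count = 0
start_time = None

def fps_probe(pad, info, u_data):
    # count every buffer that reaches the OSD sink pad
    global frame_count, start_time
    if start_time is None:
        start_time = time.time()
    frame_count += 1
    return Gst.PadProbeReturn.OK

# osdsinkpad is the sink pad of the nvdsosd element in the test1 pipeline:
# osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, fps_probe, 0)
# after the pipeline reaches EOS:
# print("FPS: %.2f" % (frame_count / (time.time() - start_time)))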

The other config settings:
labelfile-path=../../../../samples/models/Primary_Detector/labels.txt
int8-calib-file=../../../../samples/models/Primary_Detector/cal_trt.bin
force-implicit-batch-dim=1
batch-size=1
network-mode=1 ##0=FP32, 1=INT8, 2=FP16 mode
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
#scaling-filter=0
#scaling-compute-hw=0
[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.2
group-threshold=1
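For context, this is the config file that deepstream_test1.py passes to the primary nvinfer element; the relevant lines in the sample look roughly like this (element and file names as in the sample app):

pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
pgie.set_property('config-file-path', "dstest1_pgie_config.txt")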

Is this the expected performance, or is something wrong on my side?

Any ideas?

It may be due to the FP32 and INT8 kernel implementations. You may find more about this in the TensorRT documentation.
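If you want to separate pure TensorRT inference throughput from the rest of the pipeline (decoding, OSD, rendering), you could also benchmark the serialized engines that nvinfer generates directly with trtexec. A rough sketch, where the engine file names are assumptions; check your Primary_Detector model directory for the names nvinfer actually wrote:

import subprocess

# Engine files are serialized by nvinfer next to the model; the names below
# follow the usual <model>_b<batch>_gpu<id>_<precision>.engine pattern and
# are only an assumption for this example.
for engine in ["resnet10.caffemodel_b1_gpu0_fp32.engine",
               "resnet10.caffemodel_b1_gpu0_int8.engine"]:
    # trtexec loads the engine and reports throughput/latency
    subprocess.run(["trtexec", "--loadEngine=" + engine], check=True)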