Unexpected FPS drop with back-to-back detector concept in deepstream-app

sharma.rahul98912 · June 8, 2021, 6:51am

• Hardware Platform Nvidia Tesla T4
• DeepStream Version 5.1

• TensorRT Version 7.2.2.3
• NVIDIA GPU Driver Version (valid for GPU only) 460.32…03
• Issue Type( questions, new requirements, bugs) question

Hi,
I am using a back-to-back detector setup with 4 video streams.

These are the FPS observations:
1) Yolo_tinyv4-based object detector as the only primary detector (no secondary GIE).
Performance:

**PERF: 155.15 (154.35)	155.15 (154.35)	155.15 (154.35)	155.15 (154.35)
**PERF: 158.66 (156.92)	158.66 (156.92)	158.66 (156.92)	158.66 (156.92)
**PERF: 157.15 (156.91)	157.15 (156.91)	157.15 (156.91)	157.15 (156.91)
**PERF: 157.98 (157.19)	157.98 (157.19)	157.98 (157.19)	157.98 (157.19)
**PERF: 165.82 (159.00)	165.82 (159.00)	165.82 (159.00)	165.82 (159.00)

2) Centerface as the only primary detector (no secondary GIE).
Performance:

**PERF: 190.04 (189.98)	190.04 (189.98)	190.04 (189.98)	190.04 (189.98)
**PERF: 205.32 (198.29)	205.32 (198.29)	205.32 (198.29)	205.32 (198.29)
**PERF: 205.48 (201.02)	205.48 (201.02)	205.48 (201.02)	205.48 (201.02)
**PERF: 206.76 (202.28)	206.76 (202.28)	206.76 (202.28)	206.76 (202.28)
**PERF: 203.99 (202.84)	203.99 (202.84)	203.99 (202.84)	203.99 (202.84)

3) Yolo_tiny as the primary detector and Centerface as the secondary GIE.
Performance:

**PERF: 27.02 (26.32)	26.40 (25.76)	26.40 (25.76)	26.40 (25.76)
**PERF: 29.18 (27.90)	29.18 (27.61)	29.18 (27.61)	29.18 (27.61)
**PERF: 29.30 (28.68)	29.30 (28.48)	29.30 (28.48)	29.30 (28.48)
**PERF: 33.12 (29.85)	33.12 (29.68)	33.12 (29.68)	33.12 (29.68)

The detections are correct, but I am not sure why the FPS dropped so significantly.

Here are the primary GIE and secondary GIE groups:

[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary.txt

[secondary-gie]
enable=1
gpu-id=0
model-engine-file=model/centerface.onnx_b4_gpu0_fp16.engine
batch-size=4
interval=0
gie-unique-id=2
nvbuf-memory-type=0
config-file=config_infer_primary_centerface.txt
operate-on-gie-id=1

Is anything missing in the configuration, or is this expected with a back-to-back detector?
Also, my understanding was that the secondary detector runs on the primary GIE's detections, not on the full frame. Is that the reason for the FPS drop?
Thanks.

Fiona.Chen · June 8, 2021, 7:10am

Can you upload the nvinfer config files of the two models?

sharma.rahul98912 · June 8, 2021, 7:42am

For yolo_tiny:
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=model/yolov4-tiny.cfg
model-file=model/yolov4-tiny.weights
model-engine-file=model/model_b4_gpu0_fp16.engine
labelfile-path=labels.txt
batch-size=4
network-mode=2
num-detected-classes=80
interval=0
gie-unique-id=1
#process-mode=1
network-type=0
cluster-mode=4
maintain-aspect-ratio=0
parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
pre-cluster-threshold=0.25

For centerface:
[property]
gpu-id=0
#net-scale-factor=0.0039215697906911373
#net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=0
onnx-file=model/centerface.onnx
batch-size=4
network-mode=2
num-detected-classes=1
gie-unique-id=2
network-type=0
#output_tensor_meta=0
cluster-mode=0
#maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=nvdsinfer_custom_impl_centerface/libnvdsinfer_custom_impl_centerface.so
#scaling-filter=0
#scaling-compute-hw=0
#labelfile-path=labels_yolo.txt
labelfile-path=centerface_labels.txt
#process-mode=0

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4

Fiona.Chen · June 8, 2021, 9:10am

This is not back-to-back. Back-to-back needs PGIE+SGIE. You use two PGIEs.

Are you using deepstream-app? What is the deepstream-app configuration?

sharma.rahul98912 · June 8, 2021, 9:18am

Hi,
I don’t understand why this is not PGIE+SGIE.
deepstream-app_config.txt contains PGIE and SGIE.
And there are 2 different nvinfer configs for PGIE and SGIE respectively.

NOTE-
This is not the back-to-back detector sample but deepstream-app with PGIE+SGIE. It follows the back-to-back detector concept. Sorry if my title confused you. I will update the title.

Please find the attached config files.
config-files.zip (2.9 KB)

Fiona.Chen · June 8, 2021, 9:42am

I’ve checked the config files; they are all PGIEs. Only “process-mode=2” indicates an SGIE, and it does not appear anywhere in your configurations.
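A quick way to see which mode an nvinfer config file declares is to read its [property] group. The following is a minimal sketch, assuming the file is plain INI as in the configs posted above; the function name `declared_process_mode` is hypothetical and not part of DeepStream. Commented-out keys such as `#process-mode=0` are treated as absent.

```python
import configparser


def declared_process_mode(config_text):
    """Return the integer process-mode set in [property], or None if it is not set."""
    # interpolation=None keeps '%' in values literal; strict=False tolerates duplicate keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    if not parser.has_section("property"):
        return None
    value = parser.get("property", "process-mode", fallback=None)
    return int(value) if value is not None else None


# Shortened stand-in for config_infer_primary_centerface.txt as posted above.
centerface_cfg = """\
[property]
gpu-id=0
batch-size=4
gie-unique-id=2
#process-mode=0
"""

print(declared_process_mode(centerface_cfg))                        # key is commented out
print(declared_process_mode(centerface_cfg + "process-mode=2\n"))   # key is set
```

A result of `None` only means the file itself does not set the key; what the application then applies for that group is not something this check can tell.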
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#id2

sharma.rahul98912 · June 8, 2021, 9:49am

OK.
So what should I do to make deepstream-app work as PGIE+SGIE? If “process-mode” is the only criterion, I have already tried the following:

I put “process-mode=2” in [secondary-gie] in deepstream_app_config.txt.
It reported that “process-mode” is not recognised.

I tried both “process-mode=1” and “process-mode=2” in [property] in config_infer_primary_centerface.txt.
The FPS output stayed the same.

So I am not sure what I should configure, and where.

Fiona.Chen · June 8, 2021, 9:59am

Please config “process-model=2” in config_infer_primary_centerface.txt.

In PGIE + SGIE mode, the SGIE relies on the output from the PGIE. FPS is determined by the performance of the whole pipeline, not simply by the individual PGIE and SGIE speeds.

If you want to find the bottleneck of the pipeline, you can measure the per-component latency as described in the DeepStream SDK FAQ on the NVIDIA Developer Forums.

sharma.rahul98912 · June 8, 2021, 9:59am

Also, to sum things up:

If this is working as 2 PGIEs, do you think the FPS reported at the top is correct?
That is, when I run the models independently the FPS is above 150, but when I run them as 2 PGIEs the FPS drops to 30?

Fiona.Chen · June 8, 2021, 10:01am

It is hard to say whether it is correct or not. The application works asynchronously, so the relationship is not simply linear.
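As a rough reference point only, the standalone numbers from the first post can be combined under the simplest possible model: two stages run strictly one after another, so their per-frame times add. This is a sketch under that assumption, not a description of how deepstream-app schedules work; the function name `serial_fps` is hypothetical, and the inputs are the running averages from the last PERF line of each run above.

```python
def serial_fps(*stage_fps):
    """Throughput if the stages run strictly in sequence, so per-frame times add up."""
    return 1.0 / sum(1.0 / fps for fps in stage_fps)


yolo_only_fps = 159.00        # average in the last PERF line, YOLO as sole detector
centerface_only_fps = 202.84  # average in the last PERF line, Centerface as sole detector
observed_fps = 29.85          # average in the last PERF line, both detectors together

estimate = serial_fps(yolo_only_fps, centerface_only_fps)
print(f"serial estimate: {estimate:.1f} FPS, observed: {observed_fps:.1f} FPS")
```

Even this pessimistic model lands at roughly 89 FPS, well above the observed value of about 30, so the drop is not explained by adding the two standalone costs. One candidate explanation, raised in the first post, is that the second detector's work scales with the number of objects the first one finds rather than with the number of frames.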

sharma.rahul98912 · June 8, 2021, 10:03am

Correct me if I am wrong:
it is process-mode=2, right? Not process-model.

I have also tested with this. The output FPS is still 30.

Fiona.Chen · June 8, 2021, 10:04am

Yes. It is “process-mode”

sharma.rahul98912 · June 8, 2021, 10:08am

Ok.

So, to conclude, the pipeline runs like this:
prepare image -> do primary inference -> do tracking -> prepare image for secondary GIE -> do inference -> mux both -> show.
If so, then I can see why the FPS may drop.

If the stages ran concurrently, the secondary detector would add only about 2~3 ms of delay, which would not have much effect.

I tried:

export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
export NVDS_ENABLE_LATENCY_MEASUREMENT=1

but the log is:
Batch meta not found for buffer 0x7f33ec0655a0
BATCH-NUM = 340**
Batch meta not found for buffer 0x7f33d80393f0
BATCH-NUM = 341**
Batch meta not found for buffer 0x7f33d80391d0
BATCH-NUM = 342**
Batch meta not found for buffer 0x7f33ec0655a0
…

sharma.rahul98912 · June 8, 2021, 10:18am

OK, I am able to see the latency output now.
The logs are:

BATCH-NUM = 124**
Comp name = nvv4l2decoder3 in_system_timestamp = 1623147424852.312012 out_system_timestamp = 1623147425008.355957 component latency= 156.043945
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 124 in_system_timestamp = 1623147425008.459961 out_system_timestamp = 1623147425033.969971 component_latency = 25.510010
Comp name = nvv4l2decoder1 in_system_timestamp = 1623147424852.294922 out_system_timestamp = 1623147425008.497070 component latency= 156.202148
Comp name = src_bin_muxer source_id = 1 pad_index = 1 frame_num = 124 in_system_timestamp = 1623147425008.537109 out_system_timestamp = 1623147425033.969971 component_latency = 25.432861
Comp name = nvv4l2decoder0 in_system_timestamp = 1623147424852.372070 out_system_timestamp = 1623147425008.332031 component latency= 155.959961
Comp name = src_bin_muxer source_id = 2 pad_index = 2 frame_num = 124 in_system_timestamp = 1623147425008.378906 out_system_timestamp = 1623147425033.969971 component_latency = 25.591064
Comp name = nvv4l2decoder2 in_system_timestamp = 1623147424852.584961 out_system_timestamp = 1623147425008.678955 component latency= 156.093994
Comp name = src_bin_muxer source_id = 3 pad_index = 3 frame_num = 124 in_system_timestamp = 1623147425008.724121 out_system_timestamp = 1623147425033.970947 component_latency = 25.246826
Comp name = primary_gie in_system_timestamp = 1623147425034.010010 out_system_timestamp = 1623147425050.318115 component latency= 16.308105
Comp name = secondary_gie_0 in_system_timestamp = 1623147425089.768066 out_system_timestamp = 1623147425155.408936 component latency= 65.640869
Comp name = tiled_display_tiler in_system_timestamp = 1623147425155.537109 out_system_timestamp = 1623147425173.820068 component latency= 18.282959
Comp name = osd_conv in_system_timestamp = 1623147425173.997070 out_system_timestamp = 1623147425175.420898 component latency= 1.423828
Comp name = nvosd0 in_system_timestamp = 1623147425175.499023 out_system_timestamp = 1623147425179.126953 component latency= 3.627930
Source id = 0 Frame_num = 124 Frame latency = 326.924072 (ms)
Source id = 1 Frame_num = 124 Frame latency = 326.941162 (ms)
Source id = 2 Frame_num = 124 Frame latency = 326.864014 (ms)
Source id = 3 Frame_num = 124 Frame latency = 326.651123 (ms)
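A dump like the one above can be ranked with a short script. This is a minimal sketch, assuming the line format shown in this thread (both the `component latency=` and `component_latency =` spellings occur); the function names are hypothetical, and the sample lines are copied from the log for batch 124.

```python
import re

# Accepts both "component latency= X" and "component_latency = X".
LINE_RE = re.compile(
    r"Comp name = (?P<name>\S+).*?"
    r"in_system_timestamp = (?P<t_in>[\d.]+)\s+"
    r"out_system_timestamp = (?P<t_out>[\d.]+)\s+"
    r"component[ _]latency\s*=\s*(?P<latency>[\d.]+)"
)


def parse_latency_log(text):
    """Return one record per 'Comp name = ...' line, in log order."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            records.append({
                "name": match.group("name"),
                "t_in": float(match.group("t_in")),
                "t_out": float(match.group("t_out")),
                "latency_ms": float(match.group("latency")),
            })
    return records


def rank_by_latency(records):
    """Sort records from the largest to the smallest component latency."""
    return sorted(records, key=lambda r: r["latency_ms"], reverse=True)


def gap_ms(records, upstream, downstream):
    """Time between one component's output and another component's input."""
    by_name = {r["name"]: r for r in records}
    return by_name[downstream]["t_in"] - by_name[upstream]["t_out"]


SAMPLE = """\
Comp name = nvv4l2decoder3 in_system_timestamp = 1623147424852.312012 out_system_timestamp = 1623147425008.355957 component latency= 156.043945
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 124 in_system_timestamp = 1623147425008.459961 out_system_timestamp = 1623147425033.969971 component_latency = 25.510010
Comp name = primary_gie in_system_timestamp = 1623147425034.010010 out_system_timestamp = 1623147425050.318115 component latency= 16.308105
Comp name = secondary_gie_0 in_system_timestamp = 1623147425089.768066 out_system_timestamp = 1623147425155.408936 component latency= 65.640869
"""

records = parse_latency_log(SAMPLE)
for rec in rank_by_latency(records):
    print(f"{rec['name']:<18} {rec['latency_ms']:9.3f} ms")
print(f"primary_gie -> secondary_gie_0 gap: {gap_ms(records, 'primary_gie', 'secondary_gie_0'):.2f} ms")
```

On these lines the decoder entry is the largest, followed by secondary_gie_0. The timestamps also show a gap of about 39 ms between the primary_gie output and the secondary_gie_0 input that is not attributed to any component listed in the log.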

Fiona.Chen · June 9, 2021, 2:38am

It seems the buffers spend most of their time in the nvv4l2decoder component. You may try the method in the Troubleshooting section of the DeepStream 6.1.1 Release documentation to improve the performance.
