YOLOv2 performance issue based on streammux output size

Hardware Platform (Jetson / GPU) - GT 1030 GPU, Ubuntu 18.04.4 LTS

DeepStream Version - DeepStream 5.0

TensorRT Version - 7.2.2.3

CUDA - 11.2

NVIDIA GPU Driver Version (valid for GPU only) - 460.32.03

Issue Type (questions, new requirements, bugs) -
The performance of the YOLOv2 model in DeepStream changes when the streammux output size changes. We ran a series of tests to confirm this: the model performs better when the streammux output is set to a square resolution than with a rectangular or default resolution (1920x1080 or 1280x720). With the default resolution the model still detects objects, but after changing to a square resolution it detected additional objects that were missed before. Can you explain the reason for this, and what can be done to overcome this performance issue without sacrificing the resolution of the streammux output?
I have attached the config files required to run the DeepStream application, as well as our results.

Deepstream YOLOv2 Config File:
deepstream_app_config_yoloV2.txt (3.6 KB)

YOLOv2 Config File:
config_infer_primary_yoloV2.txt (3.4 KB)

The difference in Results:
Square Resolution:

Rectangle (Default) Resolution:

As you can see, with the square resolution the model was also able to detect cars in the far lane.

How to reproduce the issue? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

Follow the steps from the README file at
"/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/"
and run the application for YOLOv2.
README file:
README (5.2 KB)
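For reference, the README builds the custom YOLO library and runs the reference app roughly as sketched below. Treat this as an outline only (the CUDA version matches this setup; the README is authoritative for the exact steps):

# From /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/
# Download the YOLO cfg/weights files referenced by the configs
./prebuild.sh
# Build the custom bbox parser / engine-builder library
export CUDA_VER=11.2
make -C nvdsinfer_custom_impl_Yolo
# Run the reference app with the YOLOv2 pipeline config
deepstream-app -c deepstream_app_config_yoloV2.txt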

This application uses the default YOLOv2 weights and config provided by DeepStream.
You can change the streammux resolution in deepstream_app_config_yoloV2.txt between a square and a rectangular resolution and observe the results; an illustrative snippet is shown below.
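The relevant group in deepstream_app_config_yoloV2.txt looks roughly like the sketch below. The square value (1216x1216) is only an illustrative example, not necessarily the resolution used in the tests above:

[streammux]
gpu-id=0
batch-size=1
## Default (rectangular) output resolution
width=1920
height=1080
## Example square output resolution (illustrative value)
#width=1216
#height=1216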

Hi @amrith.c,
Could you try the change below, which disables “maintain-aspect-ratio=1”?

diff --git a/config_infer_primary_yoloV2.txt b/config_infer_primary_yoloV2.txt
index 5c96bfc..06765d9 100644
--- a/config_infer_primary_yoloV2.txt
+++ b/config_infer_primary_yoloV2.txt
@@ -74,7 +74,7 @@ network-type=0
 is-classifier=0
 ## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
 cluster-mode=2
-maintain-aspect-ratio=1
+#maintain-aspect-ratio=1
 parse-bbox-func-name=NvDsInferParseCustomYoloV2
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet

Thank you, this is working for now. I have another case.
If I have two input videos with different resolutions (and aspect ratios), 640x480 and 1280x720, what is the best streammux resolution to use to get the best inference results? Should I keep “maintain-aspect-ratio=0” here too, or should I make any other changes?

I think that depends on how your model was trained.
Inference should use the same pre-processing as training, e.g. if the model was trained with aspect-ratio-preserving resizing, then enabling maintain-aspect-ratio at inference should give better accuracy.
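As a sketch, that comes down to the corresponding pre-processing key in config_infer_primary_yoloV2.txt; which line applies is an assumption about your training pipeline, not a recommendation for a specific value:

# If the model was trained with an aspect-ratio-preserving (letterbox) resize:
maintain-aspect-ratio=1
# If it was trained with a plain stretch resize to the network input size,
# leave it disabled, as in the change above:
#maintain-aspect-ratio=1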

Doubts cleared. Thanks for the reply.