Cannot run non-square YOLO network using objectDetector_Yolo on Jetson Nano (TensorRT nvinfer1::IPoolingLayer has a bug)

I have tested Tiny-YOLOv3 on Jetson Nano using DeepStream and it works fine.

Now I have a modified YOLO model whose input is not square (in my scenario a non-square grid size gives better results). When I tried to load this model in DeepStream to convert it to TensorRT, it produced errors because the sample provided only supports square input.

So I made changes to the nvdsinfer_custom_impl_Yolo code inside objectDetector_Yolo and was able to load and run the network.
The changes included removing some size-check asserts, computing and using the grid size and stride separately in the H and W dimensions, and a few others.

But the network produced incorrect output, and in some cases no output at all.
What I noticed was that the network info printed by Darknet and by nvdsinfer differed: after the first maxpool, the output dimensions printed by nvdsinfer were equal in size, i.e. square (which should not happen with a non-square input).

Darknet output: https://imgur.com/a/evM8ABl

NvdsInfer output: https://imgur.com/a/H41W1a8

I tried debugging and found that the dimensions returned for the nvinfer1::IPoolingLayer in netAddMaxpool in trt_utils.cpp are wrong (the output dimensions are equal in size even though the input is asymmetric).

My config file:

[net]

# Testing
# batch=1
# subdivisions=1

# Training
batch=64
subdivisions=8

height=128
width=288
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
#burn_in=1200
max_batches = 25000
policy=steps
steps=2000,6000,9000,22000
scales=.1,.1,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=96
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=160
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=64
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=192
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=12
activation=linear

[yolo]
iou_loss= giou
mask = 0,1
anchors = 33,35,  22,53
classes=1
num=2
jitter=.3

random=0

Hi,

Could you share a simple reproducible source with us so we can check this issue in our environment?

Thanks.

Hi AastaLLL

Please open the link below to download the modified files of nvdsinfer_custom_impl_Yolo:
https://dropmefiles.com/FPggu

Hi AastaLLL

Did you check the sources I uploaded?

Hi,

Sorry for the late update.

We can compile your source in our environment.
Would you mind sharing your customized model with us so we can reproduce the issue directly?

Thanks.

Hi AastaLLL,

link to model cfg file: https://dropmefiles.com/mwWOT

link to model weights file: https://dropmefiles.com/Lu108

darknet output: https://imgur.com/a/evM8ABl
NvdsInfer output: https://imgur.com/a/H41W1a8

Hi,

We cannot download the weights file shared above.
Could you help us by verifying the link?

Thanks.

Hi AastaLLL

Here is an updated link to the model weights file: https://drive.google.com/file/d/1XY4uQb1ycQroWruoXkP3HO3yTq1neXeC/view?usp=sharing

Hi,

Could you also share the DeepStream pipeline and model configuration with us?
We tried to reproduce this issue on our side but keep hitting this error:

ERROR: yoloV3 output layer.size: 1 does not match mask.size: 3
0:00:07.048731940 29715     0x37eda720 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:fillDetectionOutput(): Failed to parse bboxes using custom parse function
Segmentation fault (core dumped)

Thanks.

Hi AastaLLL,

I am attaching a zip containing the model config, model weights, and pipeline files: https://dropmefiles.com/omoPl

Please let me know if you require any other information.

Hi,

Thanks, we can reproduce this issue now.
Will get back to you asap.

Hi AastaLLL

What’s the status of this issue?

Hi,

Sorry for keeping you waiting.
We still need more time to check this issue.

Thanks.

Hi AastaLLL

Any updates?

Hi,

Sorry to keep you waiting.

We are still checking this issue, since it might be related to the TensorRT implementation rather than just DeepStream.
Thanks for your patience.

Hi,

The root cause is that there is a hand-written function that decides the output dimensions of the pooling layer.
Non-square handling needs to be applied in this function as well.

--- a/nvdsinfer_custom_impl_Yolo/trt_utils.h
+++ b/nvdsinfer_custom_impl_Yolo/trt_utils.h
@@ -51,18 +51,21 @@ private:
         //assert(stride.d[0] == stride.d[1]);
         //assert(padding.d[0] == padding.d[1]);
 
-        int outputDim;
+        int outputDimH;
+        int outputDimW;
         // Only layer maxpool_12 makes use of same padding
         if (m_SamePaddingLayers.find(layerName) != m_SamePaddingLayers.end())
         {
-            outputDim = (inputDims.d[0] + 2 * padding.d[0]) / stride.d[0];
+            outputDimH = (inputDims.d[0] + 2 * padding.d[0]) / stride.d[0];
+            outputDimW = (inputDims.d[1] + 2 * padding.d[1]) / stride.d[1];
         }
         // Valid Padding
         else
         {
-            outputDim = (inputDims.d[0] - kernelSize.d[0]) / stride.d[0] + 1;
+            outputDimH = (inputDims.d[0] - kernelSize.d[0]) / stride.d[0] + 1;
+            outputDimW = (inputDims.d[1] - kernelSize.d[1]) / stride.d[1] + 1;
         }
-        return nvinfer1::DimsHW{outputDim, outputDim};
+        return nvinfer1::DimsHW{outputDimH, outputDimW};
     }

We also found a user who implemented the non-square YOLO change, which may give you some additional information.
https://devtalk.nvidia.com/default/topic/1066725/deepstream-sdk/trouble-in-converting-non-square-grid-in-yolo-network-to-tensorrt-via-deepstream/post/5415091/#5415091

Thanks.