Back-to-back detector with DeepStream 5.0

I am currently working on a project in which cars are detected first and then their licence plates. My previous back-to-back detection worked fine on a Jetson Nano with DeepStream SDK 4.0, but so far it has not worked on DeepStream 5.0. The car detection and the licence plate detection each work without problems on DeepStream 5.0 when run standalone (see figure 1 for the licence plate detection).

Two back-to-back solutions were tested. The first was based on the deepstream-test2 example of DeepStream 5.0, with the sgie being a detector instead of a classifier; its bounding boxes are shown in figure 2. The second was implemented with the deepstream-test5 example of DeepStream 5.0; the result is in figure 3.

The results were very similar: licence plates tended to appear on the left side of the cropped frames. It looked as if the frames cropped after the first detector were passed to the second detector rotated by 90 degrees. Maybe the width and the height of the cropped frame are swapped somewhere inside DeepStream 5.0.

Fig. 1 Licence plate detection with deepstream 5.0:
figure1

Fig. 2 Back-to-back detection based on the deepstream-test2 example of DeepStream 5.0:
figure2

Fig. 3 Back-to-back detection based on the deepstream-test5 example of DeepStream 5.0:
figure3

The following are the config files of the back-to-back detection based on the deepstream-test5 example of DeepStream 5.0:

test5_config_file_src_infer


[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=1
rows=1
columns=1
width=960
height=540
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:///media/microsd/streams/traffic2.mp4
num-sources=1
gpu-id=0
nvbuf-memory-type=0

[source1]
enable=0
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:///media/microsd/streams/traffic2.mp4
num-sources=2
gpu-id=0
nvbuf-memory-type=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=0
nvbuf-memory-type=0

[sink1]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvoverlaysink 6=MsgConvBroker
type=6
msg-conv-config=dstest5_msgconv_sample_config.txt
msg-conv-payload-type=0
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_amqp_proto.so
#Optional:
msg-broker-config=/home/dlinano/mvp_its/deepstream5/deepstream-test5/configs/cfg_amqp.txt

[osd]
enable=1
gpu-id=0
display-text=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode = 2
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;1;1;1
bbox-border-color3=0;1;0;1
nvbuf-memory-type=0
interval=0
gie-unique-id=1

config-file=/home/dlinano/mvp_its/deepstream5/deepstream-test5/configs/config_primary_detector.txt

[tracker]
enable=0
tracker-width=640
tracker-height=352
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so
#enable-batch-process applicable to DCF only
enable-batch-process=0

[secondary-gie0]
enable=1
gpu-id=0
network-type = 0
batch-size=16
gie-unique-id=2
operate-on-gie-id=1
#operate-on-class-ids=0;
config-file=/home/dlinano/mvp_its/deepstream5/deepstream-test5/configs/config_secondary_detector.txt

[tests]
file-loop=0

config_primary_detector


[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-engine-file=/home/dlinano/mvp_its/models/cars_4types_p2.trt
labelfile-path=/home/dlinano/mvp_its/models/car_labels.txt
force-implicit-batch-dim=1
batch-size=1
process-mode=1
model-color-format=0
network-mode=2
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd

## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=1

## Per class configuration
[class-attrs-0]
pre-cluster-threshold=0.1
eps=0.6
minBoxes=3
detected-min-w=150
detected-min-h=150
roi-top-offset=0
roi-bottom-offset=10

[class-attrs-1]
pre-cluster-threshold=0.3
eps=0.6
minBoxes=3
detected-min-w=250
detected-min-h=250
roi-top-offset=0
roi-bottom-offset=10

[class-attrs-2]
pre-cluster-threshold=0.25
eps=0.6
minBoxes=3
detected-min-w=250
detected-min-h=250
roi-top-offset=0
roi-bottom-offset=10

[class-attrs-3]
pre-cluster-threshold=0.15
eps=0.6
minBoxes=3
detected-min-w=250
detected-min-h=250
roi-top-offset=0
roi-bottom-offset=10

config_secondary_detector


[property]
gpu-id=0
batch-size=16
gie-unique-id=2
operate-on-gie-id=1
net-scale-factor=1
model-engine-file=/home/dlinano/mvp_its/models/plate.trt
labelfile-path=/home/dlinano/mvp_its/models/plate_labels.txt

process-mode=2
#model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
interval=0

output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
force-implicit-batch-dim=1
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=1
network-type = 0

[class-attrs-all]
pre-cluster-threshold=0.03
group-threshold=1
eps=0.5
minBoxes=3
detected-min-w=0
detected-min-h=0
#detected-max-w=0
#detected-max-h=0

Please share your thoughts on the issue.
It would be nice to get replies to the following questions.

  1. Is back-to-back detection supported in DeepStream 5.0 (i.e. both pgie and sgie are detectors)?
  2. If it is not supported, when do you expect to implement back-to-back detection in DeepStream 5, given the many use cases that rely on this feature?
  3. If it is supported, can the deepstream-test5 example be used for back-to-back detection, or do I need to build it from multiple instances of the "nvinfer" element as in deepstream-test2?
  4. If the deepstream-test5 example supports back-to-back detection, what is wrong in my config files above?

Thank you in advance

I tried the back-to-back sample at https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/blob/master/back-to-back-detectors/README.md with the change below for DeepStream 5.0, and it works.
The change to back_to_back_detectors.c dumps the output to an h264 file on the Jetson platform.
You could refer to this sample for your back-to-back implementation.

diff --git a/back-to-back-detectors/Makefile b/back-to-back-detectors/Makefile
index 7171e3b..748d9cb 100644
--- a/back-to-back-detectors/Makefile
+++ b/back-to-back-detectors/Makefile
@@ -24,7 +24,7 @@ APP:= back-to-back-detectors

 TARGET_DEVICE = $(shell gcc -dumpmachine | cut -f1 -d -)

-NVDS_VERSION:=4.0
+NVDS_VERSION:=5.0

 LIB_INSTALL_DIR?=/opt/nvidia/deepstream/deepstream-$(NVDS_VERSION)/lib/

diff --git a/back-to-back-detectors/back_to_back_detectors.c b/back-to-back-detectors/back_to_back_detectors.c
index 302b55b..3775adc 100644
--- a/back-to-back-detectors/back_to_back_detectors.c
+++ b/back-to-back-detectors/back_to_back_detectors.c
@@ -237,10 +237,15 @@ main (int argc, char *argv[])
 #ifdef PLATFORM_TEGRA
   transform = gst_element_factory_make ("nvegltransform", "nvegl-transform");
 #endif
-  sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");
+  GstElement *parser1 = gst_element_factory_make ("h264parse", "h264-parser1");
+  GstElement *enc = gst_element_factory_make ("nvv4l2h264enc", "h264-enc");
+  GstElement *nvvidconv1 = gst_element_factory_make ("nvvideoconvert", "nvvideo-converter1");
+  //sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");
+  sink = gst_element_factory_make ("filesink", "file-sink");
+  g_object_set (G_OBJECT (sink), "location", "./out.h264", NULL);

   if (!source || !h264parser || !decoder || !primary_detector || !secondary_detector
-      || !nvvidconv || !nvosd || !sink) {
+      || !nvvidconv || !nvosd || !enc || !sink) {
     g_printerr ("One element could not be created. Exiting.\n");
     return -1;
   }
@@ -279,11 +284,11 @@ main (int argc, char *argv[])
 #ifdef PLATFORM_TEGRA
   gst_bin_add_many (GST_BIN (pipeline),
       source, h264parser, decoder, streammux, primary_detector, secondary_detector,
-      nvvidconv, nvosd, transform, sink, NULL);
+      nvvidconv, nvosd, nvvidconv1, enc, parser1, sink, NULL);
 #else
   gst_bin_add_many (GST_BIN (pipeline),
       source, h264parser, decoder, streammux, primary_detector, secondary_detector,
-      nvvidconv, nvosd, sink, NULL);
+      nvvidconv, nvosd, enc, sink, NULL);
 #endif

   GstPad *sinkpad, *srcpad;
@@ -321,13 +326,13 @@ main (int argc, char *argv[])

 #ifdef PLATFORM_TEGRA
   if (!gst_element_link_many (streammux, primary_detector, secondary_detector,
-      nvvidconv, nvosd, transform, sink, NULL)) {
+      nvvidconv, nvosd, nvvidconv1, enc, parser1, sink, NULL)) {
     g_printerr ("Elements could not be linked: 2. Exiting.\n");
     return -1;
   }
 #else
   if (!gst_element_link_many (streammux, primary_detector, secondary_detector,
-      nvvidconv, nvosd, sink, NULL)) {
+      nvvidconv, nvosd, enc, sink, NULL)) {
     g_printerr ("Elements could not be linked: 2. Exiting.\n");
     return -1;
   }
diff --git a/back-to-back-detectors/primary_detector_config.txt b/back-to-back-detectors/primary_detector_config.txt
index ff714cb..2b6ad6e 100644
--- a/back-to-back-detectors/primary_detector_config.txt
+++ b/back-to-back-detectors/primary_detector_config.txt
@@ -60,10 +60,10 @@
 [property]
 gpu-id=0
 net-scale-factor=0.0039215697906911373
-model-file=../../../../samples/models/Primary_Detector/resnet10.caffemodel
-proto-file=../../../../samples/models/Primary_Detector/resnet10.prototxt
-labelfile-path=../../../../samples/models/Primary_Detector/labels.txt
-int8-calib-file=../../../../samples/models/Primary_Detector/cal_trt.bin
+model-file=../../../../../samples/models/Primary_Detector/resnet10.caffemodel
+proto-file=../../../../../samples/models/Primary_Detector/resnet10.prototxt
+labelfile-path=../../../../../samples/models/Primary_Detector/labels.txt
+int8-calib-file=../../../../../samples/models/Primary_Detector/cal_trt.bin
 batch-size=1
 network-mode=1
 num-detected-classes=4
diff --git a/back-to-back-detectors/secondary_detector_config.txt b/back-to-back-detectors/secondary_detector_config.txt
index 2112830..511a2a2 100644
--- a/back-to-back-detectors/secondary_detector_config.txt
+++ b/back-to-back-detectors/secondary_detector_config.txt
@@ -61,11 +61,11 @@
 gpu-id=0
 process-mode=2
 net-scale-factor=0.0039215697906911373
-model-file=../../../../samples/models/Secondary_FaceDetect/fd_lpd.caffemodel
-proto-file=../../../../samples/models/Secondary_FaceDetect/fd_lpd.prototxt
-model-engine-file=../../../../samples/models/Secondary_FaceDetect/fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine
-labelfile-path=../../../../samples/models/Secondary_FaceDetect/labels.txt
-int8-calib-file=../../../../samples/models/Secondary_FaceDetect/cal_trt.bin
+model-file=../../../../../samples/models/Secondary_FaceDetect/fd_lpd.caffemodel
+proto-file=../../../../../samples/models/Secondary_FaceDetect/fd_lpd.prototxt
+model-engine-file=../../../../../samples/models/Secondary_FaceDetect/fd_lpd_model/fd_lpd.caffemodel_b1_fp32.engine
+labelfile-path=../../../../../samples/models/Secondary_FaceDetect/labels.txt
+int8-calib-file=../../../../../samples/models/Secondary_FaceDetect/cal_trt.bin
 batch-size=1
 network-mode=0
 num-detected-classes=3

Thank you for your reply. I have managed to reproduce your results. Furthermore, my solution works fine with the given models and configs. But it does not work with my detectnet_v2 models, which I trained with the TLT.

I found this message in the terminal: ERROR: [TRT]: INVALID_ARGUMENT: Cannot find binding of given name: output_bbox/BiasAdd,output_cov/Sigmoid. Despite this error, my solution ran without interruption, but still with the same results as in the figures above.

The info from the terminal:


(tracker:11853): GLib-CRITICAL **: 11:15:08.447: g_strrstr: assertion 'haystack != NULL' failed
0:00:05.982337384 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::deserializeEngineAndBackend()  [UID = 2]: deserialized trt engine from :/media/microsd/mvp_its/models/plate_2.trt
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x300x390       
1   OUTPUT kFLOAT output_bbox/BiasAdd 4x19x25         
2   OUTPUT kFLOAT output_cov/Sigmoid 1x19x25         

ERROR: [TRT]: INVALID_ARGUMENT: Cannot find binding of given name: output_bbox/BiasAdd,output_cov/Sigmoid
0:00:05.982560462 11853   0x557ee58c40 WARN                 nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger: NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::checkBackendParams()  [UID = 2]: Could not find output layer 'output_bbox/BiasAdd,output_cov/Sigmoid' in engine
0:00:05.982592911 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::generateBackendContext()  [UID = 2]: Use deserialized engine model: /media/microsd/mvp_its/models/plate_2.trt
0:00:06.052975672 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer_impl.cpp:311:notifyLoadModelStatus: [UID 2]: Load new model:secondary_detector_config.txt sucessfully
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_nvdcf.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is ON
[NvDCF] Initialized
0:00:08.397019175 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend()  [UID = 1]: deserialized trt engine from :/media/microsd/mvp_its/models/4types_super_2.trt
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x540x960       
1   OUTPUT kFLOAT output_bbox/BiasAdd 16x34x60        
2   OUTPUT kFLOAT output_cov/Sigmoid 4x34x60         

ERROR: [TRT]: INVALID_ARGUMENT: Cannot find binding of given name: output_bbox/BiasAdd,output_cov/Sigmoid
0:00:08.397181418 11853   0x557ee58c40 WARN                 nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams()  [UID = 1]: Could not find output layer 'output_bbox/BiasAdd,output_cov/Sigmoid' in engine
0:00:08.397222773 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext()  [UID = 1]: Use deserialized engine model: /media/microsd/mvp_its/models/4types_super_2.trt
0:00:08.502978245 11853   0x557ee58c40 INFO                 nvinfer gstnvinfer_impl.cpp:311:notifyLoadModelStatus: [UID 1]: Load new model:primary_detector_config.txt sucessfully
Running...

When converting my etlt models, I used the following commands:


./tlt-converter /media/microsd/mvp_its/models/4types_super.etlt  -k OXY1Z3FnbDVxOGptNTQ2N2QxY2Ezb210MDg6ZDIxZjJhNGMtMzg0OS00OGRkLThjYjMtOWEzN2YxMmEyOWE3                 -o output_bbox/BiasAdd,output_cov/Sigmoid   -d 3,540,960   -i nchw   -t fp16   -e /media/microsd/mvp_its/models/4types_super_2.trt

./tlt-converter /media/microsd/mvp_its/models/plate.etlt  -k OXY1Z3FnbDVxOGptNTQ2N2QxY2Ezb210MDg6ZDIxZjJhNGMtMzg0OS00OGRkLThjYjMtOWEzN2YxMmEyOWE3                 -o output_bbox/BiasAdd,output_cov/Sigmoid   -d 3,300,390  -i nchw   -t fp16     -e /media/microsd/mvp_its/models/plate_2.trt

In my configs I am using: output-blob-names=output_bbox/BiasAdd,output_cov/Sigmoid

I believe that the source of the problem is somewhere here.
Please advise.

P.S. If necessary, I can send you my models to check.
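
One detail in the log above may be worth checking: nvinfer reports the whole string `output_bbox/BiasAdd,output_cov/Sigmoid` as a single missing layer, which suggests the list was not split at the comma. The configs earlier in this thread separate the names with a semicolon (`output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd`), whereas the `tlt-converter -o` commands use a comma. The Python sketch below only imitates semicolon splitting to show how the comma version collapses into one name. `split_blob_names` is a hypothetical helper, not part of DeepStream, and this may well not be what causes the wrong boxes.

```python
# Hypothetical helper: imitates splitting a semicolon-delimited config value.
# It is NOT DeepStream code; it only illustrates the symptom seen in the log.
def split_blob_names(value):
    """Split a semicolon-delimited list, dropping empty entries."""
    return [name.strip() for name in value.split(";") if name.strip()]


# Output layer names reported by the engine in the log above.
engine_outputs = {"output_bbox/BiasAdd", "output_cov/Sigmoid"}

comma_value = "output_bbox/BiasAdd,output_cov/Sigmoid"      # as quoted in the post
semicolon_value = "output_bbox/BiasAdd;output_cov/Sigmoid"  # style of the earlier configs

for value in (comma_value, semicolon_value):
    names = split_blob_names(value)
    missing = [n for n in names if n not in engine_outputs]
    print(f"{value!r} -> {len(names)} name(s), missing from engine: {missing}")
```

With the comma, the sketch yields one long name that matches neither engine output; with the semicolon, it yields the two names listed in the engine info.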

Is it possible to try the detectNet_v2 model we provided in https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps ?

These models are working fine.
My models also work fine standalone. When run back-to-back, the results are wrong.

By the way, when I generated the etlt file I got the following message: DEBUG [/usr/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['output_cov/Sigmoid', 'output_bbox/BiasAdd'] as outputs.

If you replace the DetectNet_V2 model in https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps with yours and run the sample, do you see any error?

I got you. Let me try my models with https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps

The back-to-back detectors now work fine with all models, in both your solution and mine.
The problem was only this parameter in the config file: net-scale-factor=0.0039215697906911373
It was set to 1 in my previous config for the secondary detector.

I do not know why it is so important, but it was the only source of the difficulties I faced.

Thank you for your help

Glad to know it’s fixed.

Here is the typical preprocessing; net-scale-factor is the scale:

R/G/B = (R/G/B channel data - mean) * scale
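
To make the formula concrete, here is a small Python sketch of the per-pixel arithmetic. It assumes a mean (offset) of 0, since the configs in this thread set no offsets, and `preprocess` is a hypothetical helper rather than a DeepStream function. With `net-scale-factor=1` the network sees raw values in the 0 to 255 range; with 0.0039215697906911373 (about 1/255) it sees values in roughly the 0 to 1 range, which is presumably the range the TLT model was trained on.

```python
def preprocess(pixel, scale, mean=0.0):
    """Apply y = (x - mean) * scale to a single channel value."""
    return (pixel - mean) * scale


SCALE_TLT = 0.0039215697906911373  # value from the working config, about 1/255

for pixel in (0, 128, 255):
    raw = preprocess(pixel, scale=1.0)           # what net-scale-factor=1 feeds the network
    scaled = preprocess(pixel, scale=SCALE_TLT)  # what the working config feeds the network
    print(f"pixel={pixel:3d}  scale=1 -> {raw:6.1f}   scale~1/255 -> {scaled:.4f}")
```

A network that expects inputs near 0 to 1 but receives 0 to 255 can still run without errors while producing poor boxes, which would be consistent with the symptoms described in this thread.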

Hi, I want to ask how to make the secondary detector work on the full image frame instead of on the detection results of the primary detector.

Hi @xiaomaozhou26
Sorry for missing your update!
Please file a new ticket instead of replying on a closed issue.

Setting "process-mode" to 1 makes it process the full frame.

https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream%20Plugins%20Development%20Guide/deepstream_plugin_details.3.01.html#wwpID0E0ZDB0HA
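
As a rough illustration of that setting, the Python sketch below flips `process-mode` in an nvinfer `[property]` group using the standard `configparser` module. The sample text is a shortened stand-in rather than a full working config, and `set_full_frame` is a hypothetical helper. Whether any other keys need to change for a given pipeline is not covered by the reply above.

```python
import configparser
import io

# Shortened stand-in for a secondary detector config (not a complete file).
SAMPLE = """\
[property]
gpu-id=0
process-mode=2
gie-unique-id=2
"""


def set_full_frame(config_text):
    """Return config_text with [property] process-mode set to 1 (full frame)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written, e.g. minBoxes
    parser.read_string(config_text)
    parser["property"]["process-mode"] = "1"
    out = io.StringIO()
    # Note: configparser drops comment lines when writing the file back.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(set_full_frame(SAMPLE))
```

Editing the line by hand is just as good; the sketch only shows which key and group are involved.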