Deepstream doesn't give expected Mask-RCNN output

@AastaLLL I got past this error: it was actually caused by my rebuilt TensorRT, and the pre-installed TensorRT that came with JetPack 4.5.1 worked out fine. I re-did all the above steps and got the following successful inference:

ds-tao-segmentation -c ~/Desktop/optimisation/deepstream_tao_apps/customConfigs/custom_config.txt -i ~/Desktop/optimisation/large.jpg 
Now playing: /home/virus/Desktop/optimisation/deepstream_tao_apps/customConfigs/custom_config.txt
Opening in BLOCKING MODE
Opening in BLOCKING MODE 
0:00:03.762353419 10172   0x55905e4040 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1702> [UID = 1]: deserialized trt engine from :/home/virus/Desktop/optimisation/res101-holygrail-ep26-fp16.engine
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_image     3x1024x1024     
1   OUTPUT kFLOAT mrcnn_detection 100x6           
2   OUTPUT kFLOAT mrcnn_mask/Sigmoid 100x4x28x28     

0:00:03.762579860 10172   0x55905e4040 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1806> [UID = 1]: Use deserialized engine model: /home/virus/Desktop/optimisation/res101-holygrail-ep26-fp16.engine
0:00:03.899089690 10172   0x55905e4040 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:/home/virus/Desktop/optimisation/deepstream_tao_apps/customConfigs/custom_config.txt sucessfully
Running...
NvMMLiteBlockCreate : Block : BlockType = 256 
[JPEG Decode] BeginSequence Display WidthxHeight 1024x1024
in videoconvert caps = video/x-raw(memory:NVMM), format=(string)RGBA, framerate=(fraction)1/1, width=(int)1280, height=(int)720
End of stream
Returned, stopping playback
[JPEG Decode] NvMMLiteJPEGDecBlockPrivateClose done
[JPEG Decode] NvMMLiteJPEGDecBlockClose done
Deleting pipeline

But the mask saved is all black. My config file is as follows:


[property]
gpu-id=0
net-scale-factor=0.007843

# Since the model input channel is 3, using RGB color format.

model-color-format=0
offsets=127.5;127.5;127.5
labelfile-path=./custom_labels.txt

##Replace the following path with your model file

model-engine-file=/home/virus/Desktop/optimisation/res101-holygrail-ep26-fp16.engine

#DS5.x cannot parse the onnx etlt model, so you need to
#convert the etlt model to a TensorRT engine first using tao-converter

tlt-encoded-model=../../models/peopleSemSegNet/peoplesemsegnet.etlt
tlt-model-key=tlt_encode

infer-dims=3;1024;1024 
##3;544;960
batch-size=1

## 0=FP32, 1=INT8, 2=FP16 mode

network-mode=2
num-detected-classes=4
interval=0
gie-unique-id=1
network-type=2
output-blob-names=mrcnn_mask/Sigmoid
segmentation-threshold=0.0

##specify the output tensor order, 0(default value) for CHW and 1 for HWC

segmentation-output-order=1

[class-attrs-all]
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

Note that I have no TLT model and simply left the

tlt-encoded-model=../../models/peopleSemSegNet/peoplesemsegnet.etlt
tlt-model-key=tlt_encode

lines as they were originally present.

What am I missing? How can I get the mask?

Hi,

Would you mind sharing your custom source/model with us, so we can reproduce this issue in our environment?

Thanks.

Here is the UFF model: res101-holygrail-ep26.uff - Google Drive

Thanks

Hi,

The mask being all black indicates that nothing was detected.

Checking your configuration, could you validate that net-scale-factor and offsets are correct?
An incorrect input data range commonly causes the detector to malfunction.

...
net-scale-factor=0.007843

# Since the model input channel is 3, using RGB color format.

model-color-format=0
offsets=127.5;127.5;127.5
...
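
For reference, nvinfer's documented preprocessing is y = net-scale-factor * (x - offset), applied per pixel and channel. A minimal C++ sketch of that transform (an illustration of the formula, not DeepStream's actual code):

float preprocess_pixel(unsigned char x, float offset, float net_scale_factor)
{
    // y = net-scale-factor * (x - offset)
    // With net-scale-factor=0.007843 (~1/127.5) and offset=127.5, an 8-bit
    // value in [0, 255] maps to roughly [-1, 1].
    return net_scale_factor * (static_cast<float>(x) - offset);
}

If that [-1, 1] range is not what the network saw during training, detections can drop to zero.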

Thanks.

I can’t find a way to calculate net-scale-factor and offsets, as they are not part of Mask-RCNN itself but rather added by NVIDIA.

I did find a related resource: Training Instance Segmentation Models Using Mask R-CNN on the NVIDIA TAO Toolkit | NVIDIA Technical Blog, which has a similar config file. Using its parameters makes no difference to the earlier results.

@ChrisDing did share the DeepStream 4.0 samples to tackle this here: converting mask rcnn to tensor rt - #31 by ChrisDing

But I see it’s outdated.

Hi,

May I know how you trained your model first?
Did you train it with TLT (TAO) or another framework like TensorFlow?

We tested the PeopleSegNet example and it works correctly.

Thanks.

I trained it using Matterport’s default method as prescribed, not using TLT.
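
For reference (assuming the stock Matterport code, whose mold_image() preprocessing, as far as I can tell, only subtracts MEAN_PIXEL = [123.7, 116.8, 103.9] and applies no scaling), the equivalent nvinfer settings would be roughly:

net-scale-factor=1.0
offsets=123.7;116.8;103.9
model-color-format=0

(a hypothetical mapping on my part; worth verifying against the actual training pipeline)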

Hi,

Could you check if you can get correct output with TensorRT first?
This will help us narrow down whether the issue is from TensorRT or DeepStream.

Thanks.

TensorRT gives the desired output, as I show in this Colab notebook.

I use the sample_uff_maskRCNN sample from TRT 7.0. I have tested this on the host as well as the Jetson Xavier.

Hi,

Thanks for all the confirmation and testing.

We are going to reproduce this issue internally and will get back to you later.

Hi,

Thanks for your patience.

Please use the dedicated MaskRCNN example instead.
We confirmed that we can get the mask output with the res101-holygrail-ep26.uff model.

1. Get source

$ export DS_SRC_PATH=/opt/nvidia/deepstream/deepstream-6.0/
$ git clone https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps.git

2. Apply change

diff --git a/Makefile b/Makefile
index 80e6502..f11bfac 100644
--- a/Makefile
+++ b/Makefile
@@ -13,7 +13,7 @@ APP:= deepstream-custom
 
 TARGET_DEVICE = $(shell gcc -dumpmachine | cut -f1 -d -)
 
-NVDS_VERSION:=4.0
+NVDS_VERSION:=6.0
 
 LIB_INSTALL_DIR?=/opt/nvidia/deepstream/deepstream-$(NVDS_VERSION)/lib/
 
diff --git a/nvdsinfer_customparser_mrcnn_uff/nvdsinfer_custombboxparser_mrcnn_uff.cpp b/nvdsinfer_customparser_mrcnn_uff/nvdsinfer_custombboxparser_mrcnn_uff.cpp
index d8ac0d4..90ceab6 100644
--- a/nvdsinfer_customparser_mrcnn_uff/nvdsinfer_custombboxparser_mrcnn_uff.cpp
+++ b/nvdsinfer_customparser_mrcnn_uff/nvdsinfer_custombboxparser_mrcnn_uff.cpp
@@ -28,7 +28,7 @@ static const int DETECTION_MAX_INSTANCES = 100;
 static const int NUM_CLASSES = 1 + 80; // COCO has 80 classes
 
 static const int MASK_POOL_SIZE = 14;
-static const nvinfer1::DimsCHW INPUT_SHAPE{3, 1024, 1024};
+static const nvinfer1::Dims3 INPUT_SHAPE{3, 1024, 1024};
 //static const Dims2 MODEL_DETECTION_SHAPE{DETECTION_MAX_INSTANCES, 6};
 //static const Dims4 MODEL_MASK_SHAPE{DETECTION_MAX_INSTANCES, NUM_CLASSES, 28, 28};
 
diff --git a/pgie_mrcnn_uff_config.txt b/pgie_mrcnn_uff_config.txt
index b169d1d..5422121 100644
--- a/pgie_mrcnn_uff_config.txt
+++ b/pgie_mrcnn_uff_config.txt
@@ -50,7 +50,7 @@ offsets=103.939;116.779;123.68
 model-color-format=1
 labelfile-path=./nvdsinfer_customparser_mrcnn_uff/mrcnn_labels.txt
 uff-file=./mrcnn_nchw.uff
-model-engine-file=./mrcnn_nchw.uff_b1_fp32.engine
+model-engine-file=./mrcnn_nchw.uff_b1_gpu0_fp32.engine
 uff-input-dims=3;1024;1024;0
 uff-input-blob-name=input_image
 batch-size=1
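
A note on the parser hunk: nvinfer1::DimsCHW was removed from the newer TensorRT that ships with DeepStream 6.0 (TensorRT 8.x, if I recall correctly), so the code switches to the generic Dims3, which holds the same C, H, W values:

#include <NvInfer.h>

// Dims3 is a plain three-dimensional nvinfer1::Dims; the components are
// read through the d[] array instead of DimsCHW's c()/h()/w() accessors.
static const nvinfer1::Dims3 INPUT_SHAPE{3, 1024, 1024};
// INPUT_SHAPE.d[0] == 3 (C), d[1] == 1024 (H), d[2] == 1024 (W)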

3. Compile and Run

$ cd deepstream_4.x_apps/nvdsinfer_customparser_mrcnn_uff/
$ CUDA_VER=10.2 make
$ cd ../
$ make
$ cp {res101-holygrail-ep26.uff} mrcnn_nchw.uff
$ ./deepstream-custom pgie_mrcnn_uff_config.txt /opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_720p.h264 

Thanks.

Is it possible to use an image instead of the h264 video file?

Hi,

Images use a different decoder than video.
You can find an example in the folder below:

/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-image-decode-test
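
If memory serves, that sample decodes JPEGs through a filesrc → jpegparse → nvv4l2decoder pipeline and is built and run along these lines (binary name and usage assumed from the sample, please double-check its README):

$ cd /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-image-decode-test
$ CUDA_VER=10.2 make
$ ./deepstream-image-decode-app <jpeg file>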

Thanks.

Hi,
I don’t understand why you rename the uff file. Does this sample automatically look for mrcnn_nchw.uff and create an engine file from it?
cp {res101-holygrail-ep26.uff} mrcnn_nchw.uff

even though we explicitly specify our engine: model-engine-file=./mrcnn_nchw.uff_b1_gpu0_fp32.engine

Hi,

The config file reads the mrcnn_nchw.uff file.
You can also update the config file to point to your uff file’s path instead of renaming it.

pgie_mrcnn_uff_config.txt

[property]
...
uff-file=./mrcnn_nchw.uff

Setting the uff path lets DeepStream convert the uff file into a TensorRT engine itself, which avoids compatibility issues.
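
For example (matching the engine name in the patch above), with batch-size=1, gpu-id=0, and network-mode=0 (FP32), nvinfer writes the regenerated engine next to the uff file as:

./mrcnn_nchw.uff_b1_gpu0_fp32.engine

As I understand it, if model-engine-file points to a file that does not exist yet, nvinfer builds the engine from uff-file, serializes it under that name, and loads it directly on subsequent runs.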

Thanks.

Hi, thanks for all the helpful replies.

Can you also tell me how the different components of the application

  1. deepstream_custom.c or deepstream_image_decode_app.c,
  2. NvDsInferParseCustomMrcnnUff (defined in nvdsinfer_custombboxparser_mrcnn_uff.cpp), and
  3. DeepStream/GStreamer

stitch together?

I can see the custom parser being compiled in the Makefile provided within the directory, and it replaces the default parser when specified in pgie_mrcnn_uff_config.txt. But I fail to find any reference to the custom parser being used post-inference anywhere in deepstream_custom.c.

Please help me understand.

Ok, so after studying GStreamer, I understand that the custom parser is loaded as part of the nvinfer element in the pipeline, as specified in the pgie config file. The following snapshot shows how it connects to the rest of the pipeline.
[screenshot: GStreamer pipeline graph]
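
Concretely, the hookup is just two keys in the pgie config; nvinfer dlopen()s the library and calls the named function on each frame's inference output (library filename assumed from the repo's Makefile output, so verify the exact name after building):

[property]
...
parse-bbox-func-name=NvDsInferParseCustomMrcnnUff
custom-lib-path=./nvdsinfer_customparser_mrcnn_uff/libnvds_infercustomparser_mrcnn_uff.so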

Hey @AastaLLL, sorry to bring this up again even though this issue is resolved.

Just wanted to confirm: during your test, were you able to overlay the mask on the result? In my case, all I get is the bounding box. I am aware that there are methods to save the mask from within the parser’s script, but is there any other way that you used?

We suggest opening a new topic for other issues. Thanks.