Problem On Deploying Mrcnn Model in TX2

Hi

I have trained the default MaskRCNN model from the NVIDIA TLT notebook on the COCO2017 dataset.

The model trained and converted into an “.etlt” file without any problems, and I also deployed it in DeepStream installed in a Docker container on a PC with a 2080 Ti.

But I have some problems while deploying it on the TX2.
Note: I have followed the deployment steps from the following NVIDIA dev blog

I have installed the latest JetPack 4.5.1 together with NVIDIA DeepStream 5.1, built the TRT-OSS plugins, and replaced the TensorRT plugin library.
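For reference, the build-and-replace procedure I followed looks roughly like this (a sketch based on the usual JetPack 4.5.1 / TensorRT 7.1 setup from the dev blog; the branch name and library paths are assumptions, so check them against your own system):

```shell
# Build the TRT-OSS plugin library for TX2 (compute capability 6.2)
git clone -b release/7.1 https://github.com/NVIDIA/TensorRT.git TensorRT-OSS
cd TensorRT-OSS && git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DGPU_ARCHS="62" \
         -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu \
         -DTRT_BIN_DIR=$(pwd)/out
make -j$(nproc) nvinfer_plugin

# Back up the stock plugin library, then drop in the rebuilt one
sudo cp /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3 \
        /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3.bak
sudo cp out/libnvinfer_plugin.so.7.1.3 /usr/lib/aarch64-linux-gnu/
sudo ldconfig
```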

While deploying the model file in DeepStream, the model loads and is converted to an engine file, but after that the video plays without any segmentation in it.

I rebuilt TRT-OSS again and followed the same deployment procedure, but still didn't get any colored segments in the output video.

• Hardware Platform: TX2
• DeepStream: 5.1
• JetPack Version: 4.5.1
• TensorRT Version: 7
• Issue Type: Deployment of MaskRCNN model trained in NVIDIA TLT

The files and configs used have been added to this Drive link to reproduce the issue.
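For context, the relevant parts of my nvinfer config follow the stock DeepStream 5.1 TLT MaskRCNN sample. A rough sketch (the model path, NGC key, and custom-parser library path are placeholders, not my actual values):

```
[property]
net-scale-factor=0.017507
offsets=123.675;116.28;103.53
model-color-format=0
tlt-model-key=<your_ngc_key>
tlt-encoded-model=<path>/model.step-25000.etlt
infer-dims=3;832;1344
num-detected-classes=91
network-type=3              # 3 = instance segmentation
output-instance-mask=1
network-mode=2              # 2 = FP16
parse-bbox-instance-mask-func-name=NvDsInferParseCustomMrcnnTLT
custom-lib-path=<path>/libnvds_infercustomparser_tlt.so
```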

Hi,

Do you use the same configuration file on the desktop and get the correct result there?
If yes, would you mind sharing a test video and the corresponding desktop output with us?

Thanks.

Output.mkv (27.8 MB)
This is the test video file you requested.
It is a screen recording of the output from DeepStream installed through Docker.

Are there any options for me to try out?

Hi,

The sample works well in our environment.
Below are our test details for your reference:

  • JetPack 4.5.1
  • Replace libnvinfer_plugin.so.7.1.3 from here
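After replacing the library, a quick sanity check can confirm that the swapped-in plugin is the one actually resolved at runtime (the library location is an assumption based on the default JetPack install):

```shell
# List the installed plugin library files and their symlinks
ls -l /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*

# Confirm which libnvinfer_plugin the dynamic linker will pick up
ldconfig -p | grep libnvinfer_plugin
```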

Could you give it a try?

Thanks.

I reflashed my TX2 with JetPack 4.5.1 (L4T 32.5.1).
I installed DeepStream and built the TRT-OSS plugins.
With everything set up, I tried to run the same MaskRCNN model from this thread (problem-on-deploying-mrcnn-model-in-tx2).

I didn’t get any segmented regions as you did, and the processing was very slow.

Log:

> tx2@tx2-desktop:/opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models$ deepstream-app -c deepstream_app_source1_mrcnn.txt
> 
> Using winsys: x11 
> 0:00:00.605211712  9213     0x27013a30 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
> INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
> INFO: [TRT]: Detected 1 inputs and 2 output network tensors.
> 0:04:04.251456356  9213     0x27013a30 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1749> [UID = 1]: serialize cuda engine to file: /home/tx2/Workspace/Models/Segmentation/Instance/Default/model.step-25000.etlt_b1_gpu0_fp16.engine successfully
> INFO: [Implicit Engine Info]: layers num: 3
> 0   INPUT  kFLOAT Input           3x832x1344      
> 1   OUTPUT kFLOAT generate_detections 100x6           
> 2   OUTPUT kFLOAT mask_head/mask_fcn_logits/BiasAdd 100x91x28x28    
> 
> 0:04:04.371930675  9213     0x27013a30 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models/config_infer_primary_mrcnn.txt sucessfully
> 
> Runtime commands:
> 	h: Print this help
> 	q: Quit
> 
> 	p: Pause
> 	r: Resume
> 
> NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
>       To go back to the tiled display, right-click anywhere on the window.
> 
> 
> **PERF:  FPS 0 (Avg)	
> **PERF:  0.00 (0.00)	
> ** INFO: <bus_callback:181>: Pipeline ready
> 
> **PERF:  0.00 (0.00)	
> Opening in BLOCKING MODE
> Opening in BLOCKING MODE 
> NvMMLiteOpen : Block : BlockType = 261 
> NVMEDIA: Reading vendor.tegra.display-size : status: 6 
> NvMMLiteBlockCreate : Block : BlockType = 261 
> ** INFO: <bus_callback:167>: Pipeline running
> 
> **PERF:  0.00 (0.00)	
> **PERF:  1.64 (1.01)	
> **PERF:  1.66 (1.51)	
> **PERF:  1.64 (1.34)	
> **PERF:  1.65 (1.50)	
> **PERF:  1.66 (1.60)	
> **PERF:  1.64 (1.50)	
> **PERF:  1.65 (1.57)	
> **PERF:  1.65 (1.63)	
> **PERF:  1.65 (1.56)	
> **PERF:  1.66 (1.60)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.65 (1.58)	
> **PERF:  1.66 (1.62)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.65 (1.60)	
> **PERF:  1.65 (1.63)	
> **PERF:  1.65 (1.65)	
> 
> **PERF:  FPS 0 (Avg)	
> **PERF:  1.65 (1.61)	
> **PERF:  1.64 (1.63)	
> **PERF:  1.66 (1.65)	
> **PERF:  1.64 (1.62)	
> **PERF:  1.65 (1.64)	
> **PERF:  1.66 (1.65)	
> **PERF:  1.66 (1.62)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.66 (1.65)	
> **PERF:  1.65 (1.63)	
> **PERF:  1.65 (1.64)	
> **PERF:  1.66 (1.62)	
> **PERF:  1.65 (1.63)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.65 (1.62)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.65 (1.65)	
> **PERF:  1.65 (1.63)	
> **PERF:  1.66 (1.64)	
> **PERF:  1.66 (1.65)

My questions:

  1. Did you deploy the model on TX2 hardware?
  2. Can the TX2 handle MaskRCNN and run inference on it?
    I deployed it in FP16, and it was struggling and seemed unable to draw any segmented regions on the video.

Looking forward to any updates on this issue.

Moving this topic from the DS forum into the TLT forum.

Can you try to git clone GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream, and deploy the MaskRCNN model there?
PeopleSegNet is actually based on MaskRCNN, so you can refer to the steps for PeopleSegNet.

Also, please remember to run with SHOW_MASK=1.
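The suggested steps might be sketched as follows (the app binary, config path, and CUDA version are assumptions — the repo's README is authoritative; only the repo URL and the SHOW_MASK=1 variable come from this thread):

```shell
# Clone the TAO/TLT sample apps and build the custom parsers and apps
git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps
export CUDA_VER=10.2        # JetPack 4.5.1 ships CUDA 10.2
make

# Run the detection app with mask drawing enabled;
# SHOW_MASK=1 tells the app to overlay the instance masks on the output
SHOW_MASK=1 ./apps/tao_detection/ds-tao-detection \
    -c <mrcnn_or_peoplesegnet_config>.txt \
    -i <input_stream>.h264
```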