MaskRCNN on Xavier - UffParser: Validator error Unsupported operation _GenerateDetection_TRT

tlalexander · November 18, 2020, 8:40am

Hello!

I am trying to get semantic segmentation working on Xavier using the NGC TLT docker container.

I have Jetpack 4.4 installed with Deepstream 5.0 and tensorrt 7.1.3.

On my desktop computer I run this docker image:
nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3

I go to the examples/maskrcnn directory and run through the example Jupyter notebook. I train on COCO using all the defaults. I stopped at 5000 iterations just to test the network. When I run the inference test in the notebook, it works fine.

I convert the model engine file to an etlt file using the code in the notebook. I did the conversion for both 32 bit and 8 bit, in separate files.

I copy the model files to my jetson and the 8 bit calibration file.

I get the same error whether I run from deepstream or the latest tlt-converter. Here is the tlt-converter command and output for the 8-bit model:

/home/nvidia/tlt-converter-7.1-dla/tlt-converter -k MYKEY -d 3,832,1344 -o generate_detections,mask_head/mask_fcn_logits/BiasAdd -c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw model.step-5000.etlt

[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)

I get the same error from deepstream:
deepstream-app -c deepstream_app_source1_mrcnn.txt

Using winsys: x11 
0:00:00.178862702  9399     0x201ab260 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:02.100517302  9399     0x201ab260 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

As I said I am using the example config during training, but here is that file for easy reference:

seed: 123
use_amp: False
warmup_steps: 1000
checkpoint: "/workspace/tlt-experiments/maskrcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50/resnet50.hdf5"
learning_rate_steps: "[10000, 15000, 20000]"
learning_rate_decay_levels: "[0.1, 0.02, 0.01]"
total_steps: 25000
train_batch_size: 2
eval_batch_size: 4
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.0001
warmup_learning_rate: 0.0001
init_learning_rate: 0.01

data_config{
    image_size: "(832, 1344)"
    augment_input_data: True
    eval_samples: 500
    training_file_pattern: "/workspace/tlt-experiments/data/train*.tfrecord"
    validation_file_pattern: "/workspace/tlt-experiments/data/val*.tfrecord"
    val_json_file: "/workspace/tlt-experiments/data/annotations/instances_val2017.json"

    # dataset specific parameters
    num_classes: 91
    skip_crowd_during_training: True
}

maskrcnn_config {
    nlayers: 50
    arch: "resnet"
    freeze_bn: True
    freeze_blocks: "[0,1]"
    gt_mask_size: 112
        
    # Region Proposal Network
    rpn_positive_overlap: 0.7
    rpn_negative_overlap: 0.3
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_min_size: 0.

    # Proposal layer.
    batch_size_per_im: 512
    fg_fraction: 0.25
    fg_thresh: 0.5
    bg_thresh_hi: 0.5
    bg_thresh_lo: 0.

    # Faster-RCNN heads.
    fast_rcnn_mlp_head_dim: 1024
    bbox_reg_weights: "(10., 10., 5., 5.)"

    # Mask-RCNN heads.
    include_mask: True
    mrcnn_resolution: 28

    # training
    train_rpn_pre_nms_topn: 2000
    train_rpn_post_nms_topn: 1000
    train_rpn_nms_threshold: 0.7

    # evaluation
    test_detections_per_image: 100
    test_nms: 0.5
    test_rpn_pre_nms_topn: 1000
    test_rpn_post_nms_topn: 1000
    test_rpn_nms_thresh: 0.7

    # model architecture
    min_level: 2
    max_level: 6
    num_scales: 1
    aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
    anchor_scale: 8

    # localization loss
    rpn_box_loss_weight: 1.0
    fast_rcnn_box_loss_weight: 1.0
    mrcnn_weight_loss_mask: 1.0
}

I don’t know enough about this stuff to have a good idea how to solve it. Any ideas?

Thank you!

Morganh · November 19, 2020, 7:36am

Where did you download the /home/nvidia/tlt-converter-7.1-dla/tlt-converter?
Can you paste the link?

tlalexander · November 19, 2020, 8:59am

Yes I found it here:

which was linked here:

Of course the same error happens with deepstream, which isn’t using that binary (I didn’t add that binary to my path or anything).

Thanks!

Morganh · November 19, 2020, 10:07am

Please try below version of tlt-converter-7.1
https://developer.nvidia.com/tlt-converter-trt71

tlalexander · November 19, 2020, 10:28am

Thank you. I just tried it and I get the same error message as before.

/home/nvidia/Downloads/tlt-converter -k MY_KEY -d 3,832,1344 -o generate_detections,mask_head/mask_fcn_logits/BiasAdd -c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw model.step-5000.etlt
[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)

tlalexander · November 19, 2020, 11:25am

I re-flashed my Jetson just to make sure. I flashed it with Jetpack 4.4 and Deepstream 5.0 as before. I get the same error message when I run tlt-converter as above, using the tlt-converter binary linked in your post above.

Morganh · November 20, 2020, 2:53am

Can you try below experiments?

try to generate fp16 trt engine instead of int8 engine
try to follow GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

### 2. Download Models
cd deepstream_tlt_apps/
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip
unzip models.zip
rm models.zip

then use its mrcnn model instead to try again.

tlalexander · November 20, 2020, 10:47am

Thank you.

I tried to generate fp16 on my desktop, but it said unsupported. Perhaps you meant on the Jetson.

However I moved to step 2. I downloaded the models, and copied the maskrcnn model from there to the Jetson. I tried to convert the model after updating the key. I then got the same “Unsupported Operation” error as before, but with MultilevelProposeROI instead of GenerateDetection_TRT. I searched for MultilevelProposeROI, and found this document:
https://docs.nvidia.com/metropolis/TLT/pdf/Transfer-Learning-Toolkit-Getting-Started-Guide-IVA.pdf

I searched for MultilevelProposeROI and found the following sentence in that PDF:

TensorRT OSS build is required for FasterRCNN, SSD, DSSD, YOLOv3, RetinaNet, and MaskRCNN models.

WOW. That’s not what the deepstream_tlt_apps github repo says here:

Instead that README says:

Prerequisites: TensorRT OSS (release/7.x branch) This is ONLY needed when running SSD , DSSD , RetinaNet and YOLOV3 models because BatchTilePlugin required by these models is not supported by TensorRT7.x native package.

So because I saw that TensorRT OSS is only needed for SSD, DSSD, RetinaNet, and YOLOV3, I believed it was not needed for MaskRCNN. Furthermore, I did see a binary for TensorRT OSS but that said version 7.0.0 and my jetson has 7.1.3, so I worried it was not compatible anyway. (however the binary I built from source also says 7.0.0 and it works)

However, now my problem appears solved. I can convert the models on my Jetson (both the one downloaded from box.com and the one from the NGC transfer learning docker container that I trained on COCO).

It seems that the documentation on deepstream_tlt_apps should be updated. I read that multiple times before and if it said “MaskRCNN” I would have tried it.

The jupyter notebook in the NGC also does not mention that the Jetson needs TensorRT OSS, so that file should be updated. It is in the container nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3 at “notebooks/examples/maskrcnn/maskrcnn.ipynb”.

Thank you for helping narrow down the issue.

Morganh · November 23, 2020, 3:20am

Thanks for your info. Appreciate it!
I will sync with internal team to get document improved.

tlalexander · December 13, 2020, 11:04pm

FYI I am having a related problem where I cannot run MaskRCNN in the deepstream NGC container because it does not have TensorRT OSS, and getting that to compile in the container is running me up against a few road blocks. I need CUDA but installing the cuda package from apt gave me trouble. Also the version of cmake available on that apt system is too old, though installing cmake from source is easy enough. I am still working through this, but the need to compile TensorRT OSS is a pain to say the least. I will probably get this working, I just wanted to share this info!

Morganh · December 14, 2020, 4:16pm

Which deepstream NGC container did you use? If it is blocking you, could you search help from Deepstream forum or google?

tlalexander · December 15, 2020, 7:47am

Thanks. I was using the 5.0.1-20.09-devel deepstream container.

I never got compiling in that container to work (after trying for a while). Instead I noticed that the TensorRT-OSS instructions include their own container to compile inside of (I did not need to do this previously on my Xavier). I compiled this time in the recommended container, and then copied the resulting binaries in to the right location in my deepstream container. It took a fair bit of trial and error before I finally got the right settings (must match TensorRT OSS git tag, CUDA version, and Ubuntu version with the deepstream container), but ultimately that worked for me. Now I can validate my deepstream networks on video samples on my desktop, no need to involve the Xavier all the time.

It did take me working all through a day to get it all sorted out. This seems like it could hamper anyone trying the MaskRCNN with deepstream! Happy its working now but I was pretty frustrating to work through.

Topic		Replies	Views
transfert learning toolkit-> export model TAO Toolkit	11	3590	October 12, 2021
Need working example of deployment of Mask RCNN to Jetson TAO Toolkit	8	925	November 18, 2022
There is a error when run deepstream-mrcnn-app DeepStream SDK	60	3101	October 12, 2021
Deepstream 6.01 - Python app error - USB Camera DeepStream SDK	7	656	June 1, 2022
Deepstream doesn't give expected Mask-RCNN output DeepStream SDK	26	3015	February 8, 2022
RetinaNet on Jetson Nano DeepStream SDK	7	957	October 12, 2021
Docker instantiation fails when running "tao detectnet_v2" on Xavier NX Jetson AGX Xavier docker	5	558	October 5, 2022
Deepstream can’t create the .engine from the .etlt, using tlt3.0 a custom Mask-Rcnn model [2] TAO Toolkit	13	705	April 28, 2023
TF-TRT conversion is broken on 32.7.1 Jetson AGX Xavier tensorflow , docker	11	1589	April 6, 2022
LPRNet gives different results when running on Jetson TAO Toolkit jetson	28	765	March 27, 2024

MaskRCNN on Xavier - UffParser: Validator error Unsupported operation _GenerateDetection_TRT

Related topics