MaskRCNN on Xavier - UffParser: Validator error Unsupported operation _GenerateDetection_TRT

Hello!

I am trying to get instance segmentation working on Xavier using the NGC TLT docker container.

I have JetPack 4.4 installed with DeepStream 5.0 and TensorRT 7.1.3.

On my desktop computer I run this docker image:
nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3

I go to the examples/maskrcnn directory and run through the example Jupyter notebook. I train on COCO using all the defaults. I stopped at 5000 iterations just to test the network. When I run the inference test in the notebook, it works fine.

I export the trained model to an etlt file using the code in the notebook. I did the export for both 32-bit and 8-bit precision, in separate files.

I copy the model files and the 8-bit calibration file to my Jetson.

I get the same error whether I run from deepstream or the latest tlt-converter. Here is the tlt-converter command and output for the 8-bit model:

/home/nvidia/tlt-converter-7.1-dla/tlt-converter -k MYKEY -d 3,832,1344 -o generate_detections,mask_head/mask_fcn_logits/BiasAdd -c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw model.step-5000.etlt

[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)
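
Note on reading this output: the first [ERROR] line names the operation the UFF parser could not handle, and the messages after it appear to follow from that failed parse. The snippet below is a minimal sketch for pulling the node and operation names out of such a log. The function name `unsupported_ops` and the regular expression are illustrative assumptions based only on the log lines quoted in this thread, not part of any NVIDIA tool.

```python
import re

# Matches lines such as:
#   [ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
# The pattern is inferred only from the log lines quoted in this thread.
_UNSUPPORTED = re.compile(r"Validator error: (\S+): Unsupported operation (\S+)")


def unsupported_ops(log_text):
    """Return (node_name, op_name) pairs for every unsupported-operation line."""
    return _UNSUPPORTED.findall(log_text)


log = """\
[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
"""

for node, op in unsupported_ops(log):
    print(f"node {node!r} needs op {op!r}")
```

Searching for the reported operation name, rather than for the encoding-key hint on the second line, is what eventually led to the fix later in this thread.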

I get the same error from deepstream:
deepstream-app -c deepstream_app_source1_mrcnn.txt

Using winsys: x11 
0:00:00.178862702  9399     0x201ab260 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:02.100517302  9399     0x201ab260 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault (core dumped)

As I said, I am using the example config during training, but here is that file for easy reference:

seed: 123
use_amp: False
warmup_steps: 1000
checkpoint: "/workspace/tlt-experiments/maskrcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50/resnet50.hdf5"
learning_rate_steps: "[10000, 15000, 20000]"
learning_rate_decay_levels: "[0.1, 0.02, 0.01]"
total_steps: 25000
train_batch_size: 2
eval_batch_size: 4
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.0001
warmup_learning_rate: 0.0001
init_learning_rate: 0.01

data_config{
    image_size: "(832, 1344)"
    augment_input_data: True
    eval_samples: 500
    training_file_pattern: "/workspace/tlt-experiments/data/train*.tfrecord"
    validation_file_pattern: "/workspace/tlt-experiments/data/val*.tfrecord"
    val_json_file: "/workspace/tlt-experiments/data/annotations/instances_val2017.json"

    # dataset specific parameters
    num_classes: 91
    skip_crowd_during_training: True
}

maskrcnn_config {
    nlayers: 50
    arch: "resnet"
    freeze_bn: True
    freeze_blocks: "[0,1]"
    gt_mask_size: 112
        
    # Region Proposal Network
    rpn_positive_overlap: 0.7
    rpn_negative_overlap: 0.3
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_min_size: 0.

    # Proposal layer.
    batch_size_per_im: 512
    fg_fraction: 0.25
    fg_thresh: 0.5
    bg_thresh_hi: 0.5
    bg_thresh_lo: 0.

    # Faster-RCNN heads.
    fast_rcnn_mlp_head_dim: 1024
    bbox_reg_weights: "(10., 10., 5., 5.)"

    # Mask-RCNN heads.
    include_mask: True
    mrcnn_resolution: 28

    # training
    train_rpn_pre_nms_topn: 2000
    train_rpn_post_nms_topn: 1000
    train_rpn_nms_threshold: 0.7

    # evaluation
    test_detections_per_image: 100
    test_nms: 0.5
    test_rpn_pre_nms_topn: 1000
    test_rpn_post_nms_topn: 1000
    test_rpn_nms_thresh: 0.7

    # model architecture
    min_level: 2
    max_level: 6
    num_scales: 1
    aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
    anchor_scale: 8

    # localization loss
    rpn_box_loss_weight: 1.0
    fast_rcnn_box_loss_weight: 1.0
    mrcnn_weight_loss_mask: 1.0
}
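
For reference, the `-d 3,832,1344` argument in the tlt-converter command above lines up with `image_size: "(832, 1344)"` in this spec. A small sketch of that mapping, assuming a 3-channel input and channels-first ordering (as the `-i nchw` flag suggests); `converter_dims` is a hypothetical helper name:

```python
import ast


def converter_dims(image_size, channels=3):
    """Turn an image_size string such as "(832, 1344)" into a "C,H,W" string."""
    # literal_eval safely parses the tuple literal stored as a string in the spec.
    height, width = ast.literal_eval(image_size)
    return f"{channels},{height},{width}"


print(converter_dims("(832, 1344)"))  # prints 3,832,1344
```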

I don’t know enough about this stuff to have a good idea how to solve it. Any ideas?

Thank you!

Where did you download /home/nvidia/tlt-converter-7.1-dla/tlt-converter from?
Can you paste the link?

Yes I found it here:
https://developer.nvidia.com/assets/TLT/Secure/tlt-converter-7.1-dla.zip
which was linked here:

Of course the same error happens with deepstream, which isn’t using that binary (I didn’t add that binary to my path or anything).

Thanks!

Please try the version of tlt-converter-7.1 below:
https://developer.nvidia.com/tlt-converter-trt71

Thank you. I just tried it and I get the same error message as before.

/home/nvidia/Downloads/tlt-converter -k MY_KEY -d 3,832,1344 -o generate_detections,mask_head/mask_fcn_logits/BiasAdd -c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw model.step-5000.etlt
[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)

I re-flashed my Jetson just to make sure. I flashed it with Jetpack 4.4 and Deepstream 5.0 as before. I get the same error message when I run tlt-converter as above, using the tlt-converter binary linked in your post above.

Can you try the experiments below?

  1. Try to generate an fp16 TensorRT engine instead of an int8 engine.
  2. Try following https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/, in particular its "Download Models" step:
cd deepstream_tlt_apps/
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip
unzip models.zip
rm models.zip

Then try again using the mrcnn model from that archive.
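
For the first experiment, the fp16 command can be derived from the int8 command already used in this thread. The sketch below does that mechanically. It assumes that the `-c` calibration cache is only needed for int8 and that `-t fp16` is accepted by the converter build in use; the engine file name `trt.fp16.engine` and the helper name `to_fp16` are made up for illustration.

```python
import shlex


def to_fp16(argv, engine_path):
    """Derive an fp16 argument list from an int8 tlt-converter argument list."""
    out = []
    i = 0
    while i < len(argv):
        tok = argv[i]
        if tok == "-c":
            # Assumption: the calibration cache is int8-only, so drop flag and value.
            i += 2
        elif tok == "-t":
            out += ["-t", "fp16"]
            i += 2
        elif tok == "-e":
            out += ["-e", engine_path]
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


int8_cmd = (
    "tlt-converter -k MY_KEY -d 3,832,1344 "
    "-o generate_detections,mask_head/mask_fcn_logits/BiasAdd "
    "-c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw "
    "model.step-5000.etlt"
)

fp16_argv = to_fp16(shlex.split(int8_cmd), "trt.fp16.engine")
print(" ".join(shlex.quote(a) for a in fp16_argv))
```

All other flags are passed through unchanged, so the result differs from the int8 command only in the three places named above.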

Thank you.

I tried to generate fp16 on my desktop, but it said unsupported. Perhaps you meant on the Jetson.

However I moved to step 2. I downloaded the models, and copied the maskrcnn model from there to the Jetson. I tried to convert the model after updating the key. I then got the same “Unsupported Operation” error as before, but with MultilevelProposeROI instead of GenerateDetection_TRT. I searched for MultilevelProposeROI, and found this document:

That PDF contains the following sentence:

TensorRT OSS build is required for FasterRCNN, SSD, DSSD, YOLOv3, RetinaNet, and MaskRCNN models.

WOW. That’s not what the deepstream_tlt_apps GitHub repo says here:

Instead that README says:

Prerequisites: TensorRT OSS (release/7.x branch) This is ONLY needed when running SSD , DSSD , RetinaNet and YOLOV3 models because BatchTilePlugin required by these models is not supported by TensorRT7.x native package.

So because I saw that TensorRT OSS is only needed for SSD, DSSD, RetinaNet, and YOLOV3, I believed it was not needed for MaskRCNN. Furthermore, I did see a binary for TensorRT OSS, but it said version 7.0.0 and my Jetson has 7.1.3, so I worried it was not compatible anyway. (However, the binary I built from source also says 7.0.0, and it works.)

After building TensorRT OSS from source, my problem appears solved. I can convert the models on my Jetson (both the one downloaded from box.com and the one from the NGC transfer learning docker container that I trained on COCO).
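
A rough way to check a plugin library for the two operations named in the errors above is to look for their names in the file's bytes. This is only a heuristic sketch: it assumes the plugin names are stored as plain strings in the library, and finding a name does not prove the plugin is registered correctly. The helper name `missing_ops` is hypothetical, and the path to the plugin library on the Jetson has to be supplied by the reader.

```python
from pathlib import Path

# Op names taken from the error messages in this thread, without the leading
# underscore that the UFF parser printed in front of _GenerateDetection_TRT.
REQUIRED_OPS = [b"GenerateDetection_TRT", b"MultilevelProposeROI"]


def missing_ops(library_path, required=REQUIRED_OPS):
    """Return the required op names that do not occur in the library file."""
    data = Path(library_path).read_bytes()
    return [op.decode() for op in required if op not in data]
```

Calling `missing_ops` with the path of the libnvinfer_plugin file in use returns an empty list when both names are present, and the names that were not found otherwise.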

It seems that the documentation on deepstream_tlt_apps should be updated. I read that multiple times before and if it said “MaskRCNN” I would have tried it.

The Jupyter notebook in the NGC container also does not mention that the Jetson needs TensorRT OSS, so that file should be updated. It is in the container nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3 at “notebooks/examples/maskrcnn/maskrcnn.ipynb”.

Thank you for helping narrow down the issue.

Thanks for your info. Appreciate it!
I will sync with the internal team to get the documentation improved.
