Hello!
I am trying to get instance segmentation (MaskRCNN) working on Xavier using the NGC TLT docker container.
I have JetPack 4.4 installed with DeepStream 5.0 and TensorRT 7.1.3.
On my desktop computer I run this docker image:
nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
I go to the examples/maskrcnn directory and work through the example Jupyter notebook, training on COCO with all the defaults. I stopped at 5000 iterations just to test the network. When I run the inference test in the notebook, it works fine.
I export the trained model to an .etlt file using the code in the notebook. I did the export twice, once for FP32 and once for INT8, in separate files.
I copy the model files to my Jetson, along with the INT8 calibration file.
I get the same error whether I run from DeepStream or the latest tlt-converter. Here is the tlt-converter command and output for the INT8 model:
/home/nvidia/tlt-converter-7.1-dla/tlt-converter -k MYKEY -d 3,832,1344 -o generate_detections,mask_head/mask_fcn_logits/BiasAdd -c maskrcnn.cal -e trt.int8.engine -b 8 -m 1 -t int8 -i nchw model.step-5000.etlt
[ERROR] UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Network must have at least one output
[ERROR] Network validation failed.
[ERROR] Unable to create engine
Segmentation fault (core dumped)
I get the same error from DeepStream:
deepstream-app -c deepstream_app_source1_mrcnn.txt
Using winsys: x11
0:00:00.178862702 9399 0x201ab260 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: UffParser: Validator error: generate_detections: Unsupported operation _GenerateDetection_TRT
parseModel: Failed to parse UFF model
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:02.100517302 9399 0x201ab260 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
Segmentation fault (core dumped)
As I said, I am using the example config for training, but here is that file for easy reference:
seed: 123
use_amp: False
warmup_steps: 1000
checkpoint: "/workspace/tlt-experiments/maskrcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50/resnet50.hdf5"
learning_rate_steps: "[10000, 15000, 20000]"
learning_rate_decay_levels: "[0.1, 0.02, 0.01]"
total_steps: 25000
train_batch_size: 2
eval_batch_size: 4
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.0001
warmup_learning_rate: 0.0001
init_learning_rate: 0.01
data_config{
image_size: "(832, 1344)"
augment_input_data: True
eval_samples: 500
training_file_pattern: "/workspace/tlt-experiments/data/train*.tfrecord"
validation_file_pattern: "/workspace/tlt-experiments/data/val*.tfrecord"
val_json_file: "/workspace/tlt-experiments/data/annotations/instances_val2017.json"
# dataset specific parameters
num_classes: 91
skip_crowd_during_training: True
}
maskrcnn_config {
nlayers: 50
arch: "resnet"
freeze_bn: True
freeze_blocks: "[0,1]"
gt_mask_size: 112
# Region Proposal Network
rpn_positive_overlap: 0.7
rpn_negative_overlap: 0.3
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_min_size: 0.
# Proposal layer.
batch_size_per_im: 512
fg_fraction: 0.25
fg_thresh: 0.5
bg_thresh_hi: 0.5
bg_thresh_lo: 0.
# Faster-RCNN heads.
fast_rcnn_mlp_head_dim: 1024
bbox_reg_weights: "(10., 10., 5., 5.)"
# Mask-RCNN heads.
include_mask: True
mrcnn_resolution: 28
# training
train_rpn_pre_nms_topn: 2000
train_rpn_post_nms_topn: 1000
train_rpn_nms_threshold: 0.7
# evaluation
test_detections_per_image: 100
test_nms: 0.5
test_rpn_pre_nms_topn: 1000
test_rpn_post_nms_topn: 1000
test_rpn_nms_thresh: 0.7
# model architecture
min_level: 2
max_level: 6
num_scales: 1
aspect_ratios: "[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]"
anchor_scale: 8
# localization loss
rpn_box_loss_weight: 1.0
fast_rcnn_box_loss_weight: 1.0
mrcnn_weight_loss_mask: 1.0
}
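One thing I did check myself, in case it matters: my understanding (my own assumption, not something I verified against the docs) is that the converter's -d C,H,W must match the training image_size, and that with max_level: 6 the FPN wants the image dimensions divisible by 2**6 = 64. A quick sanity check in Python suggests my numbers are at least consistent:

```python
# Sanity-check the training config against the tlt-converter command.
# Assumptions (mine, not from the docs): -d C,H,W should equal the training
# image_size, and with max_level: 6 the FPN needs H and W divisible by 2**6.

image_size = (832, 1344)         # from image_size: "(832, 1344)"
converter_dims = (3, 832, 1344)  # from -d 3,832,1344
max_level = 6                    # from max_level: 6

# Converter spatial dims should match the training image size (H, W).
assert converter_dims[1:] == image_size

# Coarsest FPN stride; H and W should both be multiples of it.
stride = 2 ** max_level          # 64
assert image_size[0] % stride == 0 and image_size[1] % stride == 0
print("dims OK:", image_size, "stride", stride)
```

So I don't think a shape mismatch is the problem here, which is why I suspect the unsupported-op error is something else.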
I don’t know enough about this stuff to have a good idea how to solve it. Any ideas?
Thank you!