Custom MaskRCNN UFF Sample: Assertion `mAnchorsCnt == (int) (mAnchorBoxesHost.size() / 4)' failed

Description

A MaskRCNN model trained with a ResNet50 backbone fails with an assertion error comparing mAnchorsCnt and mAnchorBoxesHost.size(). I am not sure how to fix the issue.

Environment

TensorRT Version: 7.2
GPU Type: Jetson Xavier NX Volta GPUs
Nvidia Driver Version: Whatever comes default I guess after Jetpack 4.4.1
CUDA Version: 10.2
CUDNN Version: Jetpack 4.4.1 defaults
Operating System + Version: Ubuntu 18.04 (Jetson)
Python Version (if applicable): 3.7
TensorFlow Version (if applicable): 1.15.0 (not applicable since using UFF)

Relevant Files

https://drive.google.com/drive/u/1/folders/1BSLbmd7UJNbMXak5GFfG_mY59D29ESPc?usp=sharing

Steps To Reproduce

Take the provided .h5 file. Use the provided config.py and mrcnn_to_trt_single.py to convert the .h5 file to .uff. Note that I modified those files because I used a ResNet50 backbone and 256x256 images. Make sure your environment is similar to the one described in the TensorRT sampleUffMaskRCNN sample (can't provide a link due to new account). Take the .uff file over there. Take my mrcnn_config.h and sampleUffMaskRCNN.cpp. Make a copy of the Samples folder on the Jetson Xavier NX and put the .h and .cpp files into it, replacing whatever is already there. Compile all samples and run on the images provided in Google Drive. You will get the following error:

[02/22/2021-17:44:26] [I] [TRT] Detected 1 inputs and 2 output network tensors.
sample_uff_mask_rcnn: proposalLayerPlugin.cpp:347: virtual void nvinfer1::plugin::ProposalLayer::configurePlugin(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, const nvinfer1::DataType*, const nvinfer1::DataType*, const bool*, const bool*, nvinfer1::PluginFormat, int): Assertion `mAnchorsCnt == (int) (mAnchorBoxesHost.size() / 4)' failed

Relevant Github Issues:

Need help on this ASAP. The 2nd issue has the configuration I used during MaskRCNN training. All files have been modified to the best of my knowledge. Any help adapting the provided files for my use case would be greatly appreciated.

Hi,

The TensorRT plugin layer is open source, so you can find the assertion below:

The error occurs because the anchor count differs from what the plugin expects.
This can happen, especially with a custom model.

Could you trace the source above first to check whether all the settings in proposalLayer match your model?

Thanks.

From the code, it seems mAnchorsCnt is read in during the ProposalLayer's initialization:

    ProposalLayer::ProposalLayer(const void* data, size_t length)
    {
        const char *d = reinterpret_cast<const char*>(data), *a = d;
        int prenms_topk = read<int>(d);
        int keep_topk = read<int>(d);
        float iou_threshold = read<float>(d);
        mMaxBatchSize = read<int>(d);
        mAnchorsCnt = read<int>(d);
        mImageSize = read<nvinfer1::Dims3>(d);
        ASSERT(d == a + length);
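The constructor above pulls its fields out of a serialized byte buffer one after another. A minimal sketch of that pattern, assuming a helper that copies `sizeof(T)` bytes and then advances the cursor; `readValue` is a hypothetical name used for illustration and is not the plugin's own code:

```cpp
#include <cstring>

// Hypothetical stand-in for the plugin's read helper (an assumption, not TensorRT code).
// Copies sizeof(T) bytes out of the buffer and moves the cursor past them.
template <typename T>
T readValue(const char*& buffer)
{
    T value;
    std::memcpy(&value, buffer, sizeof(T));
    buffer += sizeof(T);
    return value;
}
```

If the plugin follows this pattern, the anchor count read here is simply whatever value was written when the plugin was serialized, and the final ASSERT only checks that every byte was consumed.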

mAnchorBoxesHost never seems to be initialized, so I am guessing its size() is just a default here. The thing is, the provided source code never mentions what mAnchorsCnt is: TensorRT/config.py at master · NVIDIA/TensorRT · GitHub

The documentation does not explain it either: Graph Surgeon — tensorrt 7.2.2.3 documentation

On taking a further look, it seems that since I reduced my images from 1024 to 256, mAnchorsCnt has dropped to 256 (I assume inputDims[0].d[0] is the image width/height, and that is 256). mAnchorBoxesHost hasn't scaled accordingly. I am not sure what I should change to fix it, though. Nothing ever seems to get set in it.