Parsing MaskRCNN output with TensorRT in Python

I am trying to run inference on a MaskRCNN model I trained using TLT.
I used the pretrained model available here and trained it on the COCO dataset, following this blog post.

I then converted the .tlt model to .etlt, and then to an engine file using tlt-converter.

I know I can use DeepStream directly, but that’s not my goal here; I want to run inference in Python.
I used the attached code for inference.

I can parse the bounding boxes just fine. However, I don’t understand how to parse the masks.

I know that the mask resolution is 28x28. I used PeopleSegNet before and was able to parse the masks by reshaping the output to (100,2,28,28), but this time the output has 7134400 elements, and I don’t know what shape to reshape it to.

Maybe I am missing something.

Any help would be appreciated.

• Hardware: Jetson Xavier AGX
• Network Type: Mask_rcnn
• TLT Version: 3.0
• TensorRT Version: TensorRT 7.1.3

In the blog post, num_classes is set to 91. If the annotations contain N categories, num_classes should be N+1 (the extra one is the background class).
So the mask output has 91 * 28 * 28 * 100 = 7134400 elements.

In PeopleSegNet, there is only one class (person), so num_classes is set to 2.
The mask output then has 2 * 28 * 28 * 100 = 156800 elements.
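
To make the arithmetic above concrete, here is a minimal NumPy sketch of the reshape. It assumes the flat buffer uses the same axis order as the (100,2,28,28) reshape from the question, i.e. (detections, classes, height, width); that order is not confirmed in this thread, so check it against your engine bindings. The helper names and the class_ids array (one predicted class index per detection, taken from the detection output) are hypothetical and only for illustration.

```python
import numpy as np

MAX_DETECTIONS = 100  # detections per image, as in the (100,2,28,28) reshape
MASK_SIZE = 28        # mask resolution stated in the question


def reshape_masks(flat_output, num_classes):
    """Reshape a flat mask buffer to (detections, classes, 28, 28).

    Assumes (detections, classes, height, width) ordering.
    """
    flat_output = np.asarray(flat_output)
    expected = MAX_DETECTIONS * num_classes * MASK_SIZE * MASK_SIZE
    if flat_output.size != expected:
        raise ValueError(
            f"expected {expected} elements for num_classes={num_classes}, "
            f"got {flat_output.size}"
        )
    return flat_output.reshape(MAX_DETECTIONS, num_classes, MASK_SIZE, MASK_SIZE)


def select_class_masks(masks, class_ids):
    """Pick, for each detection, the mask of its predicted class."""
    class_ids = np.asarray(class_ids, dtype=np.int64)
    return masks[np.arange(masks.shape[0]), class_ids]


# Stand-in for the flat mask output copied back from the engine.
flat = np.zeros(100 * 91 * 28 * 28, dtype=np.float32)
masks = reshape_masks(flat, num_classes=91)

# Hypothetical class indices; in practice these come from the detection output.
class_ids = np.zeros(MAX_DETECTIONS, dtype=np.int64)
per_detection_masks = select_class_masks(masks, class_ids)
```

With num_classes=2 the same helper reproduces the PeopleSegNet case. The size check is there so that a mismatch between num_classes and the trained model fails loudly instead of producing scrambled masks.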


I see, thank you.
