Interpreting output of MaskRCNN from TLT to TRT

Description

Hello! I’m trying to run inference on a TensorRT engine generated by following the TLT MaskRCNN tutorial here:
https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/instance_segmentation/mask_rcnn.html

I’m trying to interpret the output that TensorRT gives, but I can’t find any documentation on its layout. The script I’m using to run inference is attached below.

Currently the model has been trained for inputs of shape (256, 448), and I’m getting 2 outputs of dimensions (1,600) and (1,156800).

After some experimentation I was able to work out that the first 6 values of the (1,600) output are the bounding box, class and confidence of one object in the image. The rest are just zeros, with -1 in the class position. Does this mean that inference is limited to 100 objects per image (600 / 6 = 100)?
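A minimal sketch of how the (1,600) output could be unpacked, assuming each detection occupies 6 consecutive values in the order described above (4 box values, then class, then confidence) and that unused slots are marked with class -1. The function name `split_detections` is hypothetical, and the exact order of the 4 box coordinates is not confirmed by anything in this thread.

```python
import numpy as np

def split_detections(flat_output, values_per_detection=6):
    """Reshape a flat detection buffer and drop padded rows.

    Assumed row layout (not confirmed by the TLT docs quoted here):
    [box0, box1, box2, box3, class_id, confidence]
    """
    dets = np.asarray(flat_output, dtype=np.float32).reshape(-1, values_per_detection)
    boxes = dets[:, :4]
    class_ids = dets[:, 4].astype(np.int32)
    scores = dets[:, 5]
    valid = class_ids >= 0          # padded slots carry class -1
    return boxes[valid], class_ids[valid], scores[valid]

# Synthetic buffer that mimics the observed output: one real detection,
# 99 padded slots with zeros and class -1.
fake = np.zeros((1, 600), dtype=np.float32)
rows = fake.reshape(100, 6)         # view onto the same memory
rows[:, 4] = -1
rows[0] = [10, 20, 110, 220, 1, 0.9]

boxes, class_ids, scores = split_detections(fake)
print(boxes.shape, class_ids, scores)
```

With this layout, 100 is simply the number of detection slots in the buffer; rows beyond the real detections are padding.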

I really don’t know how the mask is laid out in the other output. Its size (156800) is divisible by 448, but not by 256, so it doesn’t look like a full-resolution mask.
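One factorization that fits the numbers: 156800 = 100 × 1568, and 1568 = 2 × 28 × 28. That would be consistent with one small per-class mask for each of the 100 detection slots. The sketch below tries that reshape. The 28×28 mask size and the resulting class count of 2 are assumptions that happen to match the arithmetic, not something stated in the post, and `reshape_masks` is a hypothetical helper.

```python
import numpy as np

MAX_DETECTIONS = 100   # from the detection output: 600 / 6
MASK_SIZE = 28         # assumption: side length of each low-resolution mask

def reshape_masks(flat_masks, max_detections=MAX_DETECTIONS, mask_size=MASK_SIZE):
    """Reshape a flat mask buffer to (detections, classes, H, W).

    Raises ValueError if the buffer size does not fit the assumed layout.
    """
    flat = np.asarray(flat_masks).reshape(-1)
    per_detection, rem = divmod(flat.size, max_detections)
    if rem:
        raise ValueError("buffer size is not a multiple of max_detections")
    num_classes, rem = divmod(per_detection, mask_size * mask_size)
    if rem:
        raise ValueError("per-detection size is not a multiple of mask_size**2")
    return flat.reshape(max_detections, num_classes, mask_size, mask_size)

masks = reshape_masks(np.zeros((1, 156800), dtype=np.float32))
print(masks.shape)  # (100, 2, 28, 28)
```

If this layout is right, the mask for detection `i` with class `c` would be `masks[i, c]`, which would then need resizing to that detection's bounding box. Whether the values are raw logits or probabilities would have to be checked against the model's output node.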

I’m on a Jetson Xavier NX running JetPack 4.5.1 with TensorRT 7.1.3.

Any help would be appreciated.
maskrcnn_infer.py (3.7 KB)

Hi @damiandeza,

We recommend posting your question on the TLT forum to get better help.
Please find the link here.

Thank you.

Very well, I shall remake the post there.