How to access segmentation mask of Maskrcnn model for post-processing

• Hardware Platform: Jetson Xavier NX
• DeepStream Version: 6.0
• JetPack Version: 4.6 (rev.3)
• TensorRT Version: 8.0
• Issue Type: question/ can’t find details in documentation

Hi, I am trying to extract the instance segmentation mask generated by a custom Mask R-CNN model so that I can pass it to a post-processing algorithm inside a DeepStream pipeline. I can access some details of the mask, such as its size, height, and width, but I cannot figure out how to retrieve the mask itself in a form I can work with. I would expect the mask to be some sort of 2D array of floats representing polygon coordinates, or a binary mask. How can I transform mask_params.data (of type float *) into something I can feed to a post-processor?

I have a callback function on the src pad of nvinfer to print out mask details, and I have attached my nvinfer config file:

config_infer_primary_mrcnn_tao.txt (2.4 KB)

for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {

        NvDsFrameMeta *frame_meta = (NvDsFrameMeta * )(l_frame->data);
        num_objects = frame_meta->num_obj_meta;
        g_print("There are %d objects detected in this frame\n", num_objects);
        counter = 0;

        for (l_obj = frame_meta->obj_meta_list; l_obj != NULL; l_obj = l_obj->next) {

            NvDsObjectMeta *object_meta = (NvDsObjectMeta *)(l_obj->data);
            NvOSD_MaskParams mask_meta = object_meta->mask_params;

            // size, height and width are unsigned integers in NvOSD_MaskParams
            guint mask_size = mask_meta.size;
            guint mask_height = mask_meta.height;
            guint mask_width = mask_meta.width;

            // how do I retrieve the mask from a (float *)?
            float *mask_data = mask_meta.data;

            counter++;
            g_print("Object: %d mask size:%u\n", counter, mask_size);
        }
    }

Thank you in advance for any help you can provide!

Sorry for the late reply.
If you want to post-process the mask data, you can refer to the post-processing code of any sample model, for example objectDetector_FasterRCNN under sources/.
The mask is a 2D array of floats that represents polygon coordinates.
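As a rough sketch of what such post-processing could look like, the snippet below turns the float buffer into a binary mask. It rests on assumptions that are not confirmed in this thread: that mask_params.data is a row-major buffer of height * width per-pixel float scores, that size is the buffer length in bytes, and that a pixel counts as foreground when its score is above threshold. If the buffer really holds polygon coordinates, it would have to be read differently. MaskView and mask_to_binary are hypothetical names; the struct only mirrors the NvOSD_MaskParams fields so the example compiles without the DeepStream headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the NvOSD_MaskParams fields used here. */
typedef struct {
    const float *data;    /* assumed: row-major, height * width scores */
    unsigned int size;    /* assumed: buffer size in bytes */
    float threshold;      /* score above which a pixel is foreground */
    unsigned int width;
    unsigned int height;
} MaskView;

/* Writes 1 into out[i] where the score exceeds the threshold, else 0.
 * out must hold width * height bytes.
 * Returns the number of foreground pixels. */
size_t mask_to_binary(const MaskView *m, uint8_t *out)
{
    assert(m != NULL && m->data != NULL && out != NULL);

    size_t n = (size_t)m->width * m->height;
    /* Guard against reading past the buffer if the layout assumption is wrong. */
    assert(m->size >= n * sizeof(float));

    size_t foreground = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (m->data[i] > m->threshold) ? 1 : 0;
        foreground += out[i];
    }
    return foreground;
}
```

In the pad probe, the view would be filled from object_meta->mask_params (data, size, threshold, width, height), and pixel (x, y) of the result is out[y * width + x]. The mask covers only the object's bounding box, so under the same assumptions it would still need to be scaled to the box dimensions before being compared against the full frame.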
