Save image sequence with Mask R-CNN inference masks in Python

Please provide complete information as applicable to your setup.

**• Hardware Platform (Jetson / GPU)** Jetson Xavier AGX
**• DeepStream Version** 5.1.0
**• JetPack Version (valid for Jetson only)** 4.5-b129
**• TensorRT Version** 7.1.3-1
**• Issue Type (questions, new requirements, bugs)** questions

I trained a Mask R-CNN model on my own data with TLT by following this link:
https://developer.nvidia.com/blog/training-instance-segmentation-models-using-maskrcnn-on-the-transfer-learning-toolkit/

The model has been deployed successfully with deepstream-app by following the examples deepstream_app_source1_mrcnn.txt and config_infer_primary_mrcnn.txt.

My requirement is to get an output video or image sequence with the inference masks drawn on it, but without the bounding boxes and other information. I also need to change the color and opacity of the masks to something like opaque white. So I followed the deepstream-ssd-parser example in NVIDIA-AI-IOT/deepstream_python_apps on GitHub to register a pgie src pad probe function that gets the tensor meta and converts it to numpy. But I don’t know how to save the masks as images in the form I need. Would you please give me some support?

Here’s my pgie_src_pad_buffer_probe:

import ctypes

import numpy as np
import pyds
from gi.repository import Gst


def pgie_src_pad_buffer_probe(pad, info, u_data):

    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer")
        # Without a buffer there is nothing to do; keep the pipeline running.
        return Gst.PadProbeReturn.OK

    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list

    while l_frame is not None:
        try:
            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta
            # The casting also keeps ownership of the underlying memory
            # in the C code, so the Python garbage collector will leave
            # it alone.
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        l_user = frame_meta.frame_user_meta_list
        frame_number = frame_meta.frame_num

        while l_user is not None:
            try:
                # Note that l_user.data needs a cast to pyds.NvDsUserMeta
                # The casting also keeps ownership of the underlying memory
                # in the C code, so the Python garbage collector will leave
                # it alone.
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break

            if (
                    user_meta.base_meta.meta_type
                    != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META
            ):
                # Advance the iterator before skipping, otherwise this
                # loop would spin forever on the same user meta item.
                try:
                    l_user = l_user.next
                except StopIteration:
                    break
                continue

            tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)

            # INFO: [Implicit Engine Info]: layers num: 3
            #   0 INPUT  kFLOAT Input                              3x832x1344
            #   1 OUTPUT kFLOAT generate_detections                100x6
            #   2 OUTPUT kFLOAT mask_head/mask_fcn_logits/BiasAdd  100x14x28x28
            # pyds.get_nvds_LayerInfo() indexes only the output layers, so
            # index 0 is generate_detections and index 1 is the mask logits.

            frame_outputs = []
            for i in range(tensor_meta.num_output_layers):
                layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                # Convert NvDsInferLayerInfo buffer to numpy array
                ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
                v = np.array(np.ctypeslib.as_array(ptr, shape=(layer.dims.numElements,)), copy=True)
                #print(v)
                frame_outputs.append(v)

            try:
                l_user = l_user.next
            except StopIteration:
                break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK
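
For reference, here is the kind of post-processing I am attempting on the copied tensors. It is only a rough sketch: it assumes generate_detections is laid out as [y1, x1, y2, x2, class, score] in network-resolution pixels and that the mask logits are per class, neither of which I have verified against the TLT parser, and save_white_masks with its 0.5 thresholds is my own name and choice.

import cv2  # used only for resizing and writing PNGs

NET_H, NET_W = 832, 1344  # network input size, from the Input layer above

def save_white_masks(frame_outputs, frame_number, score_thresh=0.5):
    # frame_outputs[0]: generate_detections, 100x6, assumed to be
    # [y1, x1, y2, x2, class, score] in network-resolution pixels.
    # frame_outputs[1]: mask logits, 100x14x28x28.
    dets = frame_outputs[0].reshape(100, 6)
    masks = frame_outputs[1].reshape(100, 14, 28, 28)

    # Composite every confident instance as opaque white on black.
    canvas = np.zeros((NET_H, NET_W), dtype=np.uint8)
    for det, mask_logits in zip(dets, masks):
        y1, x1, y2, x2, cls, score = det
        if score < score_thresh:
            continue
        x1, x2 = np.clip([x1, x2], 0, NET_W)
        y1, y2 = np.clip([y1, y2], 0, NET_H)
        w, h = int(x2 - x1), int(y2 - y1)
        if w <= 0 or h <= 0:
            continue
        # Sigmoid over the detected class' 28x28 logits, resized to the
        # box and binarized at 0.5.
        logits = mask_logits[int(cls)].astype(np.float32)
        prob = 1.0 / (1.0 + np.exp(-logits))
        binary = (cv2.resize(prob, (w, h)) > 0.5).astype(np.uint8) * 255
        y0, x0 = int(y1), int(x1)
        region = canvas[y0:y0 + h, x0:x0 + w]
        np.maximum(region, binary, out=region)

    cv2.imwrite("mask_%06d.png" % frame_number, canvas)

The idea would be to call save_white_masks(frame_outputs, frame_number) right after the layer-copy loop above, producing one white-on-black PNG per frame in network resolution.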

If you want to draw the mask on the original video, you also need to get the video frame data. Python/numpy is not a good way to do that job; the performance is too poor.

Please refer to the C/C++ sample /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-mrcnn-test

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_C_Sample_Apps.html
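
For completeness, if the job must stay in Python despite the performance caveat above, the frame pixels are reachable with pyds.get_nvds_buf_surface(), which requires the buffer to be in RGBA format (e.g. an nvvideoconvert plus a capsfilter forcing video/x-raw(memory:NVMM), format=RGBA upstream of the probe). A rough sketch, where binary_mask is the hypothetical full-frame mask from the earlier sketch, already scaled to the frame resolution:

import cv2
import numpy as np
import pyds

def save_frame_with_mask(gst_buffer, frame_meta, binary_mask):
    # Fails unless the stream was converted to RGBA upstream of the probe.
    n_frame = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
    frame = np.array(n_frame, copy=True, order="C")  # H x W x 4, RGBA

    # Blend the masked pixels 50/50 toward white.
    rgb = frame[..., :3].astype(np.float32)
    sel = binary_mask > 0
    rgb[sel] = 0.5 * rgb[sel] + 0.5 * 255.0
    frame[..., :3] = rgb.astype(np.uint8)

    cv2.imwrite("frame_%06d.png" % frame_meta.frame_num,
                cv2.cvtColor(frame, cv2.COLOR_RGBA2BGRA))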

Hi,
I removed the msgconv and msgbroker from the pipeline in deepstream_mrcnn_test.cpp and did a trial run, but I got the error below. It seems the model file is missing. Where can I get it?

Now playing: ../../../../samples/streams/sample_qHD.mp4

Using winsys: x11 
Opening in BLOCKING MODE
Opening in BLOCKING MODE 
ERROR: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-mrcnn-test/../../../../samples/models/tlt_pretrained_models/mrcnn/mask_rcnn_resnet50.etlt_b1_gpu0_int8.engine open error
0:00:01.404554848 23406   0x558b3bfc40 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1691> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-mrcnn-test/../../../../samples/models/tlt_pretrained_models/mrcnn/mask_rcnn_resnet50.etlt_b1_gpu0_int8.engine failed
0:00:01.404810688 23406   0x558b3bfc40 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1798> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-mrcnn-test/../../../../samples/models/tlt_pretrained_models/mrcnn/mask_rcnn_resnet50.etlt_b1_gpu0_int8.engine failed, try rebuild
0:00:01.404882496 23406   0x558b3bfc40 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1716> [UID = 1]: Trying to create engine from model files
parseModel: Failed to open TLT encoded model file /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-mrcnn-test/../../../../samples/models/tlt_pretrained_models/mrcnn/mask_rcnn_resnet50.etlt
ERROR: failed to build network since parsing model errors.
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:01.406038112 23406   0x558b3bfc40 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1736> [UID = 1]: build engine file failed

There is a README file in the folder. If you have read the README file or the source code, you should know that this sample only accepts a raw H264 elementary stream as input. It is open source; the documentation and the source code will help you.

Please make sure you are familiar with the basic knowledge and coding skills of GStreamer before you start with DeepStream: GStreamer: open source multimedia framework

OK, I’ll have another try. But is it OK for me to remove msgconv and msgbroker if I don’t need Azure IoT Hub (MQTT), Kafka, or an AMQP broker (RabbitMQ)?

If your modification is correct, it can work.

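For reference, a sketch of what the reduced pipeline could look like in Python with Gst.parse_launch, with the tee/msgconv/msgbroker branch dropped. The element chain is read from deepstream_mrcnn_test.cpp; the stream path, resolution, and config file name are placeholders for your setup:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# H264 elementary stream in, EGL display out (Jetson).
pipeline = Gst.parse_launch(
    "filesrc location=sample_720p.h264 ! h264parse ! nvv4l2decoder "
    "! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 "
    "! nvinfer config-file-path=dsmrcnn_pgie_config.txt "
    "! nvvideoconvert ! nvdsosd "
    "! nvegltransform ! nveglglessink"
)
pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)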

The error above was indeed caused by the missing model. I got the models with:
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip

The problem now is that there are only bounding boxes and text on the video, but no masks.


Hi, I’ve looked into deepstream_mrcnn_test.cpp. It looks like the vehicle/person mask polygons are generated into the user meta data and added to the frame meta in the OSD sink pad probe. output-instance-mask is set to 1 in the config. Why are there no masks drawn on the frame?

I’ve also added the following [osd] config to dsmrcnn_pgie_config.txt, but it still shows no masks. I need some help. @Fiona.Chen

[osd]
enable=1
gpu-id=0
border-width=3
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
display-mask=1
display-bbox=0
display-text=0

Can anyone help me with this issue?

Please set the nvdsosd plugin property “display-mask” to 1 if you want to display masks.
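
One thing worth checking: an [osd] group belongs to deepstream-app configuration files, and as far as I know the standalone deepstream-mrcnn-test does not parse it from dsmrcnn_pgie_config.txt (that file is the nvinfer config), so the property has to be set on the nvdsosd element itself, e.g. with g_object_set in the C sample. Below is a sketch of the Python equivalent. Note also that, per the Gst-nvdsosd documentation, mask drawing is only supported in CPU process mode, and the exact property set should be verified with gst-inspect-1.0 nvdsosd on your DeepStream version:

from gi.repository import Gst

# Sketch: configure mask drawing directly on the nvdsosd element.
nvosd = Gst.ElementFactory.make("nvdsosd", "nv-onscreendisplay")
nvosd.set_property("display-mask", True)   # draw instance masks
nvosd.set_property("display-bbox", False)  # hide boxes, as required earlier
nvosd.set_property("display-text", False)  # hide labels
nvosd.set_property("process-mode", 0)      # 0 = CPU; mask drawing needs CPU mode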

I’ve already set display-mask to 1…