Issue with parallel branches in DeepStream pipeline

• Hardware Platform: Jetson Orin NX 16 GB
• DeepStream Version: 7.0
• JetPack Version: 6.0
• TensorRT Version: 8.6.2.3
• Issue Type: question

Hello,

I need to create a DeepStream parallel pipeline with 4 branches, similar to the one shown here:
deepstream_parallel_inference_app.

I am using Python with the DeepStream PyBindings. My input is a single tiled video file (1920x2160) from which I extract 2 video streams (using nvvidconv for cropping) and then merge them via an nvstreammux with batch size = 2, in order to emulate multiple sources coming from different cameras.
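
A rough sketch of this front end (the file location, caps, crop rectangles and mux properties are illustrative rather than my exact code, and I show nvvideoconvert's src-crop here in place of nvvidconv; the fakesink stands in for the downstream branches):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Decode once, tee the full 1920x2160 frame, crop the top and bottom halves,
# and batch the two resulting 1920x1080 streams with nvstreammux.
pipeline = Gst.parse_launch(
    'filesrc location=input.h265 ! h265parse ! nvv4l2decoder ! tee name=split '
    'nvstreammux name=mux batch-size=2 width=1920 height=1080 ! fakesink '
    'split. ! queue ! nvvideoconvert src-crop="0:0:1920:1080" ! '
    'video/x-raw(memory:NVMM),width=1920,height=1080 ! mux.sink_0 '
    'split. ! queue ! nvvideoconvert src-crop="0:1080:1920:1080" ! '
    'video/x-raw(memory:NVMM),width=1920,height=1080 ! mux.sink_1'
)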

I tried to replicate the diagram shown in the example, but I encountered some issues.

Specifically, when the 3 branches that do not perform inference each consist of a queue connected directly to a fakesink (as in the configuration below), the pipeline runs correctly: the first branch performs inference as expected.

However, when I enable even a single additional parallel branch by adding nvstreamdemux -> nvstreammux -> fakesink (as in the configuration below), the entire pipeline stalls.

I’m not sure what the issue is. I suspect that the tee is not creating 4 separate copies of the batch of 2 frames, but is simply forwarding it, so when one branch consumes the buffer, the others are blocked, causing the pipeline to stall. Is this correct?

How can I resolve this issue? What is the correct structure to use in order to have 4 models running in parallel in Python?

Thank you in advance for your guidance.

The actual graph of the parallel sample is:

We will update the graph soon.


Hi Fiona,

Thanks a lot for your reply! After replicating the structure you suggested, everything is finally working as expected.

I still have two questions though:

  1. I don’t quite understand why the version with multiple nvstreamdemux elements doesn’t work, while the one with a single nvstreamdemux does. I think it would be useful to clarify this in order to avoid similar mistakes in the future.
  2. Why is one of the branches from the tee (right after the nvstreammux that connects all the sources) also fed into the metamux?

Originally, nvstreammux and nvstreamdemux were not designed to support cascaded scenarios. For the parallel case, we have improved them to support it. Since the plugins are proprietary, we can't discuss the details.

The direct branch is used as the reference branch; the metadata from the other branches will be merged into the reference branch's metadata.


Ok, thanks again.

I have one last question, still related to parallel execution and the use of metamux.
Basically, I need to synchronize the output of my 4 models, since I have to use the segmentation masks from all of them to perform calculations that depend on all four together.

If I add a probe function downstream of the metamux, is it guaranteed that it will always receive (via metadata) the 4 segmentation masks at once? Or do I need to introduce an additional synchronization mechanism — and if so, which one?
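
For reference, I attach that probe roughly like this (a minimal sketch; "metamux" is the assumed element name and debug_probe is a callback defined elsewhere):

from gi.repository import Gst

# The probe fires once per buffer leaving the metamux; "metamux" is the
# assumed name given to the nvdsmetamux element when the pipeline was built.
metamux = pipeline.get_by_name("metamux")
metamux.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, debug_probe, 0)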

In addition, I noticed that segmentation metadata (pyds.NvDsInferSegmentationMeta) does not provide a unique_id attribute. This makes it difficult to identify which model generated a given mask, and leads to an error like:

AttributeError: 'pyds.NvDsInferSegmentationMeta' object has no attribute 'unique_id'

What is the recommended way to associate segmentation metadata with the corresponding model (e.g. via unique_id or another mechanism)?

Do you mean your models are all segmentation models? Are you sure they are segmentation models and not instance segmentation models?

Yes, all my models are semantic segmentation models, and in the configuration file I set:

# Type of network (0=Detector, 1=Classifier, 2=Segmentation, 3=Instance Segmentation)
network-type=2

The semantic segmentation model's output mask covers the whole frame; it is attached as NvDsInferSegmentationMeta in the user meta of the NvDsFrameMeta. Since you have four segmentation models, please assign a different gie-unique-id to each model in its nvinfer configuration file. After metamux, you can enumerate all NvDsInferSegmentationMeta entries from the frame_user_meta_list in NvDsFrameMeta.
The unique_id in NvDsInferSegmentationMeta will tell you which model generated that NvDsInferSegmentationMeta.
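
For example (the file names and id values below are illustrative only), each model's nvinfer configuration file carries its own id:

# first_model_config.txt (illustrative)
[property]
network-type=2
gie-unique-id=1

# second_model_config.txt (illustrative)
[property]
network-type=2
gie-unique-id=2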

I’m already doing the following: I’ve assigned different gie-unique-ids and I’m using the probe function below to analyze semantic segmentation metadata:

import logging
from datetime import datetime

import pyds
from gi.repository import Gst


def debug1_pad_buffer_probe(pad: Gst.Pad,
                            info: Gst.PadProbeInfo,
                            u_data: int) -> Gst.PadProbeReturn:
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        logging.error("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list

    # Walk every frame in the batch.
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        # Walk the frame-level user metadata looking for segmentation results.
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            try:
                seg_user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break

            if seg_user_meta and seg_user_meta.base_meta.meta_type == pyds.NVDSINFER_SEGMENTATION_META:
                try:
                    segmeta = pyds.NvDsInferSegmentationMeta.cast(seg_user_meta.user_meta_data)
                except StopIteration:
                    break

                dt = datetime.fromtimestamp(frame_meta.ntp_timestamp / 1e9).strftime("%Y-%m-%d %H:%M:%S.%f")
                logging.info(f"[FAKESINK] Stream: {frame_meta.pad_index}, Frame: {frame_meta.frame_num}, "
                             f"Timestamp: {dt}, PTS: {frame_meta.buf_pts}, SegmentationMeta: {segmeta.classes}")

                # Accessing unique_id here is what raises the AttributeError below.
                print(segmeta.unique_id)

            try:
                l_user = l_user.next
            except StopIteration:
                break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK

However, I get the following error:

Traceback (most recent call last):
  File "/opt/deepstream-app/app/main.py", line 227, in debug1_pad_buffer_probe
    print(segmeta.unique_id)
AttributeError: 'pyds.NvDsInferSegmentationMeta' object has no attribute 'unique_id'

Additionally, I’m not sure how metamux works. If I attach a probe function to the source pad of the metamux, can I be certain that all 4 segmentation masks are available at that point? Some models run faster than others, so I’m concerned about synchronization.

This is a bug in pyds. Please add .def_readonly("unique_id", &NvDsInferSegmentationMeta::unique_id) in deepstream_python_apps/bindings/src/bindnvdsinfer.cpp at v1.1.11 · NVIDIA-AI-IOT/deepstream_python_apps, then rebuild and reinstall pyds.


Thanks, I’ll give it a try and let you know.

Regarding inference synchronization, could you clarify whether I need to implement a separate mechanism (and if so, how?), or whether the src pad of the metamux already provides all four inference results from the different branches in sync? As I mentioned, some segmentation models are faster than others.

If the segmentation mask is for the frame content, why do you think the mask will not be synchronized with the frame? I’m confused. What do you want to synchronize?

I want to synchronize the segmentation masks from four models. My input file is a 2160x1920 .h265 video, which I split into two streams, 1080x1920 each (called A and B). Stream A is fed into two segmentation models, X and Y, while stream B is fed into two other segmentation models, W and Z. These models have different inference speeds, and all branches are eventually merged at a metamux.

I would like to understand: if I attach a probe function to the src pad of the metamux, is it guaranteed that all inferences will be ready when the function is invoked? Or is it possible that some inferences are ready while others are not yet complete, meaning I cannot perform the calculations I need because the segmentation mask from the slowest model is still missing?

Yes, the inference result is attached to the frame as metadata.

No. A downstream element only receives a GstBuffer after the upstream elements have pushed it, and metamux is the downstream element in this pipeline.
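
In other words, by the time a buffer reaches the metamux src pad, the metadata from all the branches has already been merged into it, so a probe there can group the masks by unique_id. A minimal sketch (it assumes the pyds unique_id fix above and an illustrative unique_id-to-model mapping):

import pyds
from gi.repository import Gst

# Illustrative mapping from the gie-unique-id set in each nvinfer config file
# to a model label.
MODEL_BY_UNIQUE_ID = {1: "X", 2: "Y", 3: "W", 4: "Z"}

def metamux_src_probe(pad: Gst.Pad, info: Gst.PadProbeInfo, u_data) -> Gst.PadProbeReturn:
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)

        # Collect every segmentation mask attached to this frame, keyed by model.
        masks = {}
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            if user_meta.base_meta.meta_type == pyds.NVDSINFER_SEGMENTATION_META:
                seg = pyds.NvDsInferSegmentationMeta.cast(user_meta.user_meta_data)
                masks[MODEL_BY_UNIQUE_ID.get(seg.unique_id, seg.unique_id)] = seg
            try:
                l_user = l_user.next
            except StopIteration:
                break

        # With two models per stream, each frame carries the two masks of the
        # models that ran on that stream; pair frames across pad_index values
        # to combine all four.
        print(frame_meta.pad_index, frame_meta.frame_num, list(masks))

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK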

