Metadata Mismatch in Parallel Inference Pipeline with Frame Probe (DeepStream 7.1)

Environment

  • DeepStream Version: 7.1
  • OS: Ubuntu 22.04
  • GPU: NVIDIA RTX 3090 (24 GB)
  • Driver Version: 535.230.02
  • CUDA Version: 12.2
  • CPU: Intel i9 14th Gen
  • RAM: 128 GB
  • Python: 3.10.12 (with pyds bindings)

Problem Description

I’m implementing a parallel inference pipeline using DeepStream 7.1, inspired by the build_parallel_pipeline approach, where multiple streams are processed with different PGIE models assigned per stream. The pipeline uses nvstreammux, nvstreamdemux, per-stream inference chains (with nvinfer and nvtracker), and a second nvstreammux for output remultiplexing. For visualization in development mode, I use nvmultistreamtiler to display all streams, and the metadata (bounding boxes, etc.) is correctly rendered on the tiled output.

However, when I add a frame probe on the nvdsosd source pad to extract individual source frames and metadata (using the _frame_probe function below), I encounter a metadata mismatch issue. Specifically, bounding boxes detected for one stream (e.g., stream 2) are incorrectly displayed on another stream (e.g., stream 1). This issue does not occur in the tiled output, only when extracting frames via the probe.

Pipeline Structure

The parallel pipeline is constructed as follows:

  • Input Multiplexer: nvstreammux (input_mux) receives multiple input streams (e.g., from file sources).
  • Stream Demultiplexer: nvstreamdemux splits the batched stream into individual streams based on source_id.
  • Per-Stream Inference Chains:
    • Each stream has a queue to handle buffering.
    • One or more nvinfer (PGIE) elements per stream, configured with different models (e.g., PeopleNet, TrafficCamNet) and unique unique-id properties.
    • An nvtracker per stream for object tracking.
  • Output Multiplexer: A second nvstreammux (output_mux) recombines the processed streams.
  • Post-Processing:
    • nvvideoconvert (conv_post) for format conversion.
    • In development mode, nvmultistreamtiler for tiled display (rows and columns based on number of sources).
    • capsfilter to enforce RGBA format.
    • nvdsosd (osd_parallel) for rendering bounding boxes and text.
    • Display sink: nveglglessink (or nv3dsink on Jetson) in dev mode, or fakesink otherwise.
  • Probe Location: The frame probe is attached to the src pad of nvdsosd to extract individual frames and metadata.

The pipeline structure can be summarized as:

[source(s)] -> nvstreammux -> nvstreamdemux -> [queue -> nvinfer -> nvtracker] -> nvstreammux -> [nvvideoconvert -> capsfilter -> nvdsosd -> sink]

Relevant Probe Code

Here’s the simplified _frame_probe function used to extract frames and metadata:

# Assumed imports for this snippet:
import cupy as cp
import cv2
import pyds
from gi.repository import Gst


def _frame_probe(self, pad, info, u_data):
    gst_buf = info.get_buffer()
    if gst_buf is None:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buf))
    if batch_meta is None:
        return Gst.PadProbeReturn.OK

    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            fmeta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        sid = fmeta.source_id
        bid = fmeta.batch_id

        # Extract the GPU surface for this frame; the surface is indexed
        # by batch_id, while results are keyed by source_id below.
        dtype, shape, strides, dataptr, size = pyds.get_nvds_buf_surface_gpu(
            hash(gst_buf), bid
        )
        unowned = cp.cuda.UnownedMemory(dataptr, size, owner=gst_buf)
        frame_gpu = cp.ndarray(
            shape,
            dtype=dtype,
            memptr=cp.cuda.MemoryPointer(unowned, 0),
            strides=strides,
            order='C',
        )
        frame_cpu = cp.asnumpy(frame_gpu)
        frame_bgr = cv2.cvtColor(frame_cpu, cv2.COLOR_RGBA2BGR)

        # Store frame per source ID
        self.result_frames[sid] = frame_bgr

        if self.frame_cb:
            meta = {"frame_num": fmeta.frame_num, "num_objects": fmeta.num_obj_meta}
            self.frame_cb(sid, frame_bgr, meta)

        # pyds lists raise StopIteration past the last node
        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK

This probe is attached to the src pad of the nvdsosd element after the output nvstreammux. The issue is that the metadata (e.g., bounding boxes) for a given source_id does not align with the corresponding frame, leading to incorrect overlays on the extracted frames.
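Before changing the pipeline, one cheap sanity check is to log the (source_id, batch_id, frame_num) triples observed in each buffer and verify that every source_id appears at most once per batch, with batch_id used only to index the surface. A minimal stand-in for that check (hypothetical helper, pure Python, no pyds required; the triples would be collected inside the frame-meta loop above) might look like:

```python
def group_frames_by_source(frames):
    """Group (source_id, batch_id, frame_num) triples from one batch.

    Returns {source_id: (batch_id, frame_num)}. Raises if a source_id
    shows up twice in the same batch, which would point at a muxing
    problem rather than a probe bug.
    """
    by_source = {}
    for sid, bid, fnum in frames:
        if sid in by_source:
            raise ValueError(f"duplicate source_id {sid} in one batch")
        by_source[sid] = (bid, fnum)
    return by_source


# Example: a two-stream batch where batch_id happens to match source_id.
batch = [(0, 0, 17), (1, 1, 17)]
print(group_frames_by_source(batch))  # {0: (0, 17), 1: (1, 17)}
```

If this ever raises for a real batch, the mismatch originates upstream of the probe (in the output mux), not in the extraction code.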

Previous Success with Single PGIE

For a single PGIE with multiple streams, I resolved a similar issue using an osd_sink_probe (below) to filter metadata based on source_id and class_id:

def osd_sink_probe(self, pad, info, udata):
    buf = info.get_buffer()
    if not buf:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(buf))
    for l_frame in self.walk_nvlist(batch_meta.frame_meta_list):
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        if isinstance(self.allowed_sources, list):
            if frame_meta.source_id not in self.allowed_sources:
                l_obj = frame_meta.obj_meta_list
                while l_obj is not None:
                    obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
                    # Capture the next node before removing the current one;
                    # pyds lists raise StopIteration at the tail.
                    try:
                        next_obj = l_obj.next
                    except StopIteration:
                        next_obj = None
                    pyds.nvds_remove_obj_meta_from_frame(frame_meta, obj_meta)
                    l_obj = next_obj
                continue
        # Additional class-based filtering logic here
    return Gst.PadProbeReturn.OK

Exploration of metamux Plugin

I explored the metamux plugin from the deepstream_reference_apps parallel inference example (deepstream-parallel-infer-app). However, I couldn’t get it to work correctly, and I noticed that NVIDIA’s documentation lacks Python-specific examples for metamux. The C-based example in the reference apps is complex, and I’m unsure how to adapt it to my Python pipeline. Has anyone successfully used metamux in a Python-based DeepStream pipeline to handle metadata in parallel inference scenarios?

Questions

  1. Metadata Mismatch: Why does the metadata get mismatched in the _frame_probe when extracting individual frames in a parallel pipeline? Is there a known issue with nvstreammux or nvdsosd in this context, or am I mishandling the batch_id/source_id in the probe?
  2. Fixing the Probe: How can I modify the _frame_probe to ensure metadata aligns correctly with the corresponding stream’s frame? Should I add additional filtering or use a different probe location?
  3. Metamux Guidance: Are there any Python-based examples or detailed documentation for using the metamux plugin in a parallel inference pipeline? If not, what’s the recommended approach to handle metadata multiplexing in Python?

Additional Notes

  • My pipeline is based on the _build_parallel_pipeline function, which creates per-stream inference chains with nvinfer and nvtracker elements, linked via nvstreammux and nvstreamdemux.
  • The tiled output (nvmultistreamtiler) works perfectly, showing correct metadata per stream.
  • I suspect the issue lies in how the output nvstreammux handles batch metadata, but I’m unsure how to debug or fix it.

Any insights or suggestions from the community would be greatly appreciated! Thank you.

You mentioned “build_parallel_pipeline” twice; what is it?

If you want to run inference on different streams with different models, you don’t need to put them in one pipeline. Please tell us your complete scenario.

Python and C++ are just different programming languages; you can use either one to construct the same pipeline, as in NVIDIA-AI-IOT/deepstream_python_apps: DeepStream SDK Python bindings and sample applications.

What is the purpose of constructing the pipeline in this way?

Hi Fiona,

Thank you for your response and for asking for clarification.

Clarification on build_parallel_pipeline

You mentioned that I referenced _build_parallel_pipeline twice and asked what it is. I apologize for any confusion. _build_parallel_pipeline is a custom function in my Python-based DeepStream pipeline that constructs a parallel inference pipeline using nvstreammux, nvstreamdemux, per-stream inference chains (nvinfer and nvtracker), and nvdsmetamux for metadata multiplexing. I mentioned it to indicate that I’m working with a parallel inference setup, but it’s not directly related to the core issue—it’s just the function where my pipeline is built. The code I shared in my original post is a simplified version of this function.

Complete Scenario

Let me explain my scenario in detail:

I have multiple camera sources (e.g., one file-based and one RTSP stream) and multiple inference models (e.g., PeopleNet and TrafficCamNet). My goal is to process these streams within a single DeepStream pipeline, where each stream can be associated with one or more models. For example, I want to run both PeopleNet and TrafficCamNet on one stream while running only TrafficCamNet on another. Here’s how I configure the stream-to-model mapping in my _configure_parallel_streams function:

def _configure_parallel_streams(self, sources: dict, default_pgie: str) -> dict:
    """Configure PGIE assignments per stream for parallel mode."""
    stream_pgie_map = {}
    
    for stream_id in sources.keys():
        if stream_id == 0:
            stream_pgie_map[stream_id] = ['pgie1', 'pgie2']  # Stream 0: PeopleNet + TrafficCamNet
        elif stream_id == 1:
            stream_pgie_map[stream_id] = ['pgie2']  # Stream 1: TrafficCamNet only
        elif stream_id == 2:
            stream_pgie_map[stream_id] = []  # Stream 2: No inference
        else:
            stream_pgie_map[stream_id] = ['pgie1']  # Default: PeopleNet
    return stream_pgie_map
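As a quick illustration of what this mapping produces, here is the same logic as a standalone function (a hypothetical free-function form of the method) applied to three sources:

```python
def configure_parallel_streams(sources: dict) -> dict:
    """Standalone copy of the mapping logic above, for illustration."""
    stream_pgie_map = {}
    for stream_id in sources:
        if stream_id == 0:
            stream_pgie_map[stream_id] = ['pgie1', 'pgie2']  # PeopleNet + TrafficCamNet
        elif stream_id == 1:
            stream_pgie_map[stream_id] = ['pgie2']  # TrafficCamNet only
        elif stream_id == 2:
            stream_pgie_map[stream_id] = []  # No inference
        else:
            stream_pgie_map[stream_id] = ['pgie1']  # Default: PeopleNet
    return stream_pgie_map


sources = {0: "file:///video0.mp4", 1: "rtsp://camera1", 2: "file:///video2.mp4"}
print(configure_parallel_streams(sources))
# {0: ['pgie1', 'pgie2'], 1: ['pgie2'], 2: []}
```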

Goals

  • Frame Extraction and Saving: I want to extract frames with bounding boxes drawn by nvdsosd for each stream using a probe on the nvdsosd sink pad. These frames should be saved to disk with source_id differentiation (e.g., frame_stream_0_xxx.jpg, frame_stream_1_xxx.jpg).
  • Correct Metadata Handling: The metadata (e.g., bounding boxes) from each stream’s inference model should be correctly associated with the respective stream without any mismatch or overlap.
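For the naming scheme in the first goal, a small path helper can keep the per-stream files apart (the directory layout and zero-padding here are my assumptions); in the probe one would then call cv2.imwrite(str(path), frame_bgr):

```python
from pathlib import Path


def frame_filename(out_dir: str, source_id: int, frame_num: int) -> Path:
    """Build a per-stream output path like frame_stream_0_000042.jpg."""
    return Path(out_dir) / f"frame_stream_{source_id}_{frame_num:06d}.jpg"


print(frame_filename("/tmp/frames", 0, 42))
# /tmp/frames/frame_stream_0_000042.jpg
```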

Current Behavior

  • With Tiler: When I add nvmultistreamtiler after nvdsmetamux, I can see both streams displayed correctly in a tiled view. The bounding boxes from PeopleNet (stream 0) and TrafficCamNet (stream 1) are drawn as expected.
  • Without Tiler: When I remove nvmultistreamtiler, I face metadata issues:
    • Metadata from stream 1 (TrafficCamNet) appears on stream 0’s frames, or vice versa.
    • Only one stream’s frames are saved, depending on the active-pad setting in the nvdsmetamux config.
  • Single Model Case: When using a single model across all streams (without nvdsmetamux), I can process metadata correctly in the probe function. However, with multiple models, the metadata gets jumbled, leading to incorrect bounding boxes.

Specific Questions

  1. How can I configure nvdsmetamux to multiplex metadata from both streams so that frames saved via the nvdsosd probe have correct bounding boxes for each source_id?
  2. Without nvdsmetamux, can I filter metadata by source_id in the probe function to prevent mismatches?
  3. How should I integrate nvmultistreamtiler to display streams while saving individual frames correctly?
  4. Are there any Python-based nvdsmetamux examples I might have missed?

I can share:

  • The full pipeline DOT file if needed.
  • Detailed logs or probe debug output.

Thank you for your help! I look forward to your suggestions.

Best regards, Mert.

According to your scenario, the deepstream_reference_apps/deepstream_parallel_inference_app at master · NVIDIA-AI-IOT/deepstream_reference_apps pipeline can serve as a reference. The graph in the repo is not clear; please refer to the graph I posted here.

Even though the sample is implemented in C++, the pipeline can still be used as a reference, and you also need to refer to the configuration details in the sample. The only change you need is to use “nvstreamdemux” to replace “nvmultistreamtiler” after “nvdsmetamux”.

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks

Check out this (working/tested) solution:


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.