Environment
- DeepStream Version: 7.1
- OS: Ubuntu 22.04
- GPU: NVIDIA RTX 3090 (24 GB)
- Driver Version: 535.230.02
- CUDA Version: 12.2
- CPU: Intel i9 14th Gen
- RAM: 128 GB
- Python: 3.10.12 (with pyds bindings)
Problem Description
I’m implementing a parallel inference pipeline using DeepStream 7.1, inspired by the build_parallel_pipeline approach, where multiple streams are processed with different PGIE models assigned per stream. The pipeline uses nvstreammux, nvstreamdemux, per-stream inference chains (with nvinfer and nvtracker), and a second nvstreammux for output remultiplexing. For visualization in development mode, I use nvmultistreamtiler to display all streams, and the metadata (bounding boxes, etc.) is correctly rendered on the tiled output.
However, when I add a frame probe on the nvdsosd source pad to extract individual source frames and metadata (using the _frame_probe function below), I encounter a metadata mismatch issue. Specifically, bounding boxes detected for one stream (e.g., stream 2) are incorrectly displayed on another stream (e.g., stream 1). This issue does not occur in the tiled output, only when extracting frames via the probe.
Pipeline Structure
The parallel pipeline is constructed as follows:
- Input Multiplexer: nvstreammux (input_mux) receives multiple input streams (e.g., from file sources).
- Stream Demultiplexer: nvstreamdemux splits the batched stream into individual streams based on source_id.
- Per-Stream Inference Chains:
- Each stream has a queue to handle buffering.
- One or more nvinfer (PGIE) elements per stream, configured with different models (e.g., PeopleNet, TrafficCamNet), each with a distinct unique-id property.
- An nvtracker per stream for object tracking.
- Output Multiplexer: A second nvstreammux (output_mux) recombines the processed streams.
- Post-Processing:
- nvvideoconvert (conv_post) for format conversion.
- In development mode, nvmultistreamtiler for tiled display (rows and columns based on number of sources).
- capsfilter to enforce RGBA format.
- nvdsosd (osd_parallel) for rendering bounding boxes and text.
- Display sink: nveglglessink (or nv3dsink on Jetson) in dev mode, or fakesink otherwise.
- Probe Location: The frame probe is attached to the src pad of nvdsosd to extract individual frames and metadata.
The pipeline structure can be summarized as:
[source(s)] -> nvstreammux -> nvstreamdemux -> [queue -> nvinfer -> nvtracker] -> nvstreammux -> [nvvideoconvert -> capsfilter -> nvdsosd -> sink]
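As a side note, the tiler grid in development mode is derived from the number of sources. A minimal helper for that computation (my own sketch, not part of the pipeline code) might look like:

```python
import math

def tiler_grid(num_sources):
    """Compute a near-square (rows, columns) grid for nvmultistreamtiler
    from the number of input sources."""
    rows = int(math.ceil(math.sqrt(num_sources)))
    cols = int(math.ceil(num_sources / rows))
    return rows, cols
```

The resulting values are applied to the tiler's `rows` and `columns` properties.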
Relevant Probe Code
Here’s the simplified _frame_probe function used to extract frames and metadata:
def _frame_probe(self, pad, info, u_data):
    gst_buf = info.get_buffer()
    if gst_buf is None:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buf))
    if batch_meta is None:
        return Gst.PadProbeReturn.OK
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            fmeta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        sid = fmeta.source_id
        bid = fmeta.batch_id
        # Extract the GPU surface for this frame, indexed by batch_id
        dtype, shape, strides, dataptr, size = \
            pyds.get_nvds_buf_surface_gpu(hash(gst_buf), bid)
        unowned = cp.cuda.UnownedMemory(dataptr, size, owner=gst_buf)
        frame_gpu = cp.ndarray(shape, dtype=dtype,
                               memptr=cp.cuda.MemoryPointer(unowned, 0),
                               strides=strides, order='C')
        frame_cpu = cp.asnumpy(frame_gpu)
        frame_bgr = cv2.cvtColor(frame_cpu, cv2.COLOR_RGBA2BGR)
        # Store frame per source ID
        self.result_frames[sid] = frame_bgr
        if self.frame_cb:
            meta = {"frame_num": fmeta.frame_num, "num_objects": fmeta.num_obj_meta}
            self.frame_cb(sid, frame_bgr, meta)
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK
This probe is attached to the src pad of the nvdsosd element after the output nvstreammux. The issue is that the metadata (e.g., bounding boxes) for a given source_id does not align with the corresponding frame, leading to incorrect overlays on the extracted frames.
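To narrow down where the mismatch originates, I have been logging the (source_id, batch_id) pairs the probe sees per buffer. A small, self-contained checker along these lines (a debugging sketch of mine, not DeepStream API) flags batches where batch_id no longer matches the frame's position in frame_meta_list, which would hint that surfaces and metadata are indexed inconsistently after re-muxing:

```python
def find_batch_id_anomalies(frames):
    """Given [(source_id, batch_id), ...] in frame_meta_list order,
    return (index, source_id, batch_id) tuples for every frame whose
    batch_id differs from its position in the batch."""
    return [(idx, sid, bid)
            for idx, (sid, bid) in enumerate(frames)
            if bid != idx]
```

If this returns a non-empty list only in the parallel pipeline (and not with a single PGIE), that would point at the output nvstreammux reordering or renumbering frames.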
Previous Success with Single PGIE
For a single PGIE with multiple streams, I resolved a similar issue using an osd_sink_probe (below) to filter metadata based on source_id and class_id:
def osd_sink_probe(self, pad, info, udata):
    buf = info.get_buffer()
    if not buf:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(buf))
    for l_frame in self.walk_nvlist(batch_meta.frame_meta_list):
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        if isinstance(self.allowed_sources, list):
            if frame_meta.source_id not in self.allowed_sources:
                # Drop all object meta from frames of disallowed sources
                l_obj = frame_meta.obj_meta_list
                while l_obj:
                    obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
                    next_obj = l_obj.next
                    pyds.nvds_remove_obj_meta_from_frame(frame_meta, obj_meta)
                    l_obj = next_obj
                continue
        # Additional class-based filtering logic here
    return Gst.PadProbeReturn.OK
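For reference, walk_nvlist is a small helper of my own (not a pyds function) that wraps the usual try/except StopIteration pattern for iterating pyds GLib lists:

```python
def walk_nvlist(head):
    """Yield each node of a pyds GLib list, guarding against the
    StopIteration that pyds raises when walking past the end."""
    node = head
    while node is not None:
        yield node
        try:
            node = node.next
        except StopIteration:
            break
```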
Exploration of metamux Plugin
I explored the metamux plugin from the deepstream_reference_apps parallel inference example (deepstream-parallel-infer-app). However, I couldn’t get it to work correctly, and I noticed that NVIDIA’s documentation lacks Python-specific examples for metamux. The C-based example in the reference apps is complex, and I’m unsure how to adapt it to my Python pipeline. Has anyone successfully used metamux in a Python-based DeepStream pipeline to handle metadata in parallel inference scenarios?
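For context, my understanding is that the element is created in Python the same way as any other plugin (Gst.ElementFactory.make("nvdsmetamux", ...)) and configured via its config-file property. Below is the shape of the config file as I understand it from the reference app's config_metamux.txt — the keys and values here are my best reading of that sample, not verified documentation, so please correct me if any of this is wrong:

```ini
[property]
enable=1
# sink pad whose buffers are passed through to the src pad
active-pad=sink_0
# PTS tolerance (microseconds) when matching metadata across branches
pts-tolerance=60000

[user-configs]

[group-0]
# which source ids each model's metadata applies to (my assumption
# based on the parallel-infer sample; values here are placeholders)
src-ids-model-1=0;1
src-ids-model-2=2;3
```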
Questions
- Metadata Mismatch: Why does the metadata get mismatched in the _frame_probe when extracting individual frames in a parallel pipeline? Is there a known issue with nvstreammux or nvdsosd in this context, or am I mishandling the batch_id/source_id in the probe?
- Fixing the Probe: How can I modify the _frame_probe to ensure metadata aligns correctly with the corresponding stream’s frame? Should I add additional filtering or use a different probe location?
- Metamux Guidance: Are there any Python-based examples or detailed documentation for using the metamux plugin in a parallel inference pipeline? If not, what’s the recommended approach to handle metadata multiplexing in Python?
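On the second question, one direction I am considering: NvDsObjectMeta carries unique_component_id (the unique-id of the nvinfer instance that produced the detection), so the probe could drop objects whose component id does not belong to the model assigned to that frame's source. A pure-Python sketch of the filtering rule (the source_to_components mapping is hypothetical, something I would maintain alongside the pipeline):

```python
def objects_for_source(objects, source_id, source_to_components):
    """Keep only detections whose producing component (nvinfer
    unique-id, surfaced as obj_meta.unique_component_id) is assigned
    to this source.  `source_to_components` is a hypothetical mapping,
    e.g. {0: {1}, 1: {2}} for source 0 -> PGIE 1, source 1 -> PGIE 2."""
    allowed = source_to_components.get(source_id, set())
    return [obj for obj in objects if obj.unique_component_id in allowed]
```

In the real probe this would translate to calling pyds.nvds_remove_obj_meta_from_frame for every object that fails the check, similar to my single-PGIE osd_sink_probe above.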
Additional Notes
- My pipeline is based on the _build_parallel_pipeline function, which creates per-stream inference chains with nvinfer and nvtracker elements, linked via nvstreammux and nvstreamdemux.
- The tiled output (nvmultistreamtiler) works perfectly, showing correct metadata per stream.
- I suspect the issue lies in how the output nvstreammux handles batch metadata, but I’m unsure how to debug or fix it.
Any insights or suggestions from the community would be greatly appreciated! Thank you.
