I am using Python with the DeepStream PyBindings. My input is a single tiled video file (1920x2160) from which I extract 2 video streams (using nvvidconv for cropping) and then merge them via an nvstreammux with batch size = 2, in order to emulate multiple sources coming from different cameras.
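For reference, the splitting step looks roughly like this (a simplified sketch rather than my actual code; element names and crop coordinates are illustrative, and I'm showing the crop with nvvideoconvert's src-crop property as a stand-in for my nvvidconv setup):

# Simplified sketch: crop the 1920x2160 tile into two 1920x1080 halves and
# batch them with nvstreammux (batch-size = 2). Names are illustrative.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.Pipeline.new("split-and-batch")

streammux = Gst.ElementFactory.make("nvstreammux", "mux")
streammux.set_property("batch-size", 2)
streammux.set_property("width", 1920)
streammux.set_property("height", 1080)
streammux.set_property("batched-push-timeout", 40000)
pipeline.add(streammux)

# Top half and bottom half of the tiled frame (left:top:width:height).
for idx, crop in enumerate(["0:0:1920:1080", "0:1080:1920:1080"]):
    conv = Gst.ElementFactory.make("nvvideoconvert", f"crop_{idx}")
    conv.set_property("src-crop", crop)
    pipeline.add(conv)
    # ... the decoded tile is fed to each conv via a tee (omitted here) ...
    conv.get_static_pad("src").link(streammux.get_request_pad(f"sink_{idx}"))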
I tried to replicate the diagram shown in the example, but I encountered some issues.
Specifically, when 3 of the 4 branches (all except the one performing inference) have a queue connected directly to a fakesink (as in the configuration below), the pipeline runs correctly, in the sense that the single inference branch works as expected. However, when all the branches perform inference, the pipeline stalls.
I’m not sure what the issue is. I suspect that the tee is not creating 4 separate copies of the batch of 2 frames but is simply forwarding the same buffer, so when one branch consumes it the others are blocked, which is what stalls the pipeline. Is this correct?
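To make the layout concrete, the fan-out I'm describing is built roughly like this (a simplified sketch, not my actual code; element names are illustrative):

# Sketch of the fan-out after nvstreammux: one tee src pad plus one queue
# per branch. In my current test only branch 0 goes on to nvinfer; the
# other three queues are linked straight to fakesinks.
tee = Gst.ElementFactory.make("tee", "batch_tee")
pipeline.add(tee)
streammux.link(tee)

branch_queues = []
for idx in range(4):
    queue = Gst.ElementFactory.make("queue", f"branch_queue_{idx}")
    pipeline.add(queue)
    tee_src = tee.get_request_pad(f"src_{idx}")  # tee pad template is src_%u
    tee_src.link(queue.get_static_pad("sink"))
    branch_queues.append(queue)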
How can I resolve this issue? What is the correct structure to use in order to have 4 models running in parallel in Python?
Thanks a lot for your reply! After replicating the schema you suggested, everything is finally working as expected.
I still have two questions though:
1. I don’t quite understand why the version with multiple nvstreamdemux elements doesn’t work, while the one with a single nvstreamdemux does. It would be useful to clarify this so that I can avoid similar mistakes in the future.
2. Why is one of the branches from the tee (the one right after the nvstreammux that connects all the sources) also fed into the metamux?
Originally, nvstreammux and nvstreamdemux were not designed to support cascaded scenarios. For the parallel case, we have improved them to support it. Since the plugins are proprietary, we can’t discuss the details.
The direct branch is used as the reference branch; the metadata from the other branches is merged into the reference branch’s metadata.
I have one last question, still related to parallel execution and the use of metamux.
Basically, I need to synchronize the output of my 4 models, since I have to use the segmentation masks from all of them to perform calculations that depend on all four together.
If I add a probe function downstream of the metamux, is it guaranteed that it will always receive (via metadata) the 4 segmentation masks at once? Or do I need to introduce an additional synchronization mechanism — and if so, which one?
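To make the question concrete, the kind of additional mechanism I have in mind would be something like the following (purely a hypothetical sketch; EXPECTED_MODEL_IDS and process_all_masks are placeholder names of mine):

# Hypothetical fallback: buffer the masks per frame until all four model
# ids have reported, then run the combined computation. Whether this is
# actually necessary after metamux is exactly what I'm asking.
from collections import defaultdict

EXPECTED_MODEL_IDS = {1, 2, 3, 4}      # the four gie-unique-id values
pending_masks = defaultdict(dict)       # frame key -> {model id: mask}

def collect_mask(frame_key, model_id, mask):
    pending_masks[frame_key][model_id] = mask
    if set(pending_masks[frame_key]) == EXPECTED_MODEL_IDS:
        process_all_masks(frame_key, pending_masks.pop(frame_key))

def process_all_masks(frame_key, masks):
    # placeholder for the calculations that need all four masks at once
    ...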
In addition, I noticed that segmentation metadata (pyds.NvDsInferSegmentationMeta) does not provide a unique_id attribute. This makes it difficult to identify which model generated a given mask, and leads to an error like:
AttributeError: 'pyds.NvDsInferSegmentationMeta' object has no attribute 'unique_id'
What is the recommended way to associate segmentation metadata with the corresponding model (e.g. via unique_id or another mechanism)?
The semantic segmentation model’s output mask covers the whole frame; it is attached as an NvDsInferSegmentationMeta in the user meta of the NvDsFrameMeta. Since you have four segmentation models, please assign a different gie-unique-id to each model in its nvinfer configuration file. After the metamux, you can enumerate all NvDsInferSegmentationMeta entries from the frame_user_meta_list in NvDsFrameMeta.
The unique_id in NvDsInferSegmentationMeta tells you which model generated that NvDsInferSegmentationMeta.
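For example, each model’s nvinfer configuration file would carry its own id (the file names and id values below are just placeholders):

# model_x_config.txt
[property]
gie-unique-id=1

# model_y_config.txt
[property]
gie-unique-id=2

# ... and gie-unique-id=3 / gie-unique-id=4 for the remaining two models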
I’m already doing the following: I’ve assigned different gie-unique-ids and I’m using the probe function below to analyze semantic segmentation metadata:
# imports used by the probe
import logging
from datetime import datetime

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds


def debug1_pad_buffer_probe(pad: Gst.Pad,
                            info: Gst.PadProbeInfo,
                            u_data: int) -> Gst.PadProbeReturn:
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        logging.error("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        # Walk the frame-level user meta looking for segmentation metadata.
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            try:
                seg_user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break

            if seg_user_meta and seg_user_meta.base_meta.meta_type == pyds.NVDSINFER_SEGMENTATION_META:
                try:
                    segmeta = pyds.NvDsInferSegmentationMeta.cast(seg_user_meta.user_meta_data)
                except StopIteration:
                    break

                dt = datetime.fromtimestamp(frame_meta.ntp_timestamp / 1e9).strftime("%Y-%m-%d %H:%M:%S.%f")
                logging.info(f"[FAKESINK] Stream: {frame_meta.pad_index}, Frame: {frame_meta.frame_num}, "
                             f"Timestamp: {dt}, PTS: {frame_meta.buf_pts}, SegmentationMeta: {segmeta.classes}")
                print(segmeta.unique_id)  # this is the line that raises the AttributeError below

            try:
                l_user = l_user.next
            except StopIteration:
                break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK
However, I get the following error:
Traceback (most recent call last):
  File "/opt/deepstream-app/app/main.py", line 227, in debug1_pad_buffer_probe
    print(segmeta.unique_id)
AttributeError: 'pyds.NvDsInferSegmentationMeta' object has no attribute 'unique_id'
Additionally, I’m not sure how metamux works. If I attach a probe function to the source pad of the metamux, can I be certain that all 4 segmentation masks are available at that point? Some models run faster than others, so I’m concerned about synchronization.
Regarding inference synchronization, could you clarify whether I need to implement a separate mechanism (and if so, how), or whether the src pad of the metamux already delivers the four inferences from the different branches synchronized? As I mentioned, some segmentation models are faster than others.
If the segmentation mask is for the frame content, why do you think the mask will not be synchronized with the frame? I’m confused. What do you want to synchronize?
I want to synchronize the segmentation masks from four models. My input file is a 2160x1920 .h265 video, which I split into two streams, 1080x1920 each (called A and B). Stream A is fed into two segmentation models, X and Y, while stream B is fed into two other segmentation models, W and Z. These models have different inference speeds, and all branches are eventually merged at a metamux.
I would like to understand: if I attach a probe function to the src pad of the metamux, is it guaranteed that all inferences will be ready when the function is invoked? Or is it possible that some inferences are ready while others are not yet complete, meaning I cannot perform the calculations I need because the segmentation mask from the slowest model is still missing?