Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.2
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
Problem linking/playing a Pipeline on dGPU with two DeepStream demuxer plugins… seems to work well on Jetson.
I have a user of my DSL work here… who is trying to build a multi-source Pipeline that splits the batched output stream with a Tee so that each new batched stream can be processed differently. After processing, each batched stream is to be demuxed so that the individual streams can be rendered, encoded, or dropped… depending on the downstream sink that is linked to each demuxed branch.
We’ve simplified the Pipeline down to a single URI-Source Bin, Streammuxer, Tee, 2-Demuxers and 2-Window-Sink Bins.
The Pipeline links and plays well on Jetson, but fails on dGPU.
Here is the Pipeline graph for dGPU. I’ve cut off the source-bin to minimize the size of the file. You can see clearly that the top branch connected to the first demuxer is not linked correctly; however, the lower (identical) branch is.
@fanzh the examples use only a single demuxer, which I have no problem with… as mentioned above, the problem is specific to running two demuxers from a split-batched stream on dGPU with Window-sinks.
Again, single demuxer on dGPU and Jetson… no problem
Two demuxers with split stream on Jetson… no problem
Two demuxers with split stream on dGPU… one of the Demuxers fails to link correctly.
And again, this might very well be due to my Window Sink bin implementation not negotiating caps correctly with the demuxer upstream… and not a demuxer issue.
Thanks,
Robert.
P.S. perhaps you’re confused by all of the pre-allocated request pads for each demuxer in the image above? This is done to support the dynamic addition/removal of branches while the Pipeline is playing… This logic all works correctly; it is just the double demuxer link issue that is holding us back.
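To illustrate what I mean by pre-allocating the request pads, here is a minimal sketch (this is not the actual DSL code; MAX_BRANCHES and the function name are made up for the example):

```c
/* Sketch only: request every nvstreamdemux src pad up front so that branches
 * can be added/removed while the Pipeline is playing (not the actual DSL code). */
#include <gst/gst.h>

#define MAX_BRANCHES 4  /* hypothetical per-demuxer branch limit */

static void
pre_allocate_demux_pads (GstElement *demux, GstPad *pads[MAX_BRANCHES])
{
  for (guint i = 0; i < MAX_BRANCHES; i++) {
    gchar *name = g_strdup_printf ("src_%u", i);
    /* nvstreamdemux exposes src_%u as request pads; gst_element_get_request_pad()
     * is the API in the GStreamer 1.16 that ships with DeepStream 6.2. */
    pads[i] = gst_element_get_request_pad (demux, name);
    g_free (name);
  }
}
```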
From the pipeline picture on dGPU, one sink negotiated caps while the other did not.
Could you share more logs? Thanks! Please capture both working and non-working logs. Run “export GST_DEBUG=6” first to raise GStreamer’s log level, then run the app again; you can redirect the logs to a file.
Could you provide simplified code to reproduce this issue? Thanks!
The two files above are the same.
There is an error: “0:00:13.607738749 230676 0x1b82800 ERROR DSL src/DslNodetr.h:958:SetState: : FAILURE occured waiting for state to change to ‘PLAYING’ for GstNotetr ‘pipeline1’”. Is this DslNodetr.h your custom code?
The user @446073615 – who is using my DeepStream Services Library – wants to do a lot more than just view each stream. The Pipeline was just simplified to show the problem.
Yes, parallel inference is some of what the user is trying to do. I will refer to the link… on first look, I see that each demuxer is linked directly to another streammuxer vs. individual branches… interesting.
Just FYI, we tried with other types of sink bins… fake-sink, file-sink, rtsp-sink… same issue.
As you can see, it is far simpler than the deepstream_parallel_inference_app examples linked above.
I managed to hack up the deepstream_test3_app to run with a single uri-source, nvstreammux, tee, two nvstreamdemux elements, and two sinks, one connected to each demuxer. There is no inference or OSD.
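Roughly, the topology looks like the sketch below (this is not the actual modified deepstream_test3 code; element choices and property values are only examples, and error checking is omitted):

```c
/* Rough sketch of the repro topology: uridecodebin -> nvstreammux -> tee,
 * then two identical branches of queue -> nvstreamdemux -> sink. */
#include <gst/gst.h>

static void
on_pad_added (GstElement *decodebin, GstPad *new_pad, gpointer user_data)
{
  GstElement *mux = GST_ELEMENT (user_data);
  GstCaps *caps = gst_pad_get_current_caps (new_pad);
  if (caps == NULL)
    return;

  const gchar *name = gst_structure_get_name (gst_caps_get_structure (caps, 0));
  if (g_str_has_prefix (name, "video/")) {
    /* link the decoded stream to the muxer's first request sink pad */
    GstPad *mux_sink = gst_element_get_request_pad (mux, "sink_0");
    gst_pad_link (new_pad, mux_sink);
    gst_object_unref (mux_sink);
  }
  gst_caps_unref (caps);
}

static void
add_branch (GstElement *pipeline, GstElement *tee)
{
  GstElement *queue = gst_element_factory_make ("queue", NULL);
  GstElement *demux = gst_element_factory_make ("nvstreamdemux", NULL);
  GstElement *sink  = gst_element_factory_make ("nveglglessink", NULL); /* same issue seen with fakesink etc. */

  gst_bin_add_many (GST_BIN (pipeline), queue, demux, sink, NULL);
  gst_element_link (queue, demux);

  /* tee src pads and nvstreamdemux src pads are both request pads */
  GstPad *tee_src = gst_element_get_request_pad (tee, "src_%u");
  GstPad *q_sink  = gst_element_get_static_pad (queue, "sink");
  gst_pad_link (tee_src, q_sink);
  gst_object_unref (q_sink);

  GstPad *demux_src = gst_element_get_request_pad (demux, "src_0");
  GstPad *sink_pad  = gst_element_get_static_pad (sink, "sink");
  gst_pad_link (demux_src, sink_pad);
  gst_object_unref (sink_pad);
}

int
main (int argc, char *argv[])
{
  gst_init (&argc, &argv);
  if (argc < 2) {
    g_printerr ("usage: %s <uri>\n", argv[0]);
    return -1;
  }

  GstElement *pipeline = gst_pipeline_new ("two-demuxer-repro");
  GstElement *source   = gst_element_factory_make ("uridecodebin", "source");
  GstElement *mux      = gst_element_factory_make ("nvstreammux", "mux");
  GstElement *tee      = gst_element_factory_make ("tee", "tee");

  g_object_set (source, "uri", argv[1], NULL);
  g_object_set (mux, "batch-size", 1, "width", 1280, "height", 720,
      "batched-push-timeout", 40000, NULL);

  gst_bin_add_many (GST_BIN (pipeline), source, mux, tee, NULL);
  gst_element_link (mux, tee);
  g_signal_connect (source, "pad-added", G_CALLBACK (on_pad_added), mux);

  add_branch (pipeline, tee);
  add_branch (pipeline, tee);

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  GstBus *bus = gst_element_get_bus (pipeline);
  GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
  if (msg)
    gst_message_unref (msg);
  gst_object_unref (bus);
  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}
```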
As you know, tee will not duplicate the GstBuffer; it only sends the GstBuffer to each branch in turn. The effect of your pipeline is the same as “nvstreammux + yolov4 + bodypose”; there is no parallel inferencing effect.
If queues are added after the tee, multiple branches will process the same GstBuffer at the same time, and there will be a problem with multiple threads operating on it.
@fanzh sorry, but your remarks do not make any sense to me… there are plenty of DeepStream examples that split/duplicate the stream after the OSD with a tee so that one can both render and encode to file. How is this any different from splitting a batched stream?
All of the examples for parallel inference use a tee to duplicate the stream. What am I missing here?
re: “If queues are added after the tee, multiple branches will process the same GstBuffer at the same time, and there will be a problem with multiple threads operating on it.”
This is the basic rule for using a tee… from the GStreamer documentation… “One needs to use separate queue elements (or a multiqueue) in each branch to provide separate threads for each branch. Otherwise a blocked dataflow in one branch would stall the other branches.” … see tee
@fanzh please forgive my confusion… after reviewing the gstreamer tee implementation I now have a much better understanding… and yes clearly the same buffers are being pushed to all pads with the buffer reference count updated accordingly (actually, I must have known this, thinking back to the secondary inference graph implementation).
Anyway, I now see why the extra demuxers are needed and why each is connected to an additional Streammuxer.
Similar to how the deepstream_app is implemented to support multiple sink bins.
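In other words, as I understand that pattern, each demuxed stream is fed back into a per-branch nvstreammux, roughly like the sketch below (this is my reading of the parallel-inference example, not its actual code; remux_branch and num_streams are made-up names):

```c
/* Sketch: feed each demuxed stream into a per-branch nvstreammux, as I
 * understand the deepstream_parallel_inference_app pattern (not its code). */
#include <gst/gst.h>

static void
remux_branch (GstElement *demux, GstElement *branch_mux, guint num_streams)
{
  for (guint i = 0; i < num_streams; i++) {
    gchar *src_name  = g_strdup_printf ("src_%u", i);
    gchar *sink_name = g_strdup_printf ("sink_%u", i);

    /* src_%u on nvstreamdemux and sink_%u on nvstreammux are both request pads */
    GstPad *src  = gst_element_get_request_pad (demux, src_name);
    GstPad *sink = gst_element_get_request_pad (branch_mux, sink_name);
    gst_pad_link (src, sink);

    g_free (src_name);
    g_free (sink_name);
  }
}
```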
I’m assuming that, although the tee is pushing the same buffer to multiple branches, each branch can convert the buffers (using the nvvideoconvert plugin) to a different format and scale as needed… and that the nvvideoconvert plugin is making a copy of the buffer before transforming it?
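For example, something like the branch below is what I have in mind (the caps values and element names are just an illustration; I have not verified whether nvvideoconvert actually copies the buffer in this case):

```c
/* Sketch: give one branch its own format/scale by converting after the tee,
 * so the shared buffer is not modified in place (caps values are examples). */
#include <gst/gst.h>

static void
add_convert_branch (GstElement *pipeline, GstElement *tee)
{
  GstElement *queue  = gst_element_factory_make ("queue", NULL);
  GstElement *conv   = gst_element_factory_make ("nvvideoconvert", NULL);
  GstElement *filter = gst_element_factory_make ("capsfilter", NULL);
  GstElement *sink   = gst_element_factory_make ("fakesink", NULL);

  /* request a different format/resolution than the other branches */
  GstCaps *caps = gst_caps_from_string (
      "video/x-raw(memory:NVMM), format=RGBA, width=640, height=360");
  g_object_set (filter, "caps", caps, NULL);
  gst_caps_unref (caps);

  gst_bin_add_many (GST_BIN (pipeline), queue, conv, filter, sink, NULL);
  gst_element_link_many (queue, conv, filter, sink, NULL);

  GstPad *tee_src = gst_element_get_request_pad (tee, "src_%u");
  GstPad *q_sink  = gst_element_get_static_pad (queue, "sink");
  gst_pad_link (tee_src, q_sink);
  gst_object_unref (q_sink);
}
```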