Multisession inference, segmentation

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is being used, the configuration file content, the command line used, and other details needed to reproduce.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or which sample application, and the function description.)

Could someone please provide some guidance on how I could achieve this?

My use case is more or less fully described by the Python sample deepstream-rtsp-in-rtsp-out. That works great with one input stream on a T4 AWS instance (and on Jetson Nano).

Now I would like to turn this into a “multi-session” solution. The idea is to utilize the GPU for several completely independent RTSP input streams and run the inference for each of them separately. I haven’t simply tried launching another instance of my Python script yet, but I suppose that won’t work either.

Do you have some guidance on what I would have to do to achieve this?

The solution should later be enhanced to run not only inference but possibly also segmentation in parallel.

Sorry if this is a stupid question and the solution is obvious.

Do you mean using multiple sources? You can just use the command below with deepstream-rtsp-in-rtsp-out.

python3 deepstream_test1_rtsp_in_rtsp_out.py \
-i rtsp://xxx1 rtsp://xx2 rtsp://xxx3  -g nvinfer

No, not like that. I’m aware of this. I don’t want to merge multiple sources and run inference over them all together.

I have multiple RTSP sources, yes. Specifically, these are flying drones. They push their video to a server via WebRTC. The AI pulls it on demand from that server via RTSP and pushes the inference/segmentation results back to the server via RTSP too (the rtsp-in-rtsp-out thing). Clients can then view both the original input and the annotated inference result video via WebRTC (you wouldn’t believe how well that performs).

But today this works with just one drone. Now I need to multiply that…

You can just merge multiple sources and run inference over them all, then use nvstreamdemux to demux the batched frames back into individual buffers.

Really? But this would surely require me to have all the input streams available when I start the inference, right?

Is there a Python sample demonstrating that?

You can refer to the deepstream-demux-multi-in-multi-out sample.
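
In rough sketch form, the kind of pipeline this approach builds looks like this (a sketch only, not the sample’s exact code; the RTSP URIs are placeholders and batch-size must match the number of inputs):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Two sources are batched by nvstreammux, run through a single nvinfer,
# then nvstreamdemux splits the batch back into one branch per stream.
pipeline = Gst.parse_launch("""
    nvstreammux name=mux batch-size=2 width=1280 height=720 !
    nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 !
    nvstreamdemux name=demux
    uridecodebin uri=rtsp://your-server:8554/stream0 ! mux.sink_0
    uridecodebin uri=rtsp://your-server:8554/stream1 ! mux.sink_1
    demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
    demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
""")

pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)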

Yes, this generally works, but it has the downside that all input streams need to be available when the inference starts.

So say I were building a system capable of working with 30 input channels: I would have to start with all 30 channels available when the inference starts. I could imagine feeding those inputs that do not yet have an active RTSP stream with, say, a “videotestsrc” or similar, just to have something.

Then, once an RTSP source becomes available, it would need to replace the “videotestsrc” for the given input channel, and the reverse once the RTSP input goes away again.

Any idea how this could be achieved? input-selector?

You need to refer to our code and customize it for your own needs. Regarding how to add sources dynamically, you can refer to runtime_source_add_delete.
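
The C sample is the reference; a minimal Python sketch of the same idea for the source side might look like this (the element and function names here are placeholders, and it assumes a running pipeline that already contains an nvstreammux named "mux"):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def add_source(pipeline, uri, index):
    """Add one uridecodebin at runtime and link it to mux.sink_<index>."""
    mux = pipeline.get_by_name("mux")

    source = Gst.ElementFactory.make("uridecodebin", "source-%u" % index)
    source.set_property("uri", uri)

    # uridecodebin creates its pads only after the stream has been probed,
    # so the link to the muxer has to happen in the pad-added callback.
    def on_pad_added(decodebin, pad):
        caps = pad.get_current_caps() or pad.query_caps(None)
        if caps.get_structure(0).get_name().startswith("video"):
            sink_pad = mux.request_pad_simple("sink_%u" % index)
            pad.link(sink_pad)

    source.connect("pad-added", on_pad_added)

    pipeline.add(source)
    # New elements start in NULL; bring them up to the pipeline's state.
    source.sync_state_with_parent()
    return source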

That seems to be a very useful hint. Thank you very much; I’m going to test that.

One additional question, though. Say it is possible to add/remove sources dynamically (not much doubt there, since you have a sample for it): what about the demultiplexing? I would like to treat each input as completely separate from the others, hence I need an output demultiplexer which is

a) always there and capable of handling the maximum number of input sources
b) added dynamically too

For both scenarios I have some doubts whether that will work. What do you think?

You can use our nvstreamdemux as I mentioned before. But no similar demo is currently available for adding that plugin’s src pads dynamically. You need to implement that yourself for now.

OK, thanks. Let’s see

Well, now I’m stuck. I’m trying to get a request src pad from the demuxer, to no avail. The demuxer is there, but this returns “None” all the time:

demuxer = pipeline.get_by_name("demux")
print("demuxer", demuxer)
demux_src_pad = demuxer.request_pad_simple("src_%u")
print("demux_src_pad", demux_src_pad)
new_queue_sink_pad = new_queue.get_static_pad("sink")
demux_src_pad.link(new_queue_sink_pad)

Returns:

demuxer <__gi__.GstNvStreamDemux object at 0x7dcea932c300 (GstNvStreamDemux at 0x5722a29bb000)>
demux_src_pad None
Traceback (most recent call last):
  File "/home/ubuntu/vx-ai/inference/test.py", line 56, in <module>
    demux_src_pad.link(new_queue_sink_pad)
AttributeError: 'NoneType' object has no attribute 'link'

The aim is to come from here

demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

to here, but programmatically:

demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

Any pointer appreciated.

You should not use this to get the src pad. Please look carefully at the source code I attached: padname = “src_%u” % i.

Yes, I know. I initially started with “src_1”. Didn’t work either.

OK. Here is my script so far, if you would be so kind as to check it out:

  • It starts with one source and one output to fakesink. If you download “ny.mp4” from here: ny.mp4, you will also have a longer video to work with.

  • Then I wait for a key press; the intention is then to add another source and destination. Since I can’t get past the problem of not being able to obtain a demux request src pad, I can’t tell whether the rest that follows works.

Basically it is an attempt to go from pipeline 1 to pipeline 2 right after the keypress. Both of these pipelines work fine on their own:

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 ! \
nvstreamdemux name=demux \
uridecodebin uri=file:///home/ubuntu/ny.mp4 ! mux.sink_0 \
demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 ! \
nvstreamdemux name=demux \
uridecodebin uri=file:///home/ubuntu/ny.mp4  ! mux.sink_0 \
uridecodebin uri=file:///home/ubuntu/ny.mp4  ! mux.sink_1 \
demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink \
demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

What am I doing wrong?

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GObject

Gst.init(None)

pipeline_str = """
    nvstreammux name=mux batch-size=2 width=1280 height=720 !
    nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 !
    nvstreamdemux name=demux 
    uridecodebin uri=file:///home/ubuntu/ny.mp4 ! mux.sink_0
    demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
"""

pipeline = Gst.parse_launch(pipeline_str)
pipeline.set_state(Gst.State.PLAYING)

# Wait for user input to add another source and a new sink to the pipeline
input("Press Enter to add another source and a new sink to the pipeline...\n")

# Dynamically create and add another source to the pipeline
new_source = Gst.ElementFactory.make("uridecodebin", None)
new_source.set_property("uri", "file:///home/ubuntu/ny.mp4")

# Create SourceBin and GhostPad
index = 1
source_bin = Gst.Bin.new("source-bin-%u" % index)
Gst.Bin.add(source_bin, new_source)
result = source_bin.add_pad(Gst.GhostPad.new_no_target("src", Gst.PadDirection.SRC))
print("source_bin.add_pad", result)

# Add source-bin to pipeline
pipeline.add(source_bin)

# Link with mux
mux = pipeline.get_by_name("mux")
print("mux", mux)
sink_pad = mux.request_pad_simple("sink_%u" % index)
print("sink_pad", sink_pad)
source_pad = source_bin.get_static_pad("src")
print("source_pad", source_pad)
source_pad.link(sink_pad)

# Dynamically create and add a new sink to the pipeline
new_sink = Gst.ElementFactory.make("fakesink", None)
pipeline.add(new_sink)

# Dynamically create and add queue, nvosd and encoder elements
new_queue = Gst.ElementFactory.make("queue", None)
new_nvosd = Gst.ElementFactory.make("nvdsosd", None)
new_encoder = Gst.ElementFactory.make("nvv4l2h264enc", None)

# Add new elements to the pipeline
pipeline.add(new_queue)
pipeline.add(new_nvosd)
pipeline.add(new_encoder)

# Link the new queue to the demuxer
demux = pipeline.get_by_name("demux")
print("demux", demux)
demux_src_pad = demux.request_pad_simple("src_%u" % index)
print("demux_src_pad", demux_src_pad)

new_queue_sink_pad = new_queue.get_static_pad("sink")
demux_src_pad.link(new_queue_sink_pad)

# Link the elements after the new queue
new_queue.link(new_nvosd)
new_nvosd.link(new_encoder)
new_encoder.link(new_sink)

# Wait for user input to stop the pipeline
input("Press Enter to stop the pipeline...\n")

# Stop the pipeline
pipeline.set_state(Gst.State.NULL)

I can’t make this line work:

demux_src_pad = demux.request_pad_simple("src_%u" % index)

Hmm. I realized that I probably need the cb_newpad callback too…

But that’s not the reason for not getting the request src pad. Here we have it:

nvstreamdemux gstnvstreamdemux.cpp:132:gst_nvstreamdemux_request_new_pad:<demux> New pad can only be requested in NULL state

End of the road, because that’s the difference between my use case and your sample: you construct the pipeline completely before starting it, which allows you to obtain a request src pad. My pipeline is already running (intentionally), and I “just” want to add yet another source/destination.

Already found the answer here: How to add source and sink with nvstreammux and nvstreamdemux dynamically - #6 by 2251582984

OK, got it working. At least the dynamic add so far. Three things were needed (see the sketch after this list):

  • pre-create all possible nvstreamdemux request src pads before starting the pipeline, as elaborated in the post I quoted above (THANKS!!!)
  • handle “pad-added” for the source_bin
  • call source_bin.sync_state_with_parent()
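
Roughly, the pad pre-creation part looks like this, continuing my script from above (a sketch; 30 is the maximum channel count from my earlier example, and the pads have to be requested while the pipeline is still in the NULL state):

MAX_STREAMS = 30  # upper bound on channels this pipeline should ever handle

demux = pipeline.get_by_name("demux")

# nvstreamdemux only hands out request src pads in the NULL state, so all
# potentially needed pads are requested up front, before set_state(PLAYING).
demux_src_pads = [demux.request_pad_simple("src_%u" % i) for i in range(MAX_STREAMS)]

pipeline.set_state(Gst.State.PLAYING)

# Later, when stream <i> actually shows up: build its queue/nvdsosd/encoder/sink
# branch, add each element to the pipeline, call sync_state_with_parent() on it,
# and finally link the branch to the pre-created pad:
#   demux_src_pads[i].link(new_queue.get_static_pad("sink"))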

However, this kind of input/output handling introduces an overall latency I’m not used to seeing with DeepStream. My (single-stream) solution (RTSP in → inference → RTSP out) shows about 400 ms end-to-end latency.

The latency I’m seeing with this solution is about 5 seconds, which is unacceptable.

You can test this yourself: have an RTSP server somewhere and launch this pipeline:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvstreamdemux name=demux \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0 \
demux.src_0 ! nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream 

Then feed the RTSP server’s input-stream with a camera or something, consume the output-stream with ffplay, and check the latency.

Then launch this pipeline and check the wayyyyy lower latency:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream  \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0

The only difference between these two runs is the absence of nvstreamdemux in the latter case.

I couldn’t find any useful configuration on any element that would lower the latency of the “nvstreamdemux” solution.
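
For reference, the kind of element-level tuning I looked at is sketched below (the property names are standard GStreamer/DeepStream ones, the values are guesses, and none of this is presented as a fix):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Same topology as the first gst-launch line above, built via parse_launch so
# that the rtspsrc inside uridecodebin can be tuned before going to PLAYING.
pipeline = Gst.parse_launch(
    "nvstreammux name=mux batch-size=1 width=1280 height=720 "
    "live-source=1 batched-push-timeout=33000 ! "
    "nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! "
    "nvstreamdemux name=demux "
    "uridecodebin name=src0 uri=rtsp://your-server:8554/input-stream ! mux.sink_0 "
    "demux.src_0 ! nvdsosd ! nvv4l2h264enc bitrate=4000000 ! "
    "rtspclientsink location=rtsp://your-server:8554/output-stream"
)

def on_source_setup(decodebin, source):
    # uridecodebin instantiates an rtspsrc internally; its jitterbuffer
    # "latency" property (milliseconds) defaults to 2000 and adds fixed delay.
    if source.get_factory() and source.get_factory().get_name() == "rtspsrc":
        source.set_property("latency", 200)

pipeline.get_by_name("src0").connect("source-setup", on_source_setup)
pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)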