Multisession inference, segmentation

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is being used, the configuration file content, the command line used, and other details needed to reproduce.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or which sample application, and the function description.)

Could someone please provide some guidance on how I could achieve this?

My use case is more or less fully described by the Python sample deepstream-rtsp-in-rtsp-out. That works great with one input stream on a T4 AWS instance (and on Jetson Nano).

Now I would like to turn this into a “multi-session” solution. The idea is to utilize the GPU for several completely independent RTSP input streams and run the inference for each of them separately. I haven’t simply tried launching another instance of my Python script yet, but I suppose that won’t work either.

Do you have some guidance on what I would have to do to achieve this?

The solution should later be enhanced to run not only inference but possibly also segmentation in parallel.

Sorry if this is a stupid question and the solution is obvious.

Do you mean using multiple sources? You can just use the command below with deepstream-rtsp-in-rtsp-out.

python3 deepstream_test1_rtsp_in_rtsp_out.py \
-i rtsp://xxx1 rtsp://xx2 rtsp://xxx3  -g nvinfer

No, not like that. I’m aware of this. I don’t want to merge multiple sources and run inference over them all together.

I have multiple RTSP sources, yes. Specifically, these are flying drones. They push their video to a server via WebRTC. The AI pulls it on demand from that server via RTSP and pushes the inference/segmentation results back to the server via RTSP too (the rtsp-in-rtsp-out thing). Clients can then view both the original input and the annotated inference result video via WebRTC (you wouldn’t believe how well that performs).

But today this works with just one drone. Now I need to multiply that…

You can just merge multiple sources and run inference over them all, then use nvstreamdemux to demux the batched frames back into individual buffers.

Really? But this would surely require me to have all the input streams available when I start the inference, right?

Is there a Python sample demonstrating that?

You can refer to the deepstream-demux-multi-in-multi-out sample.
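
In rough sketch form, the kind of pipeline this approach builds looks like this (a sketch only, not the sample’s exact code; the RTSP URIs are placeholders and batch-size must match the number of inputs):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Two sources are batched by nvstreammux, run through a single nvinfer,
# then nvstreamdemux splits the batch back into one branch per stream.
pipeline = Gst.parse_launch("""
    nvstreammux name=mux batch-size=2 width=1280 height=720 !
    nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 !
    nvstreamdemux name=demux
    uridecodebin uri=rtsp://your-server:8554/stream0 ! mux.sink_0
    uridecodebin uri=rtsp://your-server:8554/stream1 ! mux.sink_1
    demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
    demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
""")

pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)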

Yes, this generally works, but it has the downside that all input streams need to be available when the inference starts.

So say I were building a system capable of working with 30 input channels: I would have to start with all 30 channels available when the inference starts. I could imagine feeding those inputs that do not yet have an active RTSP stream with, say, a “videotestsrc” or similar, just to have something.

Then, once an RTSP source becomes available, it would need to replace the “videotestsrc” for the given input channel, and the reverse once the RTSP input goes away again.

Any idea how this could be achieved? input-selector?

You need to refer to our code and customize it for your own needs. Regarding how to add sources dynamically, you can refer to runtime_source_add_delete.
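
The C sample is the reference; a minimal Python sketch of the same idea for the source side might look like this (the element and function names here are placeholders, and it assumes a running pipeline that already contains an nvstreammux named "mux"):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def add_source(pipeline, uri, index):
    """Add one uridecodebin at runtime and link it to mux.sink_<index>."""
    mux = pipeline.get_by_name("mux")

    source = Gst.ElementFactory.make("uridecodebin", "source-%u" % index)
    source.set_property("uri", uri)

    # uridecodebin creates its pads only after the stream has been probed,
    # so the link to the muxer has to happen in the pad-added callback.
    def on_pad_added(decodebin, pad):
        caps = pad.get_current_caps() or pad.query_caps(None)
        if caps.get_structure(0).get_name().startswith("video"):
            sink_pad = mux.request_pad_simple("sink_%u" % index)
            pad.link(sink_pad)

    source.connect("pad-added", on_pad_added)

    pipeline.add(source)
    # New elements start in NULL; bring them up to the pipeline's state.
    source.sync_state_with_parent()
    return source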

That seems to be a very useful hint. Thank you very much; I’m going to test that.

One additional question, though. Say it is possible to add/remove sources dynamically (not much doubt there, since you have a sample for it): what about the demultiplexing? I would like to treat each input as completely separate from the others, hence I need an output demultiplexer which is

a) always there and capable of handling the maximum number of input sources
b) added dynamically too

For both scenarios I have some doubts whether that will work. What do you think?

You can use our nvstreamdemux as I mentioned before. But no similar demo is currently available for adding that plugin’s src pads dynamically. You need to implement that yourself for now.

OK, thanks. Let’s see

Well, now I’m stuck. I’m trying to get a request src pad from the demuxer, to no avail. The demuxer is there, but this returns “None” all the time:

demuxer = pipeline.get_by_name("demux")
print("demuxer", demuxer)
demux_src_pad = demuxer.request_pad_simple("src_%u")
print("demux_src_pad", demux_src_pad)
new_queue_sink_pad = new_queue.get_static_pad("sink")
demux_src_pad.link(new_queue_sink_pad)

Returns:

demuxer <__gi__.GstNvStreamDemux object at 0x7dcea932c300 (GstNvStreamDemux at 0x5722a29bb000)>
demux_src_pad None
Traceback (most recent call last):
  File "/home/ubuntu/vx-ai/inference/test.py", line 56, in <module>
    demux_src_pad.link(new_queue_sink_pad)
AttributeError: 'NoneType' object has no attribute 'link'

The aim is to come from here

demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

to here, but programmatically:

demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

Any pointer appreciated.

You should not use this to get the src pad. Please look carefully at the source code I attached: padname = “src_%u” % i.

Yes, I know. I initially started with “src_1”. Didn’t work either.

OK. Here is my script so far, if you would be so kind as to check it out:

  • It starts with one source and one output to fakesink. If you download “ny.mp4” from here: ny.mp4, you will also have a longer video to work with.

  • Then I wait for a key press; the intention is then to add another source and destination. Since I can’t get past the problem of not being able to obtain a demux request src pad, I can’t tell whether the rest that follows works.

Basically it is an attempt to go from pipeline 1 to pipeline 2 right after the keypress. Both of these pipelines work fine on their own:

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 ! \
nvstreamdemux name=demux \
uridecodebin uri=file:///home/ubuntu/ny.mp4 ! mux.sink_0 \
demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 ! \
nvstreamdemux name=demux \
uridecodebin uri=file:///home/ubuntu/ny.mp4  ! mux.sink_0 \
uridecodebin uri=file:///home/ubuntu/ny.mp4  ! mux.sink_1 \
demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink \
demux.src_1 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink

What am I doing wrong?

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GObject

Gst.init(None)

pipeline_str = """
    nvstreammux name=mux batch-size=2 width=1280 height=720 !
    nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=2 !
    nvstreamdemux name=demux 
    uridecodebin uri=file:///home/ubuntu/ny.mp4 ! mux.sink_0
    demux.src_0 ! queue ! nvdsosd ! nvv4l2h264enc ! fakesink
"""

pipeline = Gst.parse_launch(pipeline_str)
pipeline.set_state(Gst.State.PLAYING)

# Wait for user input to add another source and a new sink to the pipeline
input("Press Enter to add another source and a new sink to the pipeline...\n")

# Dynamically create and add another source to the pipeline
new_source = Gst.ElementFactory.make("uridecodebin", None)
new_source.set_property("uri", "file:///home/ubuntu/ny.mp4")

# Create SourceBin and GhostPad
index = 1
source_bin = Gst.Bin.new("source-bin-%u" % index)
Gst.Bin.add(source_bin, new_source)
result = source_bin.add_pad(Gst.GhostPad.new_no_target("src", Gst.PadDirection.SRC))
print("source_bin.add_pad", result)

# Add source-bin to pipeline
pipeline.add(source_bin)

# Link with mux
mux = pipeline.get_by_name("mux")
print("mux", mux)
sink_pad = mux.request_pad_simple("sink_%u" % index)
print("sink_pad", sink_pad)
source_pad = source_bin.get_static_pad("src")
print("source_pad", source_pad)
source_pad.link(sink_pad)

# Dynamically create and add a new sink to the pipeline
new_sink = Gst.ElementFactory.make("fakesink", None)
pipeline.add(new_sink)

# Dynamically create and add queue, nvosd and encoder elements
new_queue = Gst.ElementFactory.make("queue", None)
new_nvosd = Gst.ElementFactory.make("nvdsosd", None)
new_encoder = Gst.ElementFactory.make("nvv4l2h264enc", None)

# Add new elements to the pipeline
pipeline.add(new_queue)
pipeline.add(new_nvosd)
pipeline.add(new_encoder)

# Link the new queue to the demuxer
demux = pipeline.get_by_name("demux")
print("demux", demux)
demux_src_pad = demux.request_pad_simple("src_%u" % index)
print("demux_src_pad", demux_src_pad)

new_queue_sink_pad = new_queue.get_static_pad("sink")
demux_src_pad.link(new_queue_sink_pad)

# Link the elements after the new queue
new_queue.link(new_nvosd)
new_nvosd.link(new_encoder)
new_encoder.link(new_sink)

# Wait for user input to stop the pipeline
input("Press Enter to stop the pipeline...\n")

# Stop the pipeline
pipeline.set_state(Gst.State.NULL)

I can’t make this line work:

demux_src_pad = demux.request_pad_simple("src_%u" % index)

Hmm. I realized that I probably need the cb_newpad callback too…

But that’s not the reason for not getting the request src pad. Here we have it:

nvstreamdemux gstnvstreamdemux.cpp:132:gst_nvstreamdemux_request_new_pad:<demux> New pad can only be requested in NULL state

End of the road, because that’s the difference between my use case and your sample: you construct the pipeline completely before starting it, which allows you to obtain a request src pad. My pipeline is already running (intentionally), and I “just” want to add yet another source/destination.

Already found the answer here: How to add source and sink with nvstreammux and nvstreamdemux dynamically - #6 by 2251582984

OK, got it working. At least the dynamic add so far. Three things were needed (see the sketch after this list):

  • pre-create all possible nvstreamdemux request src pads before starting the pipeline, as elaborated in the post I quoted above (THANKS!!!)
  • handle “pad-added” for the source_bin
  • call source_bin.sync_state_with_parent()
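
Roughly, the pad pre-creation part looks like this, continuing my script from above (a sketch; 30 is the maximum channel count from my earlier example, and the pads have to be requested while the pipeline is still in the NULL state):

MAX_STREAMS = 30  # upper bound on channels this pipeline should ever handle

demux = pipeline.get_by_name("demux")

# nvstreamdemux only hands out request src pads in the NULL state, so all
# potentially needed pads are requested up front, before set_state(PLAYING).
demux_src_pads = [demux.request_pad_simple("src_%u" % i) for i in range(MAX_STREAMS)]

pipeline.set_state(Gst.State.PLAYING)

# Later, when stream <i> actually shows up: build its queue/nvdsosd/encoder/sink
# branch, add each element to the pipeline, call sync_state_with_parent() on it,
# and finally link the branch to the pre-created pad:
#   demux_src_pads[i].link(new_queue.get_static_pad("sink"))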

However, this kind of input/output handling introduces an overall latency I’m not used to seeing with DeepStream. My (single-stream) solution (RTSP in → inference → RTSP out) shows about 400 ms end-to-end latency.

The latency I’m seeing with this solution is about 5 seconds, which is unacceptable.

You can test this yourself: have an RTSP server somewhere and launch this pipeline:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvstreamdemux name=demux \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0 \
demux.src_0 ! nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream 

Then feed the RTSP server’s input-stream with a camera or something, consume the output-stream with ffplay, and check the latency.

Then launch this pipeline and check the wayyyyy lower latency:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream  \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0

The only difference between these two runs is the absence of nvstreamdemux in the latter case.

I couldn’t find any useful configuration on any element that would lower the latency of the “nvstreamdemux” solution.
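
For reference, the kind of element-level tuning I looked at is sketched below (the property names are standard GStreamer/DeepStream ones, the values are guesses, and none of this is presented as a fix):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Same topology as the first gst-launch line above, built via parse_launch so
# that the rtspsrc inside uridecodebin can be tuned before going to PLAYING.
pipeline = Gst.parse_launch(
    "nvstreammux name=mux batch-size=1 width=1280 height=720 "
    "live-source=1 batched-push-timeout=33000 ! "
    "nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! "
    "nvstreamdemux name=demux "
    "uridecodebin name=src0 uri=rtsp://your-server:8554/input-stream ! mux.sink_0 "
    "demux.src_0 ! nvdsosd ! nvv4l2h264enc bitrate=4000000 ! "
    "rtspclientsink location=rtsp://your-server:8554/output-stream"
)

def on_source_setup(decodebin, source):
    # uridecodebin instantiates an rtspsrc internally; its jitterbuffer
    # "latency" property (milliseconds) defaults to 2000 and adds fixed delay.
    if source.get_factory() and source.get_factory().get_name() == "rtspsrc":
        source.set_property("latency", 200)

pipeline.get_by_name("src0").connect("source-setup", on_source_setup)
pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)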