Multisession inference, segmentation

I tried the pipelines below; there is not much difference between them. Could you try them?
With nvstreamdemux, using nveglglessink:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvstreamdemux name=demux demux.src_0 ! nvvideoconvert ! nvdsosd ! nveglglessink \
uridecodebin uri=rtsp://xxx ! mux.sink_0

Without nvstreamdemux, using nveglglessink:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvvideoconvert ! nvdsosd ! nveglglessink \
uridecodebin uri=rtsp://xxx ! mux.sink_0

But can you confirm that there is a significant difference in the RTSP output between the two pipelines I provided?

(Replying to: “No nvstreamdemux, use nveglglessink”)

I could, but this is not my use case. I need RTSP in - RTSP out. Any suggestions for this?

BTW I don’t have a display; this is running on AWS.
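
For a quick headless check I could swap the sink; a rough sketch (untested, using the stock GStreamer fakesink instead of nveglglessink):

# fakesink just discards the frames, so this only verifies the pipeline runs without a display
gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvvideoconvert ! nvdsosd ! fakesink sync=false \
uridecodebin uri=rtsp://xxx ! mux.sink_0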

But other than that, we must have a different understanding of what “unacceptable latency” is.

I was able to run your two pipelines on my Jetson Nano, and there is definitely higher latency with the nvstreamdemux pipeline. Not a factor of 5, but at least 3 times higher.

I think if we don’t agree on the fact that nvstreamdemux introduces additional, non-negligible latency, we won’t make any progress in this matter, and this solution is ruled out for me.

If more evidence is still needed, here it is.

I’m pushing a New York video from my PC to the RTSP server. The two inference pipelines (the lower-latency one without nvstreamdemux, and the other one) are pulling the video from the RTSP server and pushing the inference results back as another stream.

Then on my PC I’m pulling both streams, the original and the annotated one, and displaying them side by side.
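
(For reference, a minimal sketch of how I view them side by side, assuming ffplay and the stream names used below:)

# one window per stream; -fflags nobuffer keeps player-side buffering low
ffplay -fflags nobuffer rtsp://your-server:8554/input-stream &
ffplay -fflags nobuffer rtsp://your-server:8554/output-stream &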

The pipeline with the lower latency and without nvstreamdemux:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream  \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0

The pipeline with the higher latency and with nvstreamdemux:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvstreamdemux name=demux \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0 \
demux.src_0 ! nvdsosd ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream 

The results in video (two recordings were attached: one showing the lower latency, one the higher latency).

I would estimate the additional latency caused by nvstreamdemux at about 2 seconds. For whatever reason.

That is really, really weird…

I have now enabled display-clock for nvdsosd. With both pipelines I see a general latency of 1 s, which would be OK. One second means: I have a real-time clock next to me, and my camera image goes through the inference; the timestamp in the inference output video is 1 second behind. Good so far and completely acceptable.
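
(For reference, enabling it looks roughly like this; a sketch based on my earlier pipeline, display-clock being a documented nvdsosd property:)

# nvdsosd overlays the system clock on every frame
gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvdsosd display-clock=1 ! nvv4l2h264enc bitrate=4000000 ! rtspclientsink location=rtsp://your-server:8554/output-stream \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0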

When I, for instance, raise my hand at (real-time) second 35 of a minute, the inference display shows that movement with exactly the timestamp 35, one second later. But only if nvstreamdemux is not in the pipeline. So the timestamped video really shows the real-time situation at that moment. Perfect.

But with nvstreamdemux in the pipeline, the movement is shown after a total delay of 3 seconds, and the timestamp in the video for the movement is 38. Note: the timestamp difference between the displayed video and the real-time clock is still 1 second; it is just that the video, correctly timestamped, is in reality 2 seconds older.

Conclusion: the demuxer somehow needs an additional 2 seconds to feed the video to nvdsosd, where it gets timestamped correctly. But it has been artificially aged. :)

And there is no configurable way around it. This makes the component useless for real-time applications.

Good, I think I have provided as much info as necessary. Finally, here is the last bit: a simplified pipeline, with inference and OSD removed.

1 second delay:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://your-server:8554/output-stream \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0

3 seconds delay:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvstreamdemux name=demux demux.src_0 ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://your-server:8554/output-stream \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0 

Just why in god’s name? :)

Due to the weekend and the time difference, we may be slow to reply. Using the pipeline you provided, we are trying to reproduce this issue on our side and analyze it further.
On my side, with the pipeline you provided, the latency is about 1 s both with and without the nvstreamdemux plugin.

Left: with nvstreamdemux. Right: without nvstreamdemux.

It may take some time to analyze whether this latency (1s) is normal.

What pipeline produces this 1 s on your side? I mean, it must be possible to see that here too. Would you mind sharing instructions?

It’s the pipeline you attached here #28.

You are showing me one of your test videos. This is not part of my pipelines. I need the exact syntax of your pipeline to follow.

Or even setup instructions for the uplink, or access to the RTSP feed you were using.

I don’t believe that.

I did use the pipeline you attached; there is no reason for us to present you with misleading results. Let me show you my steps one by one.

  1. Please refer to our FAQ to build an RTSP server (Build rtsp server).
    Note: Please use the command below to generate the video with a timestamp.
ffmpeg -re -stream_loop -1 -i sample_720p.mp4  -vcodec libx264 -vf "settb=AVTB,setpts='trunc(PTS/1K)*1K+st(1,trunc(RTCTIME/1K))-1K*trunc(ld(1)/1K)',drawtext=fontsize=100:fontcolor=white:text='%{localtime}.%{eif\:1M*t-1K*trunc(t*1K)\:d}'" -an -f rtsp -rtsp_transport tcp rtsp://127.0.0.1:8554/stream0
  2. Use our DeepStream docker image nvcr.io/nvidia/deepstream:6.4-triton-multiarch to run the pipelines you attached:
gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://localhost:8554/output-stream \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvstreamdemux name=demux demux.src_0 ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://localhost:8554/output-stream \
uridecodebin uri=rtsp://your-server:8554/input-stream ! mux.sink_0
  3. Use VLC player to play the two RTSP streams generated.

The image I attached is a comparison of the two generated RTSP streams played on my laptop using VLC player.
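
(For reference, a typical VLC invocation; a sketch, with --network-caching in milliseconds lowered to reduce player-side buffering:)

vlc --network-caching=300 rtsp://127.0.0.1:8554/stream0 &
vlc --network-caching=300 rtsp://localhost:8554/output-stream &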

Ok. Thanks. I will test that.

OK, step by step.

STEP 1:

First I was using your way of publishing, with the cryptic, alphabet-soup-like “settb=AVTB,setpts='trunc(PTS/1K)*1K+st(1,t…” filter.

I was still using “my” RTSP server on that AWS instance, which is an instance of the great MediaMTX project.

Results are as expected: Input and output are a couple of hundred milliseconds apart without nvstreamdemux.

Pipeline 1:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://localhost:8554/output-stream \
uridecodebin uri=rtsp://127.0.0.1:8554/input-stream ! mux.sink_0

With nvstreamdemux, the output is now 2 seconds behind, no doubt.

Pipeline 2:

gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720  ! \
nvstreamdemux name=demux demux.src_0 ! \
nvv4l2h264enc ! rtspclientsink location=rtsp://localhost:8554/output-stream \
uridecodebin uri=rtsp://127.0.0.1:8554/input-stream ! mux.sink_0 

I was using ffplay for consumption, but even with a gst-launch pipeline: same results. 2 seconds. That’s not new. I also don’t believe that VLC would improve that.
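
(For reference, the gst-launch consumer I tried was along these lines; a sketch, with rtspsrc latency=0 so the receiver-side jitterbuffer adds no extra delay:)

# plain playback of the annotated stream; sync=false renders frames as they arrive
gst-launch-1.0 rtspsrc location=rtsp://your-server:8554/output-stream latency=0 ! \
decodebin ! videoconvert ! autovideosink sync=false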

STEP 2: I will now go and replace the RTSP server with the one you suggested, but I fear that will not change anything. I’m sure your DS installation differs from mine in some unknown way.

STEP 2 is already done. I see you are using the same RTSP server, just a dockerized and older version.

docker run --rm -it --network=host aler9/rtsp-simple-server

So that will definitely not bring an improvement, but I can try.

Yes, as anticipated: replacing the RTSP server with the older, dockerized version didn’t change the situation. I just tested the problematic pipeline: 2 seconds behind.

Now I’m going to use the dockerized 6.4.

OK, I don’t know what you are running there, but I would like to have that too.

With the docker image you told me about: I hope you see the two-second difference between input and output, do you?

Just to verify that I did it right:

  • I followed the instructions on the “Docker Containers” page of the DeepStream 6.4 documentation.
  • Created an NGC account, pulled the docker image you mentioned, installed the additional drivers via /opt/nvidia/deepstream/deepstream/user_additional_install.sh, and re-compiled GStreamer for the RTSP EoS issue by running “update_rtpmanager.sh” from /opt/nvidia/deepstream/deepstream/.
  • Ran the pipeline with nvstreamdemux at the docker prompt (container started as sketched below).
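
The container start was roughly this (a sketch; exact flags as per the DeepStream container docs):

# headless, so no X11 socket mounted; host networking so the RTSP ports line up
docker run --gpus all -it --rm --network=host nvcr.io/nvidia/deepstream:6.4-triton-multiarch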

And all I got was what I already had: 2 seconds latency.

And now?

I suppose you are running 6.5 already, in which this bug has been fixed. :))

And the best is still to come: the latency increases over time. :) Very nice…

This also explains why I saw way higher latencies in the beginning. It’s now already 3 seconds behind.

EDIT: But it returns to the known 2 seconds.

Strange: if you wait long enough, it seems to be OK… This is new to me.

EDIT: But most of the time the difference is 2 seconds.

Whatever the case, there is definitely a difference between running with and without nvstreamdemux. But I suppose you are not going to see or accept that, so I’m getting used to the thought of forgetting about this way of sharing a GPU…

We’ll analyze this latency issue with nvstreamdemux, which may take some time. Thanks.
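
In the meantime, if you want to narrow it down on your side, one option (a sketch, using GStreamer’s built-in latency tracer, which logs per-element latency to the debug output) would be:

# trace per-element latency of the problematic pipeline; fakesink removes the encoder/RTSP variables
GST_TRACERS="latency(flags=element)" GST_DEBUG="GST_TRACER:7" \
gst-launch-1.0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
nvstreamdemux name=demux demux.src_0 ! fakesink \
uridecodebin uri=rtsp://127.0.0.1:8554/input-stream ! mux.sink_0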