RTSP latency does not work with NVSTREAMMUX

• Hardware Platform (Jetson / GPU) Jetson Nano
• DeepStream Version V5.0 GA
• JetPack Version (valid for Jetson only) 4.4

Hi all,

I am having an issue getting smooth video playback when streaming from my IP cameras on my Jetson Nano. This can be seen in the deepstream-app sample application, even with only one RTSP source and with inferencing and tracking turned off. The displayed video seems to skip frames, which is very noticeable when there are moving objects (people or cars).

After further investigation using only gst-launch-1.0, it seems that nvstreammux does not respect any latency on the rtspsrc.

Running either of the following commands (without nvstreammux) produces buttery smooth playback. Note that I have specified an unusually long latency here; 500ms generally works well for me. The default in the deepstream-app is 100ms, which produces the jumpiness, but the 5 second latency in the examples demonstrates the behaviour best. On running the command, the video playback shows one frame and then pauses for 5 seconds, after which it streams smoothly.

gst-launch-1.0 rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! nvoverlaysink display-id=1
gst-launch-1.0 rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! nvvideoconvert ! nvegltransform ! nveglglessink

Adding nvstreammux to the pipeline causes the 5s buffer to be “forwarded”/“eaten up”. On running the command, the video playback also shows the one frame, pauses for 5 seconds, and then “forwards” the video until the 5s buffer is gone, and the jumpiness returns. Note that this is a 1920x1080 @ 25fps stream.

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true \
  ! nvoverlaysink display-id=1

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true \
  ! nvegltransform ! nveglglessink sync=false

Any help in this regard would be appreciated.

Tagging @jasonpgf2a @mdegans @miguel.taylor @DaneLLL @bcao @mchi

Regards.

Hi,
The reference config file for Jetson Nano is

source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_nano.txt

Please set sync=0 in [sink0] and give it a try.
And if your source is not in 30fps, you would need to adjust batched-push-timeout in [streammux].
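For reference, a minimal sketch of how a batched-push-timeout value could be derived from the source frame rate. It assumes the timeout should equal one frame interval in microseconds, which matches the 40000 used for the 25fps stream elsewhere in this thread; the helper name frame_interval_us is hypothetical.

```shell
#!/bin/sh
# Sketch: derive a batched-push-timeout value (microseconds) from a frame rate.
# Assumption: the timeout is one frame interval, e.g. 40000 us at 25 fps.
# frame_interval_us is a hypothetical helper, not part of DeepStream.
frame_interval_us() {
    fps="$1"
    # Integer division: 1,000,000 us per second divided by frames per second.
    echo $((1000000 / fps))
}

for fps in 25 30; do
    echo "fps=$fps batched-push-timeout=$(frame_interval_us "$fps")"
done
```

The resulting number would go into batched-push-timeout in [streammux], or into the nvstreammux property of the same name on a gst-launch-1.0 command line.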

My deepstream-app config file is based on the reference file you mentioned. I have set up my 8 RTSP sources and configured sync=0 on [sink0] and live-source=1 on [streammux] as per the DeepStream FAQ, with no difference. As mentioned, this is not related to an inferencing bottleneck, as the issue persists even without inferencing and tracking (please see my post again).

Please try out the gst-launch-1.0 commands I have provided against a RTSP stream and you will see what I mean.

Please note that the following gst-launch-1.0 command using nvcompositor respects the rtspsrc latency and provides super smooth streaming.

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH2 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH3 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH4 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH5 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH6 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH7 latency=1000 ! decodebin ! queue ! comp. \
  rtspsrc location=$RTSP_PATH8 latency=1000 ! decodebin ! queue ! comp. \
  nvcompositor name=comp \
  sink_0::xpos=0 sink_0::ypos=0 sink_0::width=640 sink_0::height=360 \
  sink_1::xpos=640 sink_1::ypos=0 sink_1::width=640 sink_1::height=360 \
  sink_2::xpos=1280 sink_2::ypos=0 sink_2::width=640 sink_2::height=360 \
  sink_3::xpos=0 sink_3::ypos=360 sink_3::width=640 sink_3::height=360 \
  sink_4::xpos=640 sink_4::ypos=360 sink_4::width=640 sink_4::height=360 \
  sink_5::xpos=1280 sink_5::ypos=360 sink_5::width=640 sink_5::height=360 \
  sink_6::xpos=0 sink_6::ypos=720 sink_6::width=640 sink_6::height=360 \
  sink_7::xpos=640 sink_7::ypos=720 sink_7::width=640 sink_7::height=360 \
  sink_8::xpos=1280 sink_8::ypos=720 sink_8::width=640 sink_8::height=360 \
  ! nvoverlaysink display-id=1 

The issue seems to be with nvstreammux not respecting the latency of the rtspsrc
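The tile coordinates in the nvcompositor command above follow a simple grid rule (column = index mod 3, row = index div 3, each tile 640x360). A small sketch that prints the same pad properties, in case the layout needs to be regenerated for a different tile size or stream count; tile_props is a hypothetical helper and nothing is launched here. Note that it prints one entry per source (sink_0 to sink_7), whereas the command above also lists sink_8.

```shell
#!/bin/sh
# Sketch: print nvcompositor pad properties for a grid of equally sized tiles.
# tile_props is a hypothetical helper, not part of GStreamer or DeepStream.
# Arguments: number of tiles, columns per row, tile width, tile height.
tile_props() {
    count="$1"; cols="$2"; tile_w="$3"; tile_h="$4"
    i=0
    while [ "$i" -lt "$count" ]; do
        x=$(( (i % cols) * tile_w ))   # column index times tile width
        y=$(( (i / cols) * tile_h ))   # row index times tile height
        echo "sink_${i}::xpos=${x} sink_${i}::ypos=${y} sink_${i}::width=${tile_w} sink_${i}::height=${tile_h}"
        i=$((i + 1))
    done
}

# Eight 640x360 tiles, three per row, as in the command above.
tile_props 8 3 640 360
```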

Hi,
Does it work if you run

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true \
  ! nvoverlaysink sync=0

We suggest running with sync=0 for RTSP sources, and would like to know whether this mode works.

Hi @DaneLLL,

Adding sync=0 does not help. Adding sync=0 to any of the above samples causes the streams to start playing immediately with no buffering, and the jumpiness then occurs on all samples, even those without nvstreammux.

The question would then be with sync=0 on the sink, how to ensure a buffer for each stream of 500ms, for example?

Hi,
So we don’t handle this case in 5.0 GA. We will reproduce the issue and evaluate supporting it.

Would like to confirm if the issue can be observed by running the pipeline:

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true \
  ! nvoverlaysink

And comparing the case with/without latency=5000?

There is no update from you for a period, assuming this is not an issue any more.
Hence we are closing this topic. If need further support, please open a new one.
Thanks

Sorry for the late reply.

Yes, this does occur with the above pipeline with latency=5000. It will show the initial frame, then wait 5000ms (5sec) and start playing. If one looks at an RTSP source with an embedded timestamp (e.g. IP camera), one will see it fast-forward the stream until it catches up with the latest frame, after which the jumpiness/laggy display issue returns.

If one runs the above pipeline without the latency=5000, it defaults to the rtspsrc default of 2000ms. In this case it will behave the same as above, but will wait only for 2sec and will catch up faster - with the jumpiness/laggy display issue occurring.

If one runs the above pipeline with latency=0, it will start playing immediately and the jumpiness/laggy display issue will also be visible.

Therefore, it looks like nvstreammux does not honour the rtspsrc’s latency.
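To make the three cases above easy to repeat, here is a small sketch that prints the three command lines (latency=5000, the rtspsrc default, and latency=0). It only echoes the commands and runs nothing; mux_pipeline is a hypothetical helper, and $RTSP_PATH1 is the same placeholder used throughout this thread.

```shell
#!/bin/sh
# Sketch: print the three nvstreammux test pipelines compared in this post.
# mux_pipeline is a hypothetical helper; the commands are printed, not executed.
mux_pipeline() {
    latency_arg="$1"   # e.g. "latency=5000", or empty to use the rtspsrc default
    echo "gst-launch-1.0 rtspsrc location=\$RTSP_PATH1 ${latency_arg} ! decodebin ! mux.sink_0 nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true ! nvoverlaysink"
}

# Explicit 5 s latency, rtspsrc default latency, and no latency buffer.
for arg in "latency=5000" "" "latency=0"; do
    mux_pipeline "$arg"
done
```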

Running the following pipeline, it will show the initial frame, then wait 5000ms (5sec) and then start playing. There will be no fast-forward, and the stream will play back smoothly.

gst-launch-1.0 rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! nvoverlaysink

Therefore, it seems to be an issue with nvstreammux.

@DaneLLL any confirmation on the following remark and my answer above?

Hi,
We can observe the issue. With nvstreammux in the pipeline, the video playback is laggy. We will check further and update.

Thanks for letting me know.

I appreciate you looking into this, as the visual impression a user currently gets is not good.

@DaneLLL, something I have noticed while testing this again.

If I run the compositor sample above with 8 RTSP streams with different latency values, jtop reports the following:

  • Smaller than 500 (ms) NVDEC is running at 716MHz
  • Between 500 and 1000 (ms), NVDEC is running at 396MHz, and intermittently up to 650MHz
  • Greater than 1000 (ms), NVDEC is consistently running at 396MHz

A latency of 500ms provides super smooth playback for me.

When running your latest provided command:

gst-launch-1.0 \
  rtspsrc location=$RTSP_PATH1 latency=5000 ! decodebin ! mux.sink_0 \
  nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true \
  ! nvoverlaysink

jtop reports the following:

  • NVDEC=Off after the first frame is shown and the playback is paused/buffered
  • NVDEC=396MHz when it starts playing and fast-forwards after the 5 second pause
  • NVDEC=716MHz after it has caught up (and the lagginess returns)

This might suggest that the hardware decoder is overloaded and becoming a bottleneck because there is no “latency buffer”.

Hi,
Please try adding nvstreamdemux to the pipeline:

$ gst-launch-1.0 rtspsrc location=rtsp://10.19.107.87:8554/test latency=5000 ! decodebin ! mux.sink_0 nvstreammux name=mux batch-size=1 batched-push-timeout=40000 width=1920 height=1080 live-source=true ! nvstreamdemux name=demx demx.src_0 ! nvoverlaysink

We can run the pipeline without seeing laggy video playback. Since nvstreammux batches the sources into a GstBuffer, nvstreamdemux is needed to de-batch it.

@DaneLLL, the pipeline you provided does work with no laggy display. It shows the initial frame, then waits for 5000ms (5sec) and then starts playing. There is also no “fast-forward”. It does not work if sync=false is set on the sink as recommended.

The problem though is that this option does not work with the general deepstream pipeline (source(n) -> nvstreammux -> nvinfer -> nvtracker -> nvtiler -> nvosd -> sink) where one uses nvtiler for tiled display.

It also does not work with the provided Deepstream samples (e.g. deepstream-app) as mentioned in my initial post.

Hi,
We will check the general deepstream pipelines and update.

Hi,
We have not tested this kind of scenario; it is more like a feature enhancement. No solution is available for the 5.0 GA release, as it needs changes in the nvstreammux component. We would like to understand the specific requirement or use-case so that we can evaluate supporting it in a future release.

@DaneLLL thank you for your confirmation that nvstreammux does not handle this general case correctly.

In my use-case:

  • I need to process 8x 1080p RTSP streams in real-time in the Jetson Nano, and maybe more streams with lower resolution.
  • I am already looking to expand onto the Jetson Xavier NX to process even more 1080p RTSP streams
  • I need to display all streams tiled, or a specific stream on a connected display

Therefore I need the visual display to be smooth without any laggy/stuttering issues, as the visual impression a client gets by looking at the display is that the software is not processing in realtime / is laggy. This is really a show stopper!

In my test case the network latency is very low, and smooth streaming is possible, as can be confirmed with the nvcompositor sample above. This is not the case with nvstreammux in the pipeline, resulting in poor user perception.

I would further argue that having a “latency buffer” is very important to provide smooth stream playback in any software. Looking at the following defaults also suggests this (although I would run at lower latencies):

As mentioned above, this will probably alleviate the pressure on the decoder as well.