Nvstreammux different stream FPS not handled correctly

Deepstream: 4.0.2
Platform: dGPU
API: python3-gst

I am trying to view multiple live streams (rtsp) using roughly the following pipeline: uridecodebin -> nvstreammux -> nvmultistreamtiler -> nvvideoconvert -> appsink.
It works, but the problem is that when the sources have different FPS they are displayed out of sync.
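For illustration, the topology above can be sketched as a launch string (a simplified, hypothetical two-source version; real python3-gst code would create one uridecodebin per source and link its dynamically-added pads to requested nvstreammux sink pads):

```python
# Hypothetical sketch of the pipeline topology described above; the URIs,
# resolution, and tiler layout are placeholders, not values from the post.
LAUNCH = (
    "uridecodebin uri=rtsp://cam0/stream ! m.sink_0 "
    "uridecodebin uri=rtsp://cam1/stream ! m.sink_1 "
    "nvstreammux name=m batch-size=2 width=1920 height=1080 ! "
    "nvmultistreamtiler rows=1 columns=2 ! nvvideoconvert ! appsink name=out"
)
```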

Although it seems counter-intuitive, the faster-FPS sources fall behind the slower ones, i.e. the visual timestamp of the slowest source advances normally, while the timestamps of the faster ones advance more slowly and lag increasingly.
To me this looks like nvstreammux is not pulling buffers correctly, so they queue up in the faster sources and build up lag.

This happens even with the following parameters (which should be fine according to my understanding of the DeepStream 4.0 manual):

  • batch-size = numberOfSources
  • live-source = true
  • batched-push-timeout = 10 ms
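For reference, these settings translate to something like the following (a minimal sketch; note that batched-push-timeout is specified in microseconds, which makes it easy to be off by a factor of ten, and the source count of 4 here is an assumption for illustration):

```python
def ms_to_us(ms):
    """nvstreammux expects batched-push-timeout in microseconds."""
    return int(ms * 1000)

number_of_sources = 4  # assumed for illustration
mux_props = {
    "batch-size": number_of_sources,
    "live-source": True,
    "batched-push-timeout": ms_to_us(10),  # 10 ms -> 10000 us
}
```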

I have tried many parameter combinations, and the only combination that keeps the sources in sync is the following (again, extremely counter-intuitive):

  • batch-size = numberOfSources
  • live-source = false
  • batched-push-timeout = 1 ms

Can you confirm that it’s a bug in the muxer? I could also work around this by putting a leaky queue between each source and the muxer, but I want to make sure I understand what is actually going on.

So I replaced the “appsink” element with “nveglglessink” and the sync issue did not manifest (which is probably why no one else has noticed this problem in the DeepStream examples).

Upon further investigation, I found that nvstreammux parameters like “live-source” and “batched-push-timeout” are irrelevant here. Instead, setting the appsink “qos” property to true (it is false by default) solved the problem.
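In launch-string terms, the fix boils down to one property on the sink (a hypothetical single-source fragment; the URI and mux resolution are placeholders). With qos=true, the appsink sends QoS events upstream so that late buffers get dropped instead of piling up:

```python
# Sketch: the key change is qos=true on the appsink, which defaults to false.
PIPELINE = (
    "uridecodebin uri=rtsp://camera/stream ! m.sink_0 "
    "nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvmultistreamtiler ! nvvideoconvert ! "
    "appsink name=sink qos=true"
)
```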

I’ve got to say, without a thorough understanding of GStreamer internals (the API is full of leaky abstractions), DeepStream is quite difficult to work with. Thankfully, the flexibility and performance are good enough to compensate.


Glad to know you fixed the issue yourself.

BTW, regarding your first comment, may I ask the FPS of the faster source? I saw your batched-push-timeout=10 ms; is this less than the frame time of the faster source?

I actually meant 100 ms (10 fps) but got lost in too many zeros. The fastest source is 20 fps, the slowest is 15 fps, but I only need a minimum of 10 fps from the pipeline.
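The comparison asked about above can be checked directly (an illustrative calculation, with frame times in microseconds, the unit batched-push-timeout uses):

```python
def frame_time_us(fps):
    """Frame interval in microseconds for a given FPS."""
    return 1e6 / fps

timeout_us = 100_000           # the intended 100 ms
fastest = frame_time_us(20)    # 50000 us
slowest = frame_time_us(15)    # ~66667 us
# The timeout exceeds both frame times, so with all sources online a batch
# normally fills up before the timeout fires.
```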

Anyhow, I am now migrating to DeepStream 5.0 and will test again.

It seems that qos is no longer required in DS 5.0 for correct handling of different FPS, but I can’t compare objectively because my code is now significantly different: I’m no longer using appsink, since images can be extracted with pyds.

Anyhow, I noticed another strange issue (which may or may not have been present in DS 4.0). It happens when one of the RTSP sources goes offline and the batch timeout starts being triggered.

  • If the remaining sources are all faster than batched-push-timeout, then all of them lag behind. That means I am no longer getting real-time images, and everything runs in slow motion.
  • If at least one of the remaining sources is slower than or equal to batched-push-timeout, then all of them remain in sync and everything runs fine in real time. This means some of them can be faster than the timeout without issue.
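A back-of-the-envelope model of the first case (an illustrative calculation, not DeepStream code): with one source offline, the mux can push at most one partial batch per timeout, so any source producing faster than one frame per timeout accumulates buffers.

```python
def buffer_growth_per_sec(source_fps, timeout_us):
    """Net buffers queued per second for one source while the mux is
    timeout-bound (a source is offline, so every batch waits out the timeout)."""
    push_rate = 1e6 / timeout_us  # batches, hence frames per source, per second
    return max(0.0, source_fps - push_rate)

# 20 fps source, 100 ms timeout: output at 10 fps, 10 buffers/s queue up -> slow motion
# 20 fps source,  50 ms timeout: output keeps pace, no growth
```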

I tried putting a leaky queue between each source and the streammux, but it made no difference. This is really strange; does streammux have its own internal queues that keep growing unbounded?
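For reference, the leaky queue attempt would look something like this fragment (a sketch; `leaky=downstream` makes the queue drop its oldest buffers, and `max-size-buffers=1` keeps only the newest frame):

```python
# Hypothetical fragment inserted between each uridecodebin and its
# nvstreammux sink pad; a leaky, single-buffer queue should shed stale frames.
LEAKY_QUEUE = "queue leaky=downstream max-size-buffers=1"
```

That this has no effect suggests the buffers accumulate downstream of the queue, inside the mux itself.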

Of course, this could be worked around by setting batched-push-timeout to match the fastest possible source, but that is not a good solution, because then a complete batch would never be assembled even when all sources are online. Plus, in all the examples, the timeout has a ridiculously high value of 4 seconds…

I tried changing the mux batch-size on the fly based on missing-source detection, but this property has no effect unless the mux is in the NULL or READY state. The documentation says nothing about changing batch-size on the fly, but this seems to be the only way to work around the sync issue (until the code itself is fixed).

I’m wondering: does anyone use DeepStream in production, or only for research? I cannot imagine how we are supposed to deliver a commercial application with these kinds of issues…

Hi,

We encountered a similar problem with the batch-size. We had several RTSP streams that were not always available, and we wanted to process them all with the same DeepStream pipeline. As soon as one of the RTSP streams became unavailable, the whole application would stop.

We solved this with a workaround: fixing the batch size and using one of our products (GstInterpipe) to switch dynamically between the RTSP sources and test sources. Don’t know if this helps.

The GstInterpipe page describes it as an open-source project, yet I cannot find any reference to the source code. Anyway, how is it different from / better than simply using appsink + appsrc or shmsink + shmsrc?

Hi,
The GstInterpipe code is available on the official repository.

The objective of GstInterpipe is to separate individual pipelines while allowing communication between them. For example:

source + encode + sink

can become:

source + interpipesink
interpipesrc + encode + interpipesink
interpipesrc + sink
  • One of these pipelines failing won’t cause the others to fail
  • You can switch pipelines dynamically, e.g. changing the encoder from h264 to vp9 in the above example
  • You can use it as an advanced tee element (multiple interpipesrc listening to the same interpipesink)
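As a concrete sketch of the split-pipeline idea (hypothetical element chains; an interpipesrc finds its producer via the `listen-to` property):

```python
# Three independent pipelines connected by interpipe: if the encode pipeline
# dies, capture and display keep running, and listen-to can be repointed at
# runtime to switch producers.
CAPTURE = "videotestsrc is-live=true ! interpipesink name=cam"
ENCODE  = "interpipesrc listen-to=cam ! x264enc ! interpipesink name=enc"
DISPLAY = "interpipesrc listen-to=cam ! autovideosink"
```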

@miguel.taylor how does the interpipe element play with deepstream components ?

We currently use GstInterpipe in a couple of projects that involve DeepStream, and it is working fine. There are some specific tweaks in the configuration of interpipesrc and interpipesink to avoid caps renegotiation on some DeepStream elements that don’t support it, but otherwise it works fine.

We have an example you can check that uses GstInterpipe and Gstreamer Daemon with deepstream:

Note: GStreamer Daemon is basically gst-launch on steroids; we use it a lot with GstInterpipe because it lets us handle each sub-pipeline individually with a simple API.

@miguel.taylor thanks for the info, will check it out. What’s the performance like when running multiple pipelines like this on Jetson devices like the Nano?

We have a project similar to that demo which runs on the TX2 and Nano. We are able to run 2 instances on the Nano with an RTSP input of 640x480 @ 15 fps; a third instance causes a drop in the framerate. I tested the same stream without GStreamer Daemon and GstInterpipe and obtained the same results.
