According to RTSP spec RFC 2326: Real Time Streaming Protocol (RTSP) (rfc-editor.org), what is transferred through RTSP is encoded video stream and the control data, the data packets are not in frame, how can you drop frames?
I’m aware of the underlying video transport and the network abstraction layer. Dropping frames obviously can only happen after NAL units were decoded to actual frames via nvv4l2decoder. In fact, nvv4l2decoder offers that feature for dropping every nth frame statically via drop-frame-interval as discussed already. But thanks for clarifying that again. Video decode is not the bottleneck, so we will happily decode all frames even if some of them will be dropped downstream.
We have “drop-frame-interval” property with nvv4l2decoder plugin.
Not of any use to us in real world applications, as there is no way to set an actual target framerate. If we set this to “2”, our 30 fps video stream will become 15 fps. But our 10 fps stream will become 5 fps and now data quality from motion tracking might probably start to degrade. Our customers define their own RTSP sources from their cameras, so there is no way for us to know what the input framerate will be and which drop-frame-interval to configure to achieve some sort of “per-videostream-compute-budget”, if that makes sense.
What does “synchronisation” mean?
Simplified: If the presentation timestamp (PTS) which is attached to the GstBuffer currently being up for processing by the nvinfer stage substantially deviates from the real world clock, the pipeline is not synchronised anymore. We are searching for a best-effort solution which will synchronize this by “skipping” some of the next buffers.
Currently there is no way inside DeepStream to drop frames dynamically.
For multiple input sources deepstream already schedules batches dynamically to my best understanding. I don’t see why the nvstreammux (or nvinfer directly) wouldn’t be able to look at the PTS and detect whether the pipeline is desynced in a configurable “max live latency” setting. There is actually a gstreamer plugin called videorate which does something very similar. It will drop or duplicate frames until a new target framerate is achieved. Unfortunately this still is not dynamic - so if the total pipeline latency crosses a certain threshold we are back to non-realtime execution again and it’s very hard to know at how many concurrent streams that happens exactly.
@Fiona.Chen would it be possible to check again whether the new nvstreammux plugin (or potentially any other plugin) could support such a “max-live-latency” setting? In that mode frames would only be scheduled if they aren’t too old according to their PTS and compared against the max-live-latency value. Otherwise it would be dropped.
Furthermore, GStreamer itself has features to tackle this issue. The QoS article goes into more detail about this, especially how video decoders can be setup to handle QoS messages emitted from pipeline sinks to drop frames dynamically. I don’t believe nvv4l2decoder currently implements this appropriately?
Thanks for the support so far.