Delay in output RTSP sink when pipeline framerate is lower than source RTSP framerate

• Hardware Platform (Jetson / GPU) - Jetson TX2
• DeepStream Version - v5.0.1
• JetPack Version (valid for Jetson only) - 4.4, L4T 32.4.3
• TensorRT Version - 7.1.3
• Issue Type( questions, new requirements, bugs) - question

I am using a single 30 fps RTSP source as input to my DeepStream pipeline and have configured an RTSP sink. Inference runs at about 14-17 fps, which causes an ever-increasing delay in the sink RTSP stream, where ideally the sink RTSP stream would be as near-real-time as possible. I have verified that when the source RTSP framerate is lowered (e.g. to 15 fps), the delay in the sink RTSP stream goes away. Looking at the sink group configuration parameters, I would expect the qos flag to handle this and keep the RTSP sink as close to real-time as possible regardless of the inference framerate - but I could be wrong here.
I know that a potential solution would be to reduce the input dimensions, increase the inference interval, or increase the drop-frame-interval on the source to raise the inference FPS, but all of these options will cause a decrease in model accuracy, which I would like to avoid.
I have tried setting live-source in the streammux group to 1 and sync/qos in the sink group to 0, per this FAQ question, but this also doesn’t seem to reduce the delay.
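To make the effect concrete, here is a back-of-the-envelope sketch of why the delay grows without bound (using the 30 fps source rate from above and 15 fps as the low end of my observed inference rate):

```python
# Sketch of why the sink delay grows when inference is slower than the source.
source_fps = 30.0    # frames/s arriving from the RTSP source
pipeline_fps = 15.0  # frames/s the pipeline can actually process

backlog_per_sec = source_fps - pipeline_fps    # frames queued up each second
delay_growth = backlog_per_sec / pipeline_fps  # extra seconds of sink delay per second

print(f"backlog grows by {backlog_per_sec:.0f} frames/s; "
      f"sink delay grows by {delay_growth:.1f} s every second")
```

So unless frames are dropped somewhere, the sink stream falls further behind every second the pipeline runs.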

My question is: What is the best way to reduce the delay between the source and sink RTSP streams without compromising model accuracy? I’d like the sink RTSP stream to be as close to source RTSP, regardless of the inference/model speed.

My source/sink Deepstream configuration is as follows:

```
#Type - 1=CameraV4L2 2=URI 3=MultiURI 5=Camera(CSI)

## Set muxer output width and height

#Type - 1=FakeSink 2=EglSink 3=File 4=RTSP
# for mp4 output
```

You can set type=1 in [sink0]

#Type - 1=FakeSink 2=EglSink 3=File 4=RTSP
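For reference, a minimal [sink0] group reflecting this suggestion might look like the following (keys other than type are illustrative placeholders):

```ini
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSP
type=1
sync=0
```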

If it still cannot achieve the target fps, it looks like the model is too heavy. I suggest you reduce the model's load or set drop-frame-interval.

You can run sudo tegrastats to check the GPU load and see whether it is always at 100%.
You can also compare against lighter models such as ResNet10.

Thanks for your response @DaneLLL !

Setting the sink to a FakeSink, I'm seeing no improvement in pipeline FPS (per the `**PERF:` printouts). Looking at the GPU load, it is sitting at around 100% at all times while the application is running.

Increasing the drop-frame-interval to reduce the load does help to eliminate the sink stream delay, but this affects analytic performance: only 50% of frames are processed at the lowest drop-frame-interval setting (2), and even fewer for every value greater than that (3 = 33% of frames processed, 4 = 25%, etc.).
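For reference, the fraction of frames processed for a given drop-frame-interval n is simply 1/n, since the source keeps only one of every n frames:

```python
# drop-frame-interval=n keeps only 1 of every n frames.
for n in (2, 3, 4):
    print(f"drop-frame-interval={n}: {100 / n:.0f}% of frames processed")
```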

I have tried different models such as TrafficCamNet and PeopleNet, and there is no delay in the sink stream when using these, as the device can keep up with the source framerate. However, I am looking for a solution that does not require changing the model.

What I’m really looking for here is a way to use any model, have it run as quickly as the device can handle (drop-frame-interval=0), and still have the output/sink RTSP stream be as real-time as possible, without falling behind. Is this possible, or will we need to have some sort of compromise (using a lighter model, dropping frames, etc.) to achieve a real-time sink stream?

For TX2, it is probably a constraint of the GPU engine. You would need to consider using a lighter model.

Or you may try other platforms such as Xavier or Xavier NX. The GPU engine is more powerful and may be able to achieve the target fps.

Thank you for the input @DaneLLL.

We resolved the sink RTSP stream delay by setting smaller maximum buffer values (`max-size-time`, `max-size-bytes`, `max-size-buffers`) on the source GstQueue element, as well as configuring the queue to leak downstream (`leaky=downstream`).
This reduces the stream delay to be, at most, around 1.3-1.5 seconds behind the source RTSP stream at any given time for a single source input.
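As an illustration of why this bounds the delay, here is a plain-Python sketch (not DeepStream/GStreamer code; the capacity of 3 and the 2:1 rate mismatch are made up for the example). A downstream-leaky queue discards the oldest buffered frames when full, so the backlog between source and sink can never exceed the queue capacity:

```python
from collections import deque

# Sketch: a "leaky downstream" queue drops the OLDEST buffered frame when full,
# so the consumer always sees recent frames and the delay stays bounded.
QUEUE_CAPACITY = 3                    # analogous to a small max-size-buffers
queue = deque(maxlen=QUEUE_CAPACITY)  # deque drops from the head on overflow

# Producer: 30 frames arrive, but the consumer only drains every other tick,
# mimicking inference running at half the source framerate.
for frame_id in range(30):
    queue.append(frame_id)            # oldest frame silently leaks when full
    if frame_id % 2 == 0:             # slow consumer: drains at half the rate
        queue.popleft()

# Despite the 2:1 rate mismatch, the backlog never exceeds the queue capacity,
# and the queue holds only the most recent frames.
print(len(queue), list(queue))
```

Without the leaky setting, the queue (or upstream buffers) would instead grow without bound, which is exactly the ever-increasing delay described above.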


Hi cpeskin,
Can you give more details of the solution? I am facing the same issue and want to know how to keep the processing pipeline as close to the source frame rate as possible.

Hi @v.hunglx2 ,

Our solution did not increase the speed of our processing pipeline to match the source frame rate. Rather, we knew that our model was too "heavy" and we could not process frames quickly enough to match the source frame rate, which caused a delay between the source stream and sink stream. We were looking for a way to ensure that we are always processing the latest frame from the source, regardless of our pipeline speed, so that there is never a large delay between source and sink.
As mentioned in my last reply, we accomplished this by setting small maximum buffer parameters on the source GstQueue element, as well as configuring this queue to leak downstream. Although this didn't increase our pipeline frame rate or processing speed, we now buffer (at most) only a few frames, so there is no significant delay between the source RTSP stream and the sink stream.

Hope this helps!