We have set up a DeepStream pipeline that takes an H.264-encoded RTSP stream, decodes it, and performs inference on it using a YOLO detector.
The pipeline looks a bit like the following:
rtspsrc → rtph264depay → queue → nvv4l2decoder → queue → nvstreammux → nvinfer → nvstreamdemux → nvvideoconvert → nvdsosd → queue → fakesink
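For reference, here is a minimal single-camera sketch of that pipeline using the GStreamer Python bindings. The RTSP URL, resolution, batch size and nvinfer config path are placeholders, not our real settings, and element names are only there so later snippets can refer to them:

```python
#!/usr/bin/env python3
# Minimal single-camera sketch of the pipeline described above.
# URL, resolution, batch size and config path are placeholders.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

pipeline = Gst.parse_launch(
    # The muxer/inference branch is declared first so it can be referenced by name.
    "nvstreammux name=mux batch-size=1 width=1920 height=1080 batched-push-timeout=40000 ! "
    "nvinfer name=pgie config-file-path=config_infer_yolo.txt ! "
    "nvstreamdemux name=demux "
    "demux.src_0 ! nvvideoconvert ! nvdsosd ! queue ! fakesink sync=false "
    # Per-camera decode branch, ending in the leaky drop queue that feeds the muxer.
    "rtspsrc location=rtsp://camera.local/stream latency=200 ! "
    "rtph264depay ! queue name=pre_decode_queue ! nvv4l2decoder name=decoder ! "
    "queue name=dropqueue max-size-buffers=20 max-size-time=0 max-size-bytes=0 leaky=downstream ! "
    "mux.sink_0"
)

pipeline.set_state(Gst.State.PLAYING)
loop = GLib.MainLoop()
try:
    loop.run()
finally:
    pipeline.set_state(Gst.State.NULL)
```

In the real setup there is one such decode branch per camera (up to 6), each linked to its own mux.sink_N pad, and batch-size matches the number of connected cameras.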
We perform inference at 15 FPS and the camera streams are configured at 15 FPS as well. We allow up to 6 cameras to be connected at the same time, all of which are passed to nvstreammux. The inference interval is set depending on the number of cameras: with two cameras the inference interval is set to 1, so that each camera gets inference at 7.5 FPS and the available 15 FPS is used optimally.
In case the system slows down, the queue after the nvv4l2decoder (the second queue in the pipeline) can store up to 20 frames. As soon as this limit is reached, frames are dropped to ensure that no further delay builds up.
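A rough sketch of how the drop queue and the inference interval are configured, assuming the element names from the pipeline sketch above; the mapping from camera count to interval is illustrative (we only described the two-camera case):

```python
# Assumes a pipeline built as in the sketch above, with elements named
# "dropqueue" and "pgie". The interval formula below is illustrative.
drop_queue = pipeline.get_by_name("dropqueue")
drop_queue.set_property("max-size-buffers", 20)  # start dropping after 20 decoded frames
drop_queue.set_property("max-size-time", 0)
drop_queue.set_property("max-size-bytes", 0)
drop_queue.set_property("leaky", 2)              # 2 = downstream: drop old buffers when full

num_cameras = 2
pgie = pipeline.get_by_name("pgie")
# nvinfer "interval" is the number of batches skipped between inferences; with
# two cameras and interval=1 each camera is inferred at roughly 7.5 FPS.
pgie.set_property("interval", num_cameras - 1)
```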
For most of the cameras we use this system works fine; for Dahua cameras, however, we notice that frames are never dropped in the second queue but instead pile up over time in the first queue, indicating that the nvv4l2decoder cannot decode frames fast enough. This is especially the case when multiple Dahua cameras are connected and more frames need to be decoded.
A specific thing about the Dahua cameras is that frames seem to arrive at rtspsrc in bursts rather than as a steady stream.
Normally we expect the nvinfer module to be the bottleneck, which means we can drop decoded frames if nvinfer is too slow. In this case, however, we would have to drop encoded frames because the decoder is too slow, potentially introducing artefacts.
Is there a way to resolve this bottleneck, so we don’t have to drop encoded frames?
Things we have already tried:
set enable-max-performance to true
set disable-dpb to true
set enable-full-frame to true
different h264 encoding schemes (Main, Baseline, …)
modifying the i-frame interval to be smaller or bigger
using h265 instead of h264
None of these have resolved the issue of the frames piling up before the decoder.
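For completeness, this is roughly how the decoder flags listed above were set; which of these properties exist depends on the DeepStream/Jetson release, so check gst-inspect-1.0 nvv4l2decoder first:

```python
# Sketch of the decoder settings we tried; assumes the nvv4l2decoder element
# is named "decoder" as in the pipeline sketch above.
decoder = pipeline.get_by_name("decoder")
decoder.set_property("enable-max-performance", True)
decoder.set_property("disable-dpb", True)
decoder.set_property("enable-full-frame", True)
```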
We have also verified using jtop that the NVDEC hardware engine is actually working when the pipeline is running.
The frames from the rtspsrc do not actually arrive in bursts: the RTP packets do, but once they are depayloaded it is a steady stream.
We noticed that when we set the i-frame interval to a larger value like 75 (one i-frame every 5 seconds at 15 FPS), the decoder bottleneck seems to disappear. Our current i-frame interval is 15 (one i-frame per second) and we want to keep it that way.
An extra observation:
When we reduce the size of the second queue (the queue that drops frames after the decoder) to 2 instead of 20, we only get about 2-3 FPS inference instead of the expected 15 FPS. I assume this happens because with a small queue nvinfer is waiting on the output of the decoder, whereas with a larger queue nvinfer can keep taking frames, ensuring 15 FPS inference.
Thanks, with this pipeline we have the following observations:
The pipeline you suggested runs at 15 FPS without issues; there is no delay or reduced FPS.
When I introduce a queue that drops frames (“queue max-size-buffers=2 max-size-time=0 max-size-bytes=0 leaky=2”) after the decoder, the framerate drops to 1 FPS. When the i-frame interval is set to 75 instead of 15, it stays at 15 FPS instead of dropping to 1 FPS.
When I set the queue size to a maximum of 30 buffers instead of 2, we reach 15 FPS again.
However, when the inference components are reintroduced (nvstreammux, nvinfer, nvstreamdemux, nvdsosd), a delay slowly starts accumulating in the rtspsrc and frames are never dropped in the drop queue.
My questions are then:
How can the queue with a maximum size of 2 buffers, placed after the decoder, drop so many decoded frames that the framerate is reduced to 1 FPS? It does not do this for other camera streams, and it does not do this for longer i-frame intervals.
Increasing the queue size to 30 seemingly makes the problem go away, but actually makes frames slowly pile up before the decoder (about 2 minutes of delay over 5 days’ time). If the decoder is not the bottleneck (we expect the inference to be the bottleneck), why are these frames not processed by the decoder and then dropped in the queue when the inference is too slow?
The problem does seem to be uniquely linked to the Dahua camera stream.
In the end, I circumvented the issue by introducing a probe after the decoder that checks whether frames are piling up in the queue before the decoder; as soon as a certain number of frames is reached, frames are dropped in the probe (after the decoder).
This way only decoded frames are dropped and no delay is introduced.
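A sketch of that workaround, assuming the pre-decoder queue and the decoder are named as in the earlier pipeline sketch; the threshold of 20 pending frames is illustrative:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

MAX_PENDING_ENCODED_FRAMES = 20  # illustrative threshold

def drop_when_backlogged(pad, info, pre_decode_queue):
    # Number of encoded frames currently waiting in the queue before the decoder.
    backlog = pre_decode_queue.get_property("current-level-buffers")
    if backlog > MAX_PENDING_ENCODED_FRAMES:
        # Drop this decoded frame so the backlog before the decoder can drain
        # without throwing away encoded data (which would cause artefacts).
        return Gst.PadProbeReturn.DROP
    return Gst.PadProbeReturn.OK

pre_decode_queue = pipeline.get_by_name("pre_decode_queue")
decoder = pipeline.get_by_name("decoder")
decoder.get_static_pad("src").add_probe(
    Gst.PadProbeType.BUFFER, drop_when_backlogged, pre_decode_queue
)
```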
I tested this solution and it seems to work fine.
Thanks for the help.