Not all frames captured from the RTSP source are processed by DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Tesla P100
• DeepStream Version 5
• NVIDIA GPU Driver Version (valid for GPU only) 440+

The experiment runs the deepstream-test5 application on a single RTSP source, which sends a limited number of frames (around 300) to the pipeline. The pipeline contains only a primary detector apart from the required components. The application runs in the Docker container from NGC.

Changes made to the default application:

  1. Probe added to the rtspsrc element to get information about the frames received by the application (see the sketch after this list).
  2. Custom logs added at various points in the pipeline.
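
For reference, a minimal sketch of how such a probe can be attached; the function names here are hypothetical. Since rtspsrc creates its RTP src pads dynamically, the probe is added from a pad-added handler:

  static GstPadProbeReturn
  process_rtp_buffer (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
  {
    GstBuffer *buffer = GST_PAD_PROBE_INFO_BUFFER (info);
    /* Inspect the RTP buffer here (see the snippet further below). */
    return GST_PAD_PROBE_OK;
  }

  static void
  on_rtspsrc_pad_added (GstElement *rtspsrc, GstPad *new_pad, gpointer user_data)
  {
    /* rtspsrc creates one src pad per RTP stream. */
    gst_pad_add_probe (new_pad, GST_PAD_PROBE_TYPE_BUFFER,
        process_rtp_buffer, user_data, NULL);
  }

  /* During pipeline construction: */
  g_signal_connect (rtspsrc, "pad-added",
      G_CALLBACK (on_rtspsrc_pad_added), NULL);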

Following are my observations.

  1. The application does not process the first few frames received from the RTSP source (for example, for my video source only 73 out of 323 frames are processed).
  2. The above number may vary from source to source.
  3. For some RTSP sources the application starts processing instantly; for others it starts processing late, as in the first point.
  4. To clarify, all 323 frames are captured by the application (verified by adding a probe to the rtspsrc), but they are somehow dropped instead of being processed.
  5. Looking further at the element-wise logs, the issue occurs in the decodebin, specifically in the nvv4l2decoder element. All elements before nvv4l2decoder receive all 323 frames, but nvv4l2decoder forwards only 73 (see the counting sketch after this list).
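
A minimal sketch of the kind of counting probe used for point 5. The element name "nvv4l2decoder0" is an assumption; the actual name can be checked with a pipeline graph dump:

  static GstPadProbeReturn
  count_decoded_frames (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
  {
    static guint64 decoded = 0;
    g_print ("nvv4l2decoder src buffer #%" G_GUINT64_FORMAT "\n", ++decoded);
    return GST_PAD_PROBE_OK;
  }

  /* Once decodebin has created its children: */
  GstElement *dec = gst_bin_get_by_name (GST_BIN (pipeline), "nvv4l2decoder0");
  if (dec) {
    GstPad *srcpad = gst_element_get_static_pad (dec, "src");
    gst_pad_add_probe (srcpad, GST_PAD_PROBE_TYPE_BUFFER,
        count_decoded_frames, NULL, NULL);
    gst_object_unref (srcpad);
    gst_object_unref (dec);
  }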

application_custom.log (125.1 KB)
The following logs are my custom logs along with the nvstreammux GStreamer logs, to give an idea of what is happening. In this file, the lines tagged deepstream_source_bin.cpp:1290:deepstream_process_rtp_buffer come from the probe on the rtspsrc, so they show all incoming packets. The output after processing is logged as deepstream_app_main.cpp:524:bbox_generated_probe_after_analytics <frame_no>. The frame numbers in these two logs may not correspond to the same frames; they are just serial numbers for the incoming and outgoing frames. As you can see, 323 frames are received but only 73 are processed, and these are the last 73 frames of the RTSP stream.

I was able to reduce the startup latency by setting the latency parameter on the source for RTSP over the TCP protocol. However, this parameter did not work for RTSP streams over the UDP protocol.
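
For reference, this is roughly what the relevant keys in the deepstream-app source group look like (the URI and the latency value are placeholders):

  [source0]
  enable=1
  # type 4 = RTSP
  type=4
  uri=rtsp://<server>/<stream>
  # 4 forces TCP; with the default (0) the protocol is negotiated
  select-rtp-protocol=4
  # jitter-buffer latency of the underlying rtspsrc, in ms
  latency=100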

Questions

  1. How can I reduce the latency of the nvv4l2decoder element? I need to ensure that all the frames captured by the rtspsrc element are processed.

Can you reproduce the frame dropping with the original deepstream-test5? If so, can you upload your config file for checking?

Hi @Fiona.Chen,

I am able to reproduce the same issue with deepstream-test5 as well. I cannot produce the logs as I did with the customized code. However, I saved the video and observed that only 73 frames were processed while the original stream had 323.
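
One way to count the frames in the saved file (a sketch using OpenCV; the file name is a placeholder):

  #include <opencv2/videoio.hpp>
  #include <iostream>

  int main ()
  {
    cv::VideoCapture cap ("out.mp4");   /* video saved by the sink */
    cv::Mat frame;
    int n = 0;
    while (cap.read (frame))            /* read until end of file */
      ++n;
    std::cout << "frames: " << n << std::endl;
  }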

To observe whether the remaining frames are received by the application, please put a probe on rtspsrc.

Please find the config files for the application.
application_config.txt (2.1 KB) detector_config.txt (752 Bytes)

For application_config.txt:
Please set “live-source=1” and “batch-size=1” in the “[streammux]” part, and please add “sync=0” in the “[sink0]” part.
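
That is, in application_config.txt (other keys unchanged):

  [streammux]
  live-source=1
  batch-size=1

  [sink0]
  sync=0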

Hi @Fiona.Chen,

I observe the same behaviour even with those options.

According to your description, you are comparing the RTP packet count with the decoded video frame count, right? I don’t know how you counted that, but it may not be reasonable. Take H.264 video as an example: according to the RTP spec and the H.264 RTP payload spec (RFC 3984 - RTP Payload Format for H.264 Video), the RTP packetization of H.264 differs between packetization modes, and there may not be exactly one compressed video frame per RTP packet.

Hi @Fiona.Chen, I am not comparing the number of RTP packets with the number of frames; I am deriving the frame number from the RTP packets. I agree that there can be multiple RTP packets for a single frame, but they will all have the same RTP timestamp. Hence, the problem boils down to counting the unique timestamps received by the application. I have found that the number of frames I calculate from the RTP packets matches the number sent by the RTSP source.

Apart from the above method, I saved the video from the pipeline and saw that it contains fewer frames than the RTSP stream.

  GstRTPBuffer rtp_buffer = GST_RTP_BUFFER_INIT;
  guint32 ts;
  guint16 seq_number;

  if (!gst_rtp_buffer_map (buffer, GST_MAP_READ, &rtp_buffer)) {
    return FALSE;
  }

  ts = gst_rtp_buffer_get_timestamp (&rtp_buffer);
  seq_number = gst_rtp_buffer_get_seq (&rtp_buffer);

  /* All RTP packets of one video frame share the same RTP timestamp,
     so each unique timestamp counts as one frame. */
  if (rtp_timestamp_framenum_map.find (ts) == rtp_timestamp_framenum_map.end ()) {
    last_index_frame_number += 1;
    rtp_timestamp_framenum_map[ts] = last_index_frame_number;
  }

  gst_rtp_buffer_unmap (&rtp_buffer);

If any NAL unit containing coded slice data is lost in the decoder, the decoded video will be corrupted. Have you observed such corruption? An RTP packet contains NAL units, but there are many NAL unit types, and not all of them carry coded slice data. See H.264 : Advanced video coding for generic audiovisual services (itu.int).
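
One way to check this from the existing rtspsrc probe is to look at the NAL unit type carried by each RTP payload. Below is a minimal sketch assuming the RFC 3984 H.264 payload format (payload_is_coded_slice is a hypothetical helper; STAP-A aggregation packets are not unpacked here). The payload can be obtained with gst_rtp_buffer_get_payload() and gst_rtp_buffer_get_payload_len() on the mapped RTP buffer:

  /* Returns TRUE if the RTP payload carries coded slice data
   * (NAL unit types 1..5). */
  static gboolean
  payload_is_coded_slice (const guint8 *payload, guint len)
  {
    if (len < 2)
      return FALSE;

    guint8 nal_type = payload[0] & 0x1F;   /* low 5 bits of NAL header */

    if (nal_type == 28)                    /* FU-A: fragmented NAL unit */
      nal_type = payload[1] & 0x1F;        /* real type is in the FU header */

    /* 1 = non-IDR slice, 2..4 = data partitions, 5 = IDR slice.
       Types such as 6 (SEI), 7 (SPS), 8 (PPS) carry no slice data. */
    return nal_type >= 1 && nal_type <= 5;
  }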