Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) Jetson Orin-nano
• DeepStream Version DS v7.1
• JetPack Version (valid for Jetson only) Just Tested on Jetson
• TensorRT Version 8.6.2.3-1+cuda12.2
• Issue Type( questions, new requirements, bugs) questions
• How to reproduce the issue ? Run the given pipeline in more than 6 different threads.
• Requirement details
In our application we need to read H.264 RTSP streams from more than 100 cameras and, once per second, decode each stream within a 200 ms time window. With GStreamer software decoders such as avdec_h264, the CPU reaches its limit: latency becomes very high (more than 1 second for 10 cameras), and most streams miss frames for relatively long periods (sometimes 1-2 seconds).
pipeline_str = g_strdup_printf("rtspsrc location=%s name=rtsp protocols=tcp drop-on-latency=false latency=50 ! rtph264depay name=rtpdepay request-keyframe=true wait-for-keyframe=true ! h264parse ! avdec_h264 ! valve drop=true name=valve ! videoconvert ! video/x-raw,format=RGB ! pngenc compression-level=0 ! appsink name=appsink sync=true emit-signals=false drop=true max-buffers=1", URL);
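To make the timing requirement concrete, here is a minimal sketch of the "200 ms window every 1 second" logic that would decide when the valve is open. The helper name in_capture_window and its millisecond parameters are hypothetical and only for illustration; they are not taken from our actual application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): returns true while now_ms lies
 * inside the first window_ms milliseconds of each period_ms cycle. */
bool in_capture_window(uint64_t now_ms, uint64_t period_ms, uint64_t window_ms)
{
    /* The window must fit inside the period, and the period must be non-zero. */
    assert(period_ms > 0 && window_ms <= period_ms);
    return (now_ms % period_ms) < window_ms;
}
```

With period_ms = 1000 and window_ms = 200, the result could be negated and written to the valve's drop property (for example with g_object_set on the element named valve), so that frames pass only during the window.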
When we use the DeepStream components as below, the time drops dramatically to less than 15 ms, but the application crashes with a core dump when there are more than 6 cameras.
pipeline_str = g_strdup_printf("rtspsrc location=%s name=rtsp protocols=tcp drop-on-latency=false latency=50 ! rtph264depay name=rtpdepay request-keyframe=true wait-for-keyframe=false ! h264parse ! nvv4l2decoder ! nvvideoconvert ! video/x-raw, format=(string)I420 ! nvjpegenc quality=100 ! appsink name=appsink sync=true emit-signals=false drop=true max-buffers=1", URL);
I suspect that, because the DeepStream components use the dedicated NVDEC hardware, that hardware reaches its limits.
Q1: Is there any component for nvv4l2decoder and nvjpegenc that does the decoding batch by batch, for example 10 input streams at a time?
Q2: Is it possible to use the NVDEC library to create a GStreamer plugin that does this (see the image below)?
Please click here to see the image
For example, suppose we have 100 cameras. The custom plugin (based on NVDEC) would process 5 camera streams at a time. It gets 5 buffers from each h264parse at a time, decodes them, extracts the frames, encodes them to JPEG, and then starts on the next batch.
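The batching idea above can be sketched as simple index arithmetic: split the cameras into fixed-size batches and walk through them in order. This is only a rough illustration of the scheduling, not of the decoder itself; the names batch_range, batch_total and batch_at are hypothetical, and the real plugin would still have to feed each batch to NVDEC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one batch of cameras (illustration only). */
typedef struct {
    size_t first;  /* index of the first camera in the batch */
    size_t count;  /* number of cameras in the batch, 0 if out of range */
} batch_range;

/* Number of batches needed to cover n_cameras, rounding up. */
size_t batch_total(size_t n_cameras, size_t batch_size)
{
    assert(batch_size > 0);
    return (n_cameras + batch_size - 1) / batch_size;
}

/* Camera indices covered by batch number batch_index (0-based).
 * The last batch may be shorter than batch_size. */
batch_range batch_at(size_t n_cameras, size_t batch_size, size_t batch_index)
{
    batch_range r = {0, 0};
    size_t first;
    size_t remaining;

    assert(batch_size > 0);
    first = batch_index * batch_size;
    if (first >= n_cameras)
        return r;  /* past the end: empty batch */

    remaining = n_cameras - first;
    r.first = first;
    r.count = remaining < batch_size ? remaining : batch_size;
    return r;
}
```

For 100 cameras and a batch size of 5 this gives 20 batches, so the plugin would loop over batch_at(100, 5, i) for i from 0 to 19 and then start again from the first batch.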