• Hardware Platform (Jetson / GPU): GPU, RTX 3090
• DeepStream Version: 6.0
• TensorRT Version: 8.0.1
• NVIDIA GPU Driver Version (valid for GPU only): 495.29.05
• Issue Type( questions, new requirements, bugs): questions
If I measure the FPS of the pipeline at the fakesink element and get, for example, 10 FPS, how can I tell which of the following is true? Either the original RTSP stream has only 10 FPS and the pipeline keeps up with it, or the RTSP stream has 30 FPS and the nvinfer element is a bottleneck because a heavy model can process only 10 FPS.
To find out, I tried measuring the FPS at two points of the pipeline: the original FPS of the RTSP stream at the nvstreammux element and the final FPS at the fakesink element. The idea was as follows: if both FPS values are the same, the pipeline can handle the RTSP stream. If they differ (that is, the FPS at fakesink is lower while the FPS at nvstreammux still matches the RTSP stream), the pipeline cannot keep up. In that case, the pending frames should also pile up in the queue1 element.
The DeepStream pipeline is as follows:
uridecodebin -> nvstreammux -> queue1 -> nvinfer -> queue2 -> fakesink
For my example I use an IP camera sending 30 FPS. As for the models, I have a light and a heavy engine. The light model can process more than 30 FPS and the heavy one can process only 10 FPS.
The FPS is measured at the nvstreammux and fakesink elements with the enable_perf_measurement function:
fps_pad1 = gst_element_get_static_pad(streammux, "src");
enable_perf_measurement(&perf_struct1, fps_pad1,
                        num_sources,
                        perf_measurement_interval_sec,
                        perf_cb);
fps_pad2 = gst_element_get_static_pad(fakesink, "sink");
enable_perf_measurement(&perf_struct2, fps_pad2,
                        num_sources,
                        perf_measurement_interval_sec,
                        perf_cb);
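To make clear what I mean by "FPS at a measurement point", here is a minimal sketch of the counting itself, in plain C without any GStreamer dependency. It is not the DeepStream implementation of enable_perf_measurement; FpsCounter, fps_counter_init and fps_counter_on_buffer are hypothetical names used only for illustration. The assumption is that fps_counter_on_buffer is called once for every buffer passing the chosen pad (in a GStreamer application, for example from a buffer pad probe) with a monotonic timestamp in nanoseconds.

```c
#include <stdint.h>

/* Hypothetical helper, not part of DeepStream: counts the buffers seen at
 * one measurement point and reports the average FPS once per interval. */
typedef struct {
    uint64_t window_start_ns; /* start of the current measurement window */
    uint64_t interval_ns;     /* minimum window length, e.g. 5 seconds */
    uint32_t buffers;         /* buffers counted in the current window */
    double   last_fps;        /* result of the last completed window */
} FpsCounter;

static void fps_counter_init(FpsCounter *c, uint64_t now_ns,
                             uint64_t interval_ns)
{
    c->window_start_ns = now_ns;
    c->interval_ns = interval_ns;
    c->buffers = 0;
    c->last_fps = 0.0;
}

/* Call once per buffer. Returns 1 when a window has just been completed
 * and c->last_fps was updated, otherwise 0. */
static int fps_counter_on_buffer(FpsCounter *c, uint64_t now_ns)
{
    uint64_t elapsed;

    c->buffers++;
    elapsed = now_ns - c->window_start_ns;
    if (elapsed < c->interval_ns)
        return 0;

    /* Average rate over the time that actually elapsed in this window. */
    c->last_fps = (double)c->buffers * 1e9 / (double)elapsed;
    c->window_start_ns = now_ns;
    c->buffers = 0;
    return 1;
}
```

A counter like this only sees the buffers that actually pass the pad, so it reports the rate at which buffers flow through that point, not necessarily the rate at which the camera produces them.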
With the light engine model in the nvinfer element, both FPS values (at nvstreammux and at fakesink) are 30 FPS, which is what I expected. However, with the heavy model in the nvinfer element, both FPS values are only 10 FPS. This is surprising, because I expected the FPS measured at nvstreammux to stay at 30 FPS, with the pending frames stored in the queue1 element waiting to be processed. But no frames are queued in queue1, because its current number of buffers is 1 all the time. This suggests that the pending frames are queued somewhere else, for example in the uridecodebin element.
So I explored the uridecodebin element and found the following elements inside it:
source
- manager
- rtpsession0
- rtpssrcdemux0
- funnel0
- funnel1
- rtpstorage0
- rtpptdemux0
- rtpjitterbuffer0
decodebin0
- rtph264depay0
- h264parse0
- capsfilter0
- nvv4l2decoder0
Instead of measuring the FPS at nvstreammux, I tried the elements inside decodebin, such as nvv4l2decoder, h264parse and rtph264depay, but all of them gave me the same 10 FPS. I also tried to explore the elements in the manager, but those handle packets rather than frames, so measuring FPS there is difficult.
Do you have any idea where the pending frames or packets are stored when the RTSP stream has 30 FPS but nvinfer can process only 10 FPS? How can I measure the original FPS of the RTSP stream so that the result is not affected by the nvinfer element?
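One direction I am considering for the packet level is to estimate the frame rate from RTP timestamps instead of counting buffers. The sketch below is only an idea under stated assumptions, not a tested solution: estimate_fps_from_rtp is a hypothetical helper, and how to read the RTP timestamps from the elements inside the manager is not shown. It relies on two properties of H.264 over RTP: the timestamp clock runs at 90 kHz, and all packets belonging to one frame carry the same timestamp. It also assumes that consecutive frames have different timestamps and that no packets are lost in the sampled range.

```c
#include <stddef.h>
#include <stdint.h>

#define RTP_VIDEO_CLOCK_HZ 90000.0 /* RTP clock rate for H.264 video */

/* Hypothetical helper: estimates the sender's frame rate from the RTP
 * timestamps of received packets, given in arrival order. Packets of one
 * frame share a timestamp, so only timestamp changes count as new frames. */
static double estimate_fps_from_rtp(const uint32_t *ts, size_t n)
{
    int64_t span_ticks = 0; /* signed sum of timestamp differences */
    size_t intervals = 0;   /* number of frame-to-frame transitions */
    uint32_t prev;
    size_t i;

    if (n == 0)
        return 0.0;

    prev = ts[0];
    for (i = 1; i < n; i++) {
        if (ts[i] == prev)
            continue; /* another packet of the same frame */
        /* Unsigned subtraction followed by a signed cast handles 32-bit
         * wraparound of the RTP timestamp. */
        span_ticks += (int32_t)(ts[i] - prev);
        intervals++;
        prev = ts[i];
    }

    if (span_ticks <= 0)
        return 0.0;
    return (double)intervals * RTP_VIDEO_CLOCK_HZ / (double)span_ticks;
}
```

Because the RTP timestamps are set by the sender, an estimate like this should reflect the camera's frame rate even when packets are consumed more slowly on the receiving side. For a 30 FPS stream, consecutive frames would differ by 3000 ticks.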