When pulling hundreds of camera streams at the same time, the number of output results does not match the number of cameras

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
6.3.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
8.5
• NVIDIA GPU Driver Version (valid for GPU only)
535.54.03
• Issue Type( questions, new requirements, bugs)
questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

I configured the pipeline to read only keyframes and to pull 85 cameras at the same time over RTSP. Each camera’s keyframe interval is 50 frames and its frame rate is 25 FPS, so in theory the pipeline should output 85 images every two seconds. However, the logs show that many cameras take a long time to output their first image, and even after that the number of images is not as expected. Why is this?
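The expectation above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the post (85 cameras, keyframe interval 50, 25 FPS); the function names are made up for illustration.

```python
# Expected keyframe throughput, using only the figures quoted in the post.

def keyframe_period_s(gop_length: int, fps: float) -> float:
    """Seconds between two keyframes when every gop_length-th frame is a keyframe."""
    return gop_length / fps


def expected_keyframes(cameras: int, gop_length: int, fps: float, window_s: float) -> float:
    """Keyframes expected from all cameras together during window_s seconds."""
    return cameras * window_s / keyframe_period_s(gop_length, fps)


period = keyframe_period_s(gop_length=50, fps=25)
per_minute = expected_keyframes(cameras=85, gop_length=50, fps=25, window_s=60)
print(f"one keyframe every {period:.1f} s per camera, "
      f"{per_minute:.0f} keyframes per minute in total")
```

Note that the cameras do not emit their keyframes at the same instant, so the 85 images are likely spread across each two-second window rather than arriving together, and a newly connected stream can only start decoding once its next keyframe arrives.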

While the 85 cameras were connected over RTSP for keyframe reading, I ran the nload -u M command to monitor the traffic. The average is about 34M/s.
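To put the traffic figure in perspective, it can be converted to a per-camera bitrate. The sketch below assumes that the uppercase M unit in nload means megabytes per second, and that each camera sends its complete encoded stream with non-key frames being discarded only after they have been received; neither assumption is confirmed in this thread.

```python
def per_camera_mbit_s(total_mbyte_s: float, cameras: int) -> float:
    """Average bitrate per camera in Mbit/s, given the total traffic in MByte/s."""
    return total_mbyte_s * 8 / cameras


rate = per_camera_mbit_s(total_mbyte_s=34, cameras=85)
print(f"about {rate:.1f} Mbit/s per camera")
```

If the result is far below the bitrate configured on the cameras, that would point towards the network or the receiving side dropping data rather than towards the decoder.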

Or is there a good way to check whether the program is really running and pulling all 85 cameras at the same time? I always feel like this program isn’t working as expected.
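One way to answer this question is to count frames per source instead of only counting total output. The sketch below is plain Python and does not touch DeepStream; `StreamTracker` is a hypothetical helper, and in a real probe the source id would have to come from the frame metadata (for example the `source_id` field of `NvDsFrameMeta` in the Python bindings).

```python
import time
from collections import defaultdict


class StreamTracker:
    """Records, per source id, when the first frame arrived and how many frames followed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.first_seen = {}            # source id -> seconds after start
        self.counts = defaultdict(int)  # source id -> frames seen so far

    def record(self, source_id):
        """Call once per frame, e.g. from inside the buffer probe."""
        if source_id not in self.first_seen:
            self.first_seen[source_id] = self._clock() - self._start
        self.counts[source_id] += 1

    def missing(self, expected_ids):
        """Source ids that have not produced a single frame yet."""
        return sorted(set(expected_ids) - set(self.first_seen))


tracker = StreamTracker()
for sid in (0, 1, 1, 3):   # simulated frames from sources 0, 1 and 3
    tracker.record(sid)

print("sources without any frame:", tracker.missing(range(4)))
print("frames per source:", dict(tracker.counts))
```

Logging `missing(...)` and the per-source counts every few seconds would show which cameras are late or silent, which is more informative than a single total.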

How did you pull the 85 streams? Are the cameras all TCP connections?

  1. Have you measured the processing time of the probe function? The probe function is a blocking callback, so it blocks the whole pipeline while it runs.

  2. Have you monitored the latency of the streams in rtpjitterbuffer? We are not sure whether the frames of the 85 streams reach the RTSP client (receiver) at the same time.

Thank you for your reply. I have never tested these two points. I am still a newbie and don’t know how to test them. I see that you gave a hyperlink for the second point. How do I measure the probe processing time?
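A common way to measure a callback's processing time is to wrap it in a timing decorator. The following is a minimal sketch, not DeepStream-specific: `timed`, `probe_ms` and `example_probe` are hypothetical names, and the `time.sleep` call only stands in for the real probe body.

```python
import functools
import time


def timed(stats):
    """Decorator that appends each call's duration in milliseconds to stats."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the wrapped function raises
                stats.append((time.perf_counter() - start) * 1000.0)
        return wrapper
    return decorator


probe_ms = []


@timed(probe_ms)
def example_probe(pad, info, user_data):
    time.sleep(0.005)  # stand-in for the real work done in the probe
    return "OK"


for _ in range(3):
    example_probe(None, None, None)

print(f"calls={len(probe_ms)} "
      f"avg={sum(probe_ms) / len(probe_ms):.1f} ms max={max(probe_ms):.1f} ms")
```

Applied to the real probe, the average and maximum can be logged periodically. With 85 cameras each delivering one keyframe every two seconds, roughly 23 ms is available per frame on average, so a probe that regularly takes longer than that would hold the pipeline back.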

Please make sure you are familiar with GStreamer and have the necessary coding skills before you start with DeepStream. The rtpjitterbuffer element belongs to the open-source GStreamer project; please refer to the GStreamer resources.

BTW, have you confirmed that your GPU can decode 85 streams? See Video Codec SDK | NVIDIA Developer.

Okay, I’ll check out the relevant documentation

I’m very sorry, I haven’t understood the document you linked above. I did find some suggestions for optimizing DeepStream performance in the official tutorials and tried them (scheme 11 has not been tried yet), but they had no effect. When I run the Python program with 40 cameras connected over RTSP and only keyframes read, the average network speed is 20M/s, the hardware decoder utilization is only about 2%, and the GPU utilization is about 3%. In that case the number of output results is basically the same as the number of cameras, but when more cameras are connected the number of output results is much smaller than the number of cameras. I feel that the bottleneck is not the hardware but the network, or perhaps the code. Can you help me check whether there is something wrong with my code? Thank you so much.

The unit of the nvstreammux “batched-push-timeout” property is microseconds. Please make sure you have set the value properly.

Have you measured the time of “filter1_src_pad_buffer_probe”?

According to your code, there is only one RTSP URI.

Thanks for your quick reply

  1. The camera frame rate is 25 FPS.
  2. How do I measure it? By printing timestamps?
  3. My RTSP address is normally passed in through an API call. To make debugging and viewing easier for you, I moved it into the main function. Do you have more RTSP streams for testing?

I observed that, when reading keyframes, the number of log lines printed by “logging.info(f"image_put_path: {cv_type}–{img_path}")” in the “filter1_src_pad_buffer_probe” method does not match the number of cameras.

Please refer to DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums for the nvstreammux property and parameters settings.

Okay, thank you for the link. Could you connect more RTSP streams and check whether my code works normally? This problem is causing me a lot of trouble and I may not be able to solve it on my own.

You set the nvstreammux “batched-push-timeout” property to 0.04 µs with the code "streammux.set_property(‘batched-push-timeout’, 1 / 25)", so the number of frames in a batch may not equal the nvstreammux batch size.

Thanks for your reply. I set ‘streammux.set_property(‘batched-push-timeout’, 1000000 / 25)’ as suggested by your link, but the number of outputs is still inconsistent with the number of connected cameras. For example, the camera frame rate is 25 FPS and the keyframe interval is 50 frames, and I want to output only keyframes. If I want to output the results of 100 cameras in one batch, what should this property be set to?

Are there any other problems in the code? If the code is fine, is the bottleneck the network bandwidth? If it is a network problem, how can I test it? Hundreds of cameras in the park are currently working normally. Please help me; I have tried many methods to no avail, and I am very distressed.

I set ‘streammux.set_property(‘batched-push-timeout’, 50000 / 25)’. Does that mean a batch of data is pushed every 2 seconds? I also connected 84 cameras over RTSP and monitored the traffic with the command ‘nload -u M’: the average bandwidth is 30M/s, while 84 1080p images are streamed every two seconds. I think there is something wrong with this speed.
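The three timeout values tried in this thread can be compared once a unit is fixed. The sketch below assumes that “batched-push-timeout” is an integer number of microseconds, as described in the nvstreammux documentation; the helper names are made up for illustration, and whether a timeout of one keyframe interval suits this pipeline is not established in the thread.

```python
US_PER_SECOND = 1_000_000


def seconds_to_us(seconds: float) -> int:
    """Convert seconds to whole microseconds (the property takes an integer)."""
    return round(seconds * US_PER_SECOND)


def us_to_ms(us: float) -> float:
    """Convert microseconds to milliseconds."""
    return us / 1000


# Values tried in the thread, interpreted as microseconds (assumption)
for label, value in [("1 / 25", 1 / 25),
                     ("1000000 / 25", 1000000 / 25),
                     ("50000 / 25", 50000 / 25)]:
    print(f"{label} -> {value:g} us = {us_to_ms(value):g} ms")

one_frame_us = seconds_to_us(1 / 25)          # one frame interval at 25 FPS
one_keyframe_gap_us = seconds_to_us(50 / 25)  # one keyframe interval (50 frames)
print(one_frame_us, one_keyframe_gap_us)
```

Under that assumption, 50000 / 25 is 2 ms rather than 2 seconds; a two-second timeout would be 2000000, passed as an integer, for example `streammux.set_property('batched-push-timeout', one_keyframe_gap_us)`.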

The CPU and GPU loads are fine. A network issue is possible.