Is there a way to balance the load for nvinferserver when running inference with an interval

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU A10
• DeepStream Version 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only) 535.161.08
• Issue Type( questions, new requirements, bugs) Questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hi, I am trying to build pipelines that hold more than 100 streams at 1080p in H.264/H.265 format. Inference runs through nvinferserver with Triton 23.12, and the detection model is YOLOv5s (already converted to a TensorRT .plan engine for performance). Currently the pipeline works perfectly on 5 streams (25 fps) with the nvinferserver interval set to 0; the whole pipeline is modified from deepstream-rtsp-in-rtsp-out for result visualization.
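
For context, the inference element is set up roughly like this (a minimal sketch following the deepstream-rtsp-in-rtsp-out sample; the config file name is a placeholder for my setup, and the interval can equivalently be set via input_control.interval in the nvinferserver config file):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# nvinferserver forwards batched frames to the Triton instance described in
# its config file (model repo, model name, max_batch_size, ...).
pgie = Gst.ElementFactory.make("nvinferserver", "primary-inference")
pgie.set_property("config-file-path", "config_infer_triton_yolov5s.txt")  # placeholder

# interval = number of consecutive batches skipped between inferences:
# 0 means infer on every batch; 24 means roughly one inference per second
# per stream at 25 fps.
pgie.set_property("interval", 24)
```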

However, switching to 100 streams with the nvinferserver interval set to 24 brings problems: both the probe that prints frame numbers and the output RTSP stream show a slight stall (a brief freeze) every other interval, which, in my opinion, is not a sign that the pipeline is working well. I tried reducing the number of streams, and 50 seems to be the maximum at which the pipeline can deliver a “fluent” output.
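
For reference, the probe I use to print frame numbers is roughly the following (a minimal sketch of the standard pyds batch-meta pattern from the Python samples; in my pipeline it is attached to the tiler’s sink pad):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def frame_number_probe(pad, info, u_data):
    """Buffer probe that prints per-stream frame numbers to spot stalls."""
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        print(f"stream {frame_meta.pad_index}: frame {frame_meta.frame_num}")
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# Attached after the tiler (element variable from the sample):
# tiler.get_static_pad("sink").add_probe(Gst.PadProbeType.BUFFER, frame_number_probe, 0)
```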

Through a series of tests, my guess is that nvinferserver waits until the end of the interval and then sends the data for all streams to the Triton server at once, causing a spike in the Triton workload. This leads to slower inference, and the results fail to catch up with the other frames flowing through the pipeline. Meanwhile the CPU load is below 10%, GPU memory use is about 6 GB, and volatile GPU utilization is under 50%. I am wondering: is there a way to average out the workload within a single pipeline?

According to the A10 video decoder spec, at most 37 x 1080p@30fps H.264 streams or 81 x 1080p@30fps HEVC streams can be supported, so I don’t think you can decode 100 H.264 or H.265 1080p@30fps streams on an A10. Video Codec SDK | NVIDIA Developer
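
As a rough budget check (simple arithmetic from the spec numbers above, assuming your sources run at 25 fps):

```python
# Rough NVDEC throughput budget for A10, using the decode matrix above.
H264_BUDGET_FPS = 37 * 30   # ~1110 frames/s of 1080p H.264 decode
HEVC_BUDGET_FPS = 81 * 30   # ~2430 frames/s of 1080p HEVC decode
SOURCE_FPS = 25

print("max 1080p@25 H.264 streams:", H264_BUDGET_FPS // SOURCE_FPS)  # 44
print("max 1080p@25 HEVC streams: ", HEVC_BUDGET_FPS // SOURCE_FPS)  # 97
```

The ~44-stream H.264 ceiling is roughly in line with the 50-stream limit you observed.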

You can monitor the GPU decoder loading to confirm this. The command is “nvidia-smi dmon”; the “dec” column reports NVDEC utilization.
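
If you prefer to sample it programmatically, something like this works (a sketch; it locates the dec column by header name, since the exact column set of dmon varies across driver versions):

```python
import subprocess

def sample_decoder_util(samples: int = 5):
    """Sample NVDEC utilization (%) via `nvidia-smi dmon -s u`."""
    out = subprocess.run(
        ["nvidia-smi", "dmon", "-s", "u", "-c", str(samples)],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    # The header line looks like "# gpu    sm   mem   enc   dec ..."
    header = next(l for l in out if l.startswith("# gpu")).lstrip("# ").split()
    dec_idx = header.index("dec")
    readings = []
    for line in out:
        if line.startswith("#"):
            continue
        cols = line.split()
        if len(cols) > dec_idx and cols[dec_idx].isdigit():
            readings.append(int(cols[dec_idx]))
    return readings

if __name__ == "__main__":
    print("NVDEC utilization samples (%):", sample_decoder_util())
```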

Yes, you are correct. Thanks for your help; this is really important information, and I will pay more attention to it.