Reducing nvv4l2decoder load for H.264 stream

• Hardware Platform : dGPU (Nvidia A30)
• DeepStream Version 6.0
• TensorRT Version 8.0
• NVIDIA GPU Driver Version : 470
• Issue Type: question

I am using nvv4l2decoder to decode the input H.264 buffers coming from 32 channels in DeepStream 6.0. The input stream frame rate is ~30 FPS per channel, but nvv4l2decoder can only decode up to ~26 FPS per channel. Setting drop-frame-rate to a value other than 0 does not help, because frames are dropped only after every frame in the input has been fully decoded.
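To put the gap in numbers, here is a minimal sketch of the arithmetic. It assumes the ~26 FPS figure holds uniformly across all 32 channels and that the decoder's aggregate throughput could be redistributed freely between channels; neither has been verified, and the function name is only illustrative.

```python
def decode_budget(channels: int, source_fps: float, decoded_fps: float):
    """Compare the aggregate frame rate the sources deliver with what is decoded.

    Assumption: decoder throughput is a single pool shared evenly by all channels.
    """
    required = channels * source_fps      # frames/s arriving from the sources
    achieved = channels * decoded_fps     # frames/s actually decoded
    shortfall = required - achieved       # frames/s that cannot be decoded
    # Number of channels that would fit at the full source rate
    sustainable = int(achieved // source_fps)
    return required, achieved, shortfall, sustainable


required, achieved, shortfall, sustainable = decode_budget(32, 30.0, 26.0)
print(f"required {required:.0f} fps, achieved {achieved:.0f} fps")
print(f"shortfall {shortfall:.0f} fps ({shortfall / required:.1%} of the input)")
print(f"channels sustainable at full rate: {sustainable}")
```

Under those assumptions, roughly one frame in seven or eight would have to be skipped before decoding, or about five channels moved elsewhere.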

Furthermore, skipping frames before they enter nvv4l2decoder does not help either, because removing P frames from an H.264 stream corrupts the frames that are predicted from them.

I have tried setting the parameters of the decoder such as num_extra_surfaces without any improvement in the decoder speed.

I do not have access to the provider of the input streams and cannot change the frame rate. Is there any way to tackle the decoder bottleneck?

Thank you.

What is your stream’s resolution, profile and level?

I don’t have complete information on the camera specifications, but most of the input sources have 1920x1080 resolution and an average bitrate of ~4000 kbps.

Please refer to our performance data: Performance — DeepStream 6.0.1 Release documentation.

You can use “nvidia-smi dmon” to monitor the GPU and codec usage while you run your case.
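As a rough sketch of how that output could be summarised, the helper below parses captured dmon text and averages one column. The function names are hypothetical, and the SAMPLE text is invented for illustration, not a measurement from the A30. Column names are read from the header line rather than hard-coded, since the set of columns may differ between driver versions.

```python
def parse_dmon(text: str):
    """Turn captured `nvidia-smi dmon` text into a list of {column: value} dicts."""
    columns = []
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("# ").split()
            # The header line starts with "gpu"; the units line ("Idx ...") is skipped.
            if fields and fields[0] == "gpu":
                columns = fields
            continue
        if columns:
            rows.append(dict(zip(columns, line.split())))
    return rows


def mean_utilization(rows, column: str):
    """Average an integer column, ignoring '-' entries for unsupported metrics."""
    values = [int(r[column]) for r in rows if r.get(column, "-").isdigit()]
    return sum(values) / len(values) if values else None


# Invented sample, shaped like utilization-only dmon output.
SAMPLE = """\
# gpu    sm   mem   enc   dec
# Idx     %     %     %     %
    0    35    20     0    97
    0    38    22     0    99
"""

rows = parse_dmon(SAMPLE)
print("mean dec utilization:", mean_utilization(rows, "dec"))
print("mean sm utilization:", mean_utilization(rows, "sm"))
```

The text could be captured with something like `nvidia-smi dmon -s u -c 10` while the pipeline runs. If the dec column stays near 100% while sm remains low, that would suggest the hardware decoder, rather than inference, is the limiting unit.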

Hi @bolat.ashim, I am experiencing exactly the same issue. So far I haven’t found any solution other than decoding some of the video streams on the CPU, but the FPS on CPU is extremely low. I think we would need to find a way to drop packets before sending them to the decoder. That said, decoding a frame usually requires data from previous and/or next frames, so I am not sure data can be dropped. Dropping random data would certainly result in corrupted images. I am desperately looking for a solution, but I am not sure one exists. Please let me know if you find anything.
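One case where data can, in principle, be dropped before the decoder is slices that no other frame references. In H.264 the first byte of each NAL unit carries a 2-bit nal_ref_idc field, and a value of 0 marks the picture as not used for reference. The sketch below filters such slices out of an Annex B byte stream; the function names are hypothetical, and it assumes the stream is in Annex B format with start codes. Whether it helps depends entirely on the encoder: many cameras emit only reference P frames, in which case nothing is dropped.

```python
import re

# Annex B start code; the 4-byte form is this pattern preceded by one zero byte.
START_CODE = re.compile(b"\x00\x00\x01")


def split_nal_units(stream: bytes):
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    starts = [m.end() for m in START_CODE.finditer(stream)]
    units = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] - 3 if i + 1 < len(starts) else len(stream)
        # Trailing zeros belong to the next start code, not to this NAL unit.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            units.append(nal)
    return units


def is_droppable(nal: bytes) -> bool:
    """True for a non-IDR slice that no other picture uses for reference."""
    nal_ref_idc = (nal[0] >> 5) & 0x3
    nal_unit_type = nal[0] & 0x1F
    return nal_unit_type == 1 and nal_ref_idc == 0


def drop_non_reference(stream: bytes) -> bytes:
    """Rebuild the stream without non-reference slices."""
    kept = [n for n in split_nal_units(stream) if not is_droppable(n)]
    return b"".join(b"\x00\x00\x00\x01" + n for n in kept)
```

SPS, PPS, IDR and reference slices pass through untouched, so the remaining frames should still decode. Timestamps and any container or RTP framing would need separate handling, which this sketch does not attempt.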

