Inference time with max batch size is smaller than with a smaller batch size

• Hardware Platform (GPU)
• DeepStream Version 6.3
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions)

Hi everyone, I created a pipeline to detect faces with yolov5-face. I converted a dynamic .onnx model to a dynamic .engine using trtexec with --maxShapes=input:30x3x640x640 (no minShapes, no optShapes).
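For reference, a sketch of such a trtexec invocation, including the min/opt shapes that were omitted (file names are examples; only the max shape is taken from the post):

```shell
# Build a dynamic-shape engine. --optShapes tells TensorRT which
# batch size to tune kernel selection for; without it, small batches
# may run with kernels chosen for other shapes.
trtexec --onnx=yolov5-face.onnx \
        --minShapes=input:1x3x640x640 \
        --optShapes=input:10x3x640x640 \
        --maxShapes=input:30x3x640x640 \
        --saveEngine=yolov5-face.engine
```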
When I run the pipeline with 30 identical videos at the same time, the total inference time with batch-size=10 is 448359 milliseconds, while with batch-size=30 it is 222725 milliseconds. I modified --maxShapes and ran with different batch sizes; a batch size smaller than the max batch size consistently took longer to infer than batch-size = max batch size.

My question:

  • Does DeepStream pad the batch to fill it up to the max batch size when fewer frames are available?
  • If not, what else could cause this behavior?

Thank you!

Partially yes. Please set --optShapes to the batch size you use most frequently in your case.

Several elements have a “batch-size” setting. Which element do you mean, nvstreammux or nvinfer?
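To illustrate the two settings in question, a sketch of where batch-size appears in a standard deepstream-app config file (section and key names follow the deepstream-app config format; the values are illustrative only):

```
[streammux]
# frames that nvstreammux gathers from the sources into one batch
batch-size=30

[primary-gie]
# batch size that nvinfer submits to the TensorRT engine;
# must not exceed the engine's max batch dimension (here 30)
batch-size=30
```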