Input batch size is smaller than TensorRT engine batch size

I found that a TensorRT engine built with a batch size of 6 can be used to infer an input with a smaller batch size, such as 4. My question is: would it be more appropriate to use a TensorRT engine built with a batch size of 4 to infer on a batch-4 input? What difference does it make?

My setup is the following:

Jetson Xavier
DeepStream 5.0
JetPack 4.4
TensorRT 7.1.3
CUDA 10.2


Dynamic shape means a dimension can vary within a range [min, max].
An optimization profile is used to tell TensorRT the min/opt/max values for that dimension when building the engine; any batch size within [min, max] can then be fed at runtime, while TensorRT tunes its kernels for the opt value. So if you mostly infer with batch 4, setting opt to 4 gives the best performance for that case.
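As a rough illustration, an optimization profile with min=1, opt=4, max=6 can be set up like this with the TensorRT 7.x Python API. This is a hedged sketch, not your exact pipeline: the ONNX path, the input tensor name "input", and the shape (N, 3, 224, 224) are illustrative assumptions, and the import is guarded so the small range-check helper runs even without TensorRT installed.

```python
# Sketch: building an engine with a dynamic batch dimension via an
# optimization profile (TensorRT 7.x Python API). Input name "input"
# and shape (N, 3, 224, 224) are illustrative assumptions.
try:
    import tensorrt as trt
except ImportError:  # lets the helper below run without TensorRT installed
    trt = None

def batch_in_profile(batch, min_batch, max_batch):
    """A runtime batch size is valid only if it lies in the profile's range."""
    return min_batch <= batch <= max_batch

def build_dynamic_engine(onnx_path, min_batch=1, opt_batch=4, max_batch=6):
    assert trt is not None, "TensorRT is required to build an engine"
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # Dynamic shapes require an explicit-batch network.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        parser.parse(f.read())

    config = builder.create_builder_config()
    profile = builder.create_optimization_profile()
    # min/opt/max shapes for the input: any batch in [min_batch, max_batch]
    # is accepted at runtime, and kernels are tuned for opt_batch.
    profile.set_shape("input",
                      (min_batch, 3, 224, 224),   # min
                      (opt_batch, 3, 224, 224),   # opt
                      (max_batch, 3, 224, 224))   # max
    config.add_optimization_profile(profile)
    return builder.build_engine(network, config)
```

With this profile, feeding a batch of 4 to an engine whose range is [1, 6] is valid; the fixed batch-6 engine in your question behaves similarly but is only tuned for one batch size.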

Thank you.