Hello everyone. In TensorRT (TRT) there is a max batch size parameter which, to my understanding, controls how many data points are inferred at once during the inference stage. Is that right? When I look at the logs, I notice there is also a different parameter, namely the number of batches:
[TensorRT] INFO: Detecting input data format
[TensorRT] INFO: Dectected data format LCHW
[TensorRT] INFO: Verifying data format is uniform accross all input layers
[TensorRT] INFO: Verifying batches are the expected data type
[TensorRT] INFO: Executing inference
[TensorRT] INFO: Number of Batches: 1
[TensorRT] INFO: Execution batch size: 100
Is there any way to set this parameter to a value other than 1? What if I wanted to run 10 batches of size 10 instead of 1 batch of size 100? The most obvious way is to use a for loop, but that doesn't seem very TensorRT-ish to me. Also, when I run the for loop with 10 iterations and with 100 iterations, I get different results (I take the mean time per iteration, not the total time): with 100 iterations the throughput is about 10% better in frames per second. I hope that setting this number-of-batches parameter directly in the TRT inference engine would settle this issue for me, but I can't find any information about it. Is there any way?
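For reference, here is a minimal sketch of the for-loop approach I mean. It does not use the TensorRT API: `infer_fn` is a hypothetical placeholder for whatever call runs the engine on one batch, and `split_batches`, `mean_time_per_run` and `dummy_infer` are names made up for this example. The untimed warm-up passes are my own assumption about what might explain the gap between 10 and 100 iterations (one-time start-up cost being averaged over fewer runs); I have not confirmed that this is the cause.

```python
import time


def split_batches(data, batch_size):
    """Split a sequence into consecutive chunks of at most batch_size items."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


def mean_time_per_run(infer_fn, batches, iterations, warmup=5):
    """Return the mean wall-clock seconds for one full pass over all batches.

    infer_fn is a hypothetical callable standing in for the real engine call.
    The warm-up passes are executed but not timed.
    """
    for _ in range(warmup):
        for batch in batches:
            infer_fn(batch)

    start = time.perf_counter()
    for _ in range(iterations):
        for batch in batches:
            infer_fn(batch)
    elapsed = time.perf_counter() - start
    return elapsed / iterations


def dummy_infer(batch):
    # Placeholder work only; replace with the actual inference call.
    return [x * 2 for x in batch]


data = list(range(100))            # 100 dummy data points
batches = split_batches(data, 10)  # 10 batches of size 10

mean_10 = mean_time_per_run(dummy_infer, batches, iterations=10)
mean_100 = mean_time_per_run(dummy_infer, batches, iterations=100)
print(f"mean per run, 10 iterations:  {mean_10:.6f} s")
print(f"mean per run, 100 iterations: {mean_100:.6f} s")
```

With the warm-up excluded from the measurement, I would expect the two means to be closer to each other, but that is only a guess until it is tested against the real engine.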