TensorRT builder->setMaxBatchSize(maxBatchSize); question

I’ve seen in the TensorRT developer guide that there is a call:

builder->setMaxBatchSize(maxBatchSize);
with explanation:
‣ maxBatchSize is the size for which the engine will be tuned. At execution time, smaller batches may be used, but not larger.

But I am not quite clear about this parameter. Could anyone help me understand it?

What does batch size mean here? Is it the same as the batch size in deep neural network training, or does it mean something else?


It is the same notion as the batch size you use in training.

TensorRT creates an inference engine at initialization.
The maximum batch size is required to allocate memory for the network.

Once the maximum batch size is set, you can launch the engine with any batch size <= that value.
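
To make the sizing rule above concrete, here is a minimal sketch. It does not call TensorRT at all: MockEngine and its members are hypothetical names used only for illustration, and the only assumption is that buffers are sized for the maximum batch while each run may use fewer samples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an engine built with a fixed maximum batch size.
// This is not the TensorRT API; it only mirrors the sizing rule described above.
struct MockEngine {
    int maxBatchSize;          // the value that would be given to setMaxBatchSize()
    std::size_t inputVolume;   // elements per sample, e.g. 3 * 224 * 224
    std::size_t outputVolume;  // elements per sample, e.g. number of classes

    // Buffers are allocated once, sized for the largest allowed batch.
    std::vector<float> allocateInput() const {
        return std::vector<float>(static_cast<std::size_t>(maxBatchSize) * inputVolume);
    }
    std::vector<float> allocateOutput() const {
        return std::vector<float>(static_cast<std::size_t>(maxBatchSize) * outputVolume);
    }

    // A run is valid only when 1 <= batchSize <= maxBatchSize.
    bool canRun(int batchSize) const {
        return batchSize >= 1 && batchSize <= maxBatchSize;
    }

    // Number of output elements actually written by a run with this batch size.
    std::size_t outputElements(int batchSize) const {
        assert(canRun(batchSize));
        return static_cast<std::size_t>(batchSize) * outputVolume;
    }
};
```

With maxBatchSize = 8, a run with batch size 4 is allowed and uses only the first half of each buffer, while a run with batch size 9 is rejected.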

My actual question is this: for an image classification task, I need to classify images faster on the TX2. If I set maxBatchSize=8, will it be much faster than maxBatchSize=1?
If not, what exactly is maxBatchSize used for? As I understand it, batch size is only useful for stochastic gradient descent during training, not for inference or real deployment on the TX2. Is that right?

Is it possible to pass, for example, 8 images at once and get back a vector of results? I mean “batch inference” in imageNet.


The input of TensorRT is in NCHW format.
N indicates the batch size of the network.

For example, N=8 means classifying eight images with one execute() call.
Speed per call, from fastest to slowest: one batch=1 call > one batch=8 call > eight batch=1 calls.

So if you need to classify eight images at a time (e.g., from 8 different input streams), you can launch TensorRT once with batch=8 instead of calling it eight times with batch=1 to get better performance.
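
As a rough sketch of what "eight images in, a vector of results out" can look like on the host side, the helpers below pack several images into one contiguous N x C x H x W buffer and then pick the best class per image from a flat N x numClasses output. packBatch and argmaxPerSample are hypothetical helper names, not part of TensorRT, and the inference call itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy each image (already flattened to C*H*W floats) into one contiguous
// batch buffer laid out as N x C x H x W. Hypothetical helper, not TensorRT API.
std::vector<float> packBatch(const std::vector<std::vector<float>>& images,
                             std::size_t sampleVolume) {
    std::vector<float> batch;
    batch.reserve(images.size() * sampleVolume);
    for (const auto& image : images) {
        assert(image.size() == sampleVolume);  // every image must have the same shape
        batch.insert(batch.end(), image.begin(), image.end());
    }
    return batch;
}

// Given a flat output of batchSize x numClasses scores, return the index of
// the highest score for each image in the batch.
std::vector<int> argmaxPerSample(const std::vector<float>& scores,
                                 std::size_t batchSize, std::size_t numClasses) {
    assert(scores.size() >= batchSize * numClasses);
    std::vector<int> best(batchSize, 0);
    for (std::size_t n = 0; n < batchSize; ++n) {
        const float* row = scores.data() + n * numClasses;  // scores of image n
        for (std::size_t c = 1; c < numClasses; ++c) {
            if (row[c] > row[best[n]]) best[n] = static_cast<int>(c);
        }
    }
    return best;
}
```

The packed buffer would be copied to the device as the input binding, and the output binding would be copied back into a vector of batchSize x numClasses floats before calling argmaxPerSample.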


I would like to know what the corresponding output is when I set maxBatchSize to 8.
For input (3,224,224) with maxBatchSize set to 1, the output is (1,101).
For input (3,224,224) with maxBatchSize set to 8, my output is still (1,101). Why is it not (8,101)?


Hi, longzhu_71

It looks like you have already filed another topic:

Let’s track this on the new topic directly.

So, if the training data is 1 million images, will the training occur in batches of 8 (processed in parallel)?
I am new to TensorRT. Please help.


TensorRT is designed for inference and is not recommended for training.
So you will need to check the training framework of your model for the details.

For example, you can set the batch size to 64, 128, 256, … in TensorFlow: