Is the default value of engine.max_batch_size 32?

Hi all,

I have a TensorRT engine that was converted from ONNX with onnx2trt.
When I load this engine and print its max_batch_size, it shows 32.
However, I only want to test a single image, and I cannot set engine.max_batch_size. (I already set max_batch_size to 1, but when I print them, engine.max_batch_size and max_batch_size show different values.)

Because engine.max_batch_size is 32, the allocate_buffers(engine) stage creates buffers sized for 32 images instead of one.

In the infer() stage, there is a step below:

np.copyto(self.inputs[0].host, img.ravel())

Printing the sizes of both arrays gives:

self.inputs[0].host 88473600
img.ravel() 2764800

Since engine.max_batch_size is 32, the host buffer holds 32 * 2764800 = 88473600 elements.
The sizes do not match, so this is where it goes wrong for me.
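
To make the arithmetic concrete, here is a minimal NumPy-only sketch of the mismatch and one possible workaround: copying the single image into the first slice of the oversized host buffer. The image shape (3, 720, 1280) and the helper names are assumptions for illustration (the shape was only chosen because 3 * 720 * 1280 = 2764800); this is not the actual allocate_buffers() code.

```python
import numpy as np

MAX_BATCH_SIZE = 32            # value reported by engine.max_batch_size
IMAGE_SHAPE = (3, 720, 1280)   # hypothetical shape; 3 * 720 * 1280 = 2764800


def host_buffer_size(binding_shape, max_batch_size):
    """Element count of a buffer sized for the maximum batch."""
    return int(np.prod(binding_shape)) * max_batch_size


def copy_batch_to_host(host, batch):
    """Copy a (possibly smaller) batch into the front of the host buffer."""
    flat = batch.ravel()
    if flat.size > host.size:
        raise ValueError("batch does not fit into the host buffer")
    # Only the first flat.size elements are written; the rest stay untouched.
    np.copyto(host[:flat.size], flat)
    return flat.size


# Stand-in for self.inputs[0].host: sized for 32 images.
host = np.zeros(host_buffer_size(IMAGE_SHAPE, MAX_BATCH_SIZE), dtype=np.float32)
# Stand-in for the single preprocessed image.
img = np.ones(IMAGE_SHAPE, dtype=np.float32)

copied = copy_batch_to_host(host, img)
print(host.size, img.size, copied)
```

If the engine was built with an implicit batch dimension, my understanding is that the batch size actually used is the one passed at execution time, so a maximum of 32 only acts as an upper bound and running a single image should still be possible.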

See:

def load_engine(trt_runtime, engine_path):
    # Read the serialized engine from disk and deserialize it
    with open(engine_path, 'rb') as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    return engine


Engine.max_batch_size 32

I have some questions about this.

  1. Why does engine.max_batch_size default to 32?
  2. How do I set engine.max_batch_size? (Not the regular max_batch_size)

Thanks in advance!

o Linux distro : Ubuntu 18.04
o GPU type : 1060
o Nvidia driver version : 440
o CUDA version : 10.0
o CUDNN version : 7.6.5
o Python version [if using python] : 3.6.9
o Tensorflow and PyTorch version : TF 1.14
o TensorRT version :


The default max batch size in onnx2trt is 32. Please refer to the link below:

You can either pass the -b option to onnx2trt to generate an engine with a different max batch size, or set the max batch size directly through the TensorRT APIs (see the link below).
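
As a rough sketch of the second option, the function below builds an engine from an ONNX file with the TensorRT Python API and sets the max batch size on the builder. It assumes an older TensorRT release (5.x/6.x style) where the ONNX parser still accepts an implicit-batch network; from TensorRT 7 the ONNX parser expects an explicit-batch network, where the batch size is part of the input shape instead. The function name, the workspace size, and the default values are placeholders, not something from the original post.

```python
def build_engine_from_onnx(onnx_path, max_batch_size=1, workspace_bytes=1 << 28):
    """Build an implicit-batch engine with an explicitly chosen max batch size.

    Assumes a TensorRT 5.x/6.x style Python API; adjust for newer releases.
    """
    # Imported lazily so this file can be loaded on a machine without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with trt.Builder(logger) as builder, \
            builder.create_network() as network, \
            trt.OnnxParser(network, logger) as parser:
        # Builder-level setting; the built engine reports it as
        # engine.max_batch_size.
        builder.max_batch_size = max_batch_size
        builder.max_workspace_size = workspace_bytes  # placeholder value

        # Parse the ONNX model and surface parser errors if it fails.
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                errors = [str(parser.get_error(i))
                          for i in range(parser.num_errors)]
                raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

        return builder.build_cuda_engine(network)
```

Calling build_engine_from_onnx("model.onnx", max_batch_size=1) with your own model path should then give an engine whose max_batch_size prints as 1, so allocate_buffers(engine) would size the buffers for a single image.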




So if I use the ONNX parser to generate the TensorRT engine, I should not run into this problem, right?