I have a TensorRT engine that was converted from ONNX with onnx2trt. When I load the engine and inspect its max_batch_size, it reports 32.
However, I only want to test a single image, and I cannot change engine.max_batch_size: even after setting my own max_batch_size variable to 1, the printed values of engine.max_batch_size and max_batch_size still differ.
Because engine.max_batch_size is 32, the wrong buffer size gets allocated during buffer allocation.
In the infer() stage, there is a step that prints the size of the host input buffer and of the flattened image. The output is:

```
self.inputs.host 88473600
img.ravel() 2764800
```
Since engine.max_batch_size is 32 and 32 * 2764800 = 88473600, the host buffer is 32 times the size of my single image, and this mismatch is where things go wrong for me.
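To double-check the numbers: the mismatch is exactly a factor of the batch size. Assuming the network input is a 1280x720 RGB image (my guess from the printed element counts, not confirmed by the model):

```python
# Verify that the host buffer is max_batch_size times the single-image size.
# The 1280x720x3 input shape is an assumption inferred from the counts above.
per_image = 720 * 1280 * 3        # elements in one flattened image
host_buffer = 32 * per_image      # elements allocated for max_batch_size = 32

print(per_image)    # 2764800, matches img.ravel()
print(host_buffer)  # 88473600, matches self.inputs.host
```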
```python
def load_engine(trt_runtime, engine_path):
    with open(engine_path, 'rb') as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    print("Engine.max_batch_size", engine.max_batch_size)
    return engine
```
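One workaround I am considering is to keep the oversized buffer but only fill (and read back) the first image's slice of it. This is a simplified sketch with NumPy stand-ins for the real host buffer and image, not my actual pipeline:

```python
import numpy as np

# Sketch: the engine's host buffer is sized for max_batch_size images,
# but for a batch of 1 only the first per_image elements matter.
# Sizes mirror the ones printed in my infer() step; the image is a stand-in.
max_batch_size = 32
per_image = 2764800                      # img.ravel().size for one image

host_buffer = np.zeros(max_batch_size * per_image, dtype=np.float32)
img = np.random.rand(720, 1280, 3).astype(np.float32)  # stand-in image

flat = img.ravel()
host_buffer[: flat.size] = flat          # copy only one image's worth of data

# The rest of the buffer stays zero and should be ignored for batch size 1.
assert np.array_equal(host_buffer[:per_image], flat)
```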
I have some questions about this:
- Why does engine.max_batch_size default to 32?
- How can I set engine.max_batch_size (as opposed to my ordinary max_batch_size variable)?
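From what I have read, engine.max_batch_size is read-only and is baked in when the engine is built (in implicit-batch mode), so I suspect the only way to change it is to rebuild the engine from the ONNX file with the builder's max_batch_size set to 1. This is a sketch for older implicit-batch TensorRT versions; the file paths are placeholders and I have not verified it end to end:

```python
import tensorrt as trt

# Sketch: rebuild the engine from ONNX with max_batch_size = 1.
# Assumes an implicit-batch TensorRT version; "model.onnx" is a placeholder.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))

builder.max_batch_size = 1               # baked into the engine at build time
engine = builder.build_cuda_engine(network)

with open("model_b1.trt", "wb") as f:
    f.write(engine.serialize())
```

Is this the right way to do it, or is there a way to get a batch of 1 out of the existing engine?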
Thanks in advance!