Builder.max_batch_size no effect

jasdeepchhabra94 · October 24, 2020, 6:00pm

I am using a Xavier AGX 32GB for running inference. The network is a single GoogLeNet with 500x500 inputs. I am using TensorRT 7.

When creating the TensorRT engine from ONNX, I export the model with explicit shape dimensions, i.e. [N,H,C,W]. I am able to run inference with different batch sizes as long as they are less than or equal to N (the batch size the ONNX model was exported with). All this while, the builder.max_batch_size flag is set to 1.

Even if I set this flag to N, there is no speedup. Moreover, the documentation states:

The maximum batch size which can be used at execution time, and also the batch size for which the ICudaEngine will be optimized.

Is the documentation incorrect? I should not be able to run bs>1 if the flag is set to 1. Is this flag rendered meaningless when exporting from ONNX with explicit batch?

AakankshaS · October 26, 2020, 5:43pm

Hi @jasdeepchhabra94
You are using an explicit batch size. That means you are building the network, and running it, for that size only. If you want to vary the shape at runtime, you may need to either use implicit batch mode or use a dynamic-shape network with an optimization profile.

Thanks!
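
For readers who want to try the dynamic-shape route mentioned above, here is a minimal sketch rather than a definitive recipe. It assumes the ONNX model was re-exported with a dynamic batch axis, that the input is 3x500x500 (the channel count is a guess; the thread only gives 500x500), and that the TensorRT 7 Python API is in use. The function names `batch_profile_shapes` and `build_dynamic_batch_engine` are hypothetical helpers, not part of TensorRT.

```python
def batch_profile_shapes(max_batch, channels=3, height=500, width=500, opt_batch=None):
    """Return (min, opt, max) NCHW shapes where only the batch axis varies.

    The 3x500x500 defaults are assumptions made for illustration.
    """
    if max_batch < 1:
        raise ValueError("max_batch must be >= 1")
    if opt_batch is None:
        opt_batch = max_batch  # tune kernels for the largest batch by default
    if not 1 <= opt_batch <= max_batch:
        raise ValueError("opt_batch must lie in [1, max_batch]")
    chw = (channels, height, width)
    return (1, *chw), (opt_batch, *chw), (max_batch, *chw)


def build_dynamic_batch_engine(onnx_path, max_batch):
    """Build an explicit-batch engine whose batch size may vary from 1 to max_batch."""
    # Imported lazily so the shape helper above can be used without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)

    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # TensorRT 7 attribute; newer releases differ

    # The profile, not builder.max_batch_size, bounds the batch sizes allowed at runtime.
    min_shape, opt_shape, max_shape = batch_profile_shapes(max_batch)
    profile = builder.create_optimization_profile()
    profile.set_shape(network.get_input(0).name, min_shape, opt_shape, max_shape)
    config.add_optimization_profile(profile)

    # build_engine(network, config) is the TensorRT 7 call; later versions changed it.
    return builder.build_engine(network, config)
```

At inference time with such an engine, the execution context would need the actual input shape set before each run, for example with `context.set_binding_shape(0, (n, 3, 500, 500))` in TensorRT 7, where `n` is any batch size within the profile's range.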