I am using a Xavier AGX 32GB for inference. The network is a single GoogLeNet with 500x500 inputs. I am using TensorRT 7.
When creating the TensorRT engine from ONNX, I export the model with explicit shape dimensions, i.e. [N,H,C,W]. I am able to run inference with different batch sizes as long as they are less than or equal to N (the batch size the ONNX model was exported with). All this while, builder.max_batch_size is set to 1.
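A minimal sketch of the kind of build described above, assuming the TensorRT 7 Python API. The function name build_engine, the ONNX path argument, and the 1 GiB workspace size are placeholders for illustration, not values from the actual setup.

```python
def build_engine(onnx_path, max_batch_size=1):
    """Build an engine from an ONNX file using an explicit-batch network.

    Assumes the TensorRT 7 Python API. tensorrt is imported lazily so the
    module can be loaded on a machine without TensorRT installed.
    """
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # The ONNX parser in TensorRT 7 requires an explicit-batch network,
    # so the batch dimension N is part of the input tensor shape.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    # The flag in question. Placeholder workspace size of 1 GiB.
    builder.max_batch_size = max_batch_size
    builder.max_workspace_size = 1 << 30

    return builder.build_cuda_engine(network)
```

Calling build_engine("model.onnx", max_batch_size=1) and build_engine("model.onnx", max_batch_size=N) corresponds to the two configurations being compared here.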
Even if I set the above flag to N, there is no speedup. Moreover, the documentation describes this flag as:
"The maximum batch size which can be used at execution time, and also the batch size for which the ICudaEngine will be optimized."
Is the documentation incorrect? I should not be able to run batch sizes greater than 1 if the flag is set to 1. Is this flag rendered meaningless when the model is exported from ONNX with an explicit batch dimension?