In my experiments, I found that a TensorRT inference engine initialized with a larger max_batch_size is slower than an engine initialized with a smaller max_batch_size.
For example, I have two engines: one initialized with max_batch_size = 32 (call it engine_a), and the other initialized with max_batch_size = 1 (call it engine_b). In my experiment, both engines run with batch_size = 1. engine_a's FPS (frames per second) is 113, but engine_b's is 125, which means engine_a is about 10% slower than engine_b even though the two engines are identical apart from the max_batch_size setting.
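For reference, this is roughly how I measure FPS. The sketch below assumes a callable `infer` that runs one batch through the engine (the actual TensorRT execution call is a placeholder you would substitute); the warmup loop is there because the first few runs can include one-time initialization cost.

```python
import time

def measure_fps(infer, num_warmup=10, num_iters=100):
    """Measure throughput (iterations per second) of an inference callable.

    `infer` is a hypothetical stand-in for whatever executes one batch
    on the engine (e.g. a TensorRT execution-context call).
    """
    # Warm up so one-time setup cost does not skew the measurement.
    for _ in range(num_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(num_iters):
        infer()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed
```

With batch_size = 1, each iteration is one frame, so the returned value is the FPS figure quoted above (e.g. `measure_fps(run_engine_a)` vs `measure_fps(run_engine_b)`, where `run_engine_a`/`run_engine_b` are hypothetical wrappers around each engine's execute call).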
Is this result normal?
In my application, the batch size is uncertain, so I would normally set max_batch_size very high (e.g. 32), but my experiment shows that the engine then becomes slower, so that does not seem like a good idea.