Description
In implicit batch mode, a TensorRT engine has a “maxBatchSize” parameter, which we can set via IBuilder::setMaxBatchSize. My plugin calculates the workspace size it needs in IPluginV2::getWorkspaceSize, and TensorRT makes sure that the batchSize argument passed to IPluginV2::enqueue does not exceed maxBatchSize.
However, in explicit batch mode, the batchSize parameter of IExecutionContext::enqueue has no effect (at least since TensorRT 8.5.3), and IPluginV2::enqueue can receive a batchSize argument greater than maxBatchSize.
This is the problem: the workspace size is calculated from maxBatchSize, but at execution time nothing guarantees that batchSize <= maxBatchSize (maybe there is a way I don’t know of), so there is a risk of overflowing the workspace.
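To make the risk concrete, here is a minimal sketch of the mismatch. It is not real plugin code: MockPlugin, kBytesPerSample and sizedForBatch are hypothetical names, and the class deliberately does not derive from nvinfer1::IPluginV2 so that it compiles without the TensorRT headers. It only mirrors the shape of getWorkspaceSize(maxBatchSize) and the batchSize argument of enqueue, and shows one possible defensive check: fail instead of writing past the workspace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-sample scratch requirement, chosen only for illustration.
constexpr std::size_t kBytesPerSample = 1024;

// Hypothetical stand-in for a plugin. It does NOT derive from
// nvinfer1::IPluginV2; it only mimics the two calls discussed above.
struct MockPlugin {
    // Batch size the workspace was sized for (assumption: the plugin caches it).
    std::int32_t sizedForBatch = 0;

    // Mirrors the shape of IPluginV2::getWorkspaceSize(int32_t maxBatchSize).
    // The real method is const; this mock mutates state to keep the sketch short.
    std::size_t getWorkspaceSize(std::int32_t maxBatchSize) {
        assert(maxBatchSize >= 0);
        sizedForBatch = maxBatchSize;
        return static_cast<std::size_t>(maxBatchSize) * kBytesPerSample;
    }

    // Mirrors the batchSize argument of IPluginV2::enqueue.
    // Returns 0 on success and non-zero on failure, like the real enqueue.
    std::int32_t enqueue(std::int32_t batchSize) const {
        if (batchSize > sizedForBatch) {
            // Using the workspace here would overflow it, so refuse to run.
            return 1;
        }
        return 0;
    }
};
```

With a workspace sized for a batch of 4, enqueue(4) succeeds and enqueue(5) is rejected. This only turns the overflow into an error; it does not answer the question of how to request a larger workspace.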
My question is: as a plugin author, how do I request a large enough workspace when getWorkspaceSize takes only one parameter, maxBatchSize (still the case as of TensorRT 10.7.0)? I’d be glad to hear your suggestions.