The SDK documentation says that when working with dynamic shapes, the batch dimension must be explicit, but I don't know the actual batch size in advance because it varies from case to case. If I set the batch size to the maximum, I worry it will hurt efficiency. Could somebody give me some suggestions?
Just pass the explicit-batch flag when you create the network, before you load it: `IBuilder::createNetworkV2(1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH))`.
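In context, a minimal sketch of that call (the `builder` pointer and surrounding setup are assumed; this needs the TensorRT headers and runtime to compile and run):

```cpp
// Sketch: create a network definition with an explicit batch dimension
// (TensorRT 7 C++ API). Assumes `builder` is a valid nvinfer1::IBuilder*
// obtained from nvinfer1::createInferBuilder(logger).
#include "NvInfer.h"

nvinfer1::INetworkDefinition* makeExplicitBatchNetwork(nvinfer1::IBuilder* builder)
{
    // kEXPLICIT_BATCH makes the batch dimension part of each tensor's shape,
    // which is required before any dimension can be marked dynamic (-1).
    const auto explicitBatch = 1U << static_cast<uint32_t>(
        nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    return builder->createNetworkV2(explicitBatch);
}
```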
Hi,
In this case you can try using optimization profiles.
You should be able to create an engine with multiple profiles, each optimized either for one specific batch size or for a range of batch sizes.
I believe performance improves as the range narrows.
E.g., specific batch sizes (one profile optimized for each batch size):
Profile 1: min=opt=max=(1, *input_shape)
Profile 2: min=opt=max=(8, *input_shape)
Profile 3: min=opt=max=(32, *input_shape)
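The three profiles above can be sketched like this with the TensorRT 7 C++ API (the input tensor name `"input"` and the (3, 224, 224) image shape are illustrative assumptions, not from your model; compiling requires the TensorRT SDK):

```cpp
// Sketch: one optimization profile per fixed batch size (min = opt = max).
// Assumes `builder` and `config` were created beforehand and the network has
// an input tensor named "input" with shape (-1, 3, 224, 224).
#include "NvInfer.h"

void addFixedBatchProfiles(nvinfer1::IBuilder* builder,
                           nvinfer1::IBuilderConfig* config)
{
    using nvinfer1::Dims4;
    using nvinfer1::OptProfileSelector;

    for (int batch : {1, 8, 32})
    {
        auto* profile = builder->createOptimizationProfile();
        const Dims4 dims{batch, 3, 224, 224};  // min = opt = max for this profile
        profile->setDimensions("input", OptProfileSelector::kMIN, dims);
        profile->setDimensions("input", OptProfileSelector::kOPT, dims);
        profile->setDimensions("input", OptProfileSelector::kMAX, dims);
        config->addOptimizationProfile(profile);
    }
}
```

At inference time you would then select the profile matching the incoming batch with `IExecutionContext::setOptimizationProfile` before enqueueing.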
Thanks
Hi,
Thanks for your reply.
But the range of batch sizes is large, from 0 to 200, and the range of input sequence lengths is also large, from 0 to 1000.
Thanks
Hi,
In that case, when dynamic mode is used, a new TRT engine will be generated at runtime for every new input dimension.
Could you please follow the steps below and let us know in case you get any errors?
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-developer-guide/index.html#work_dynamic_shapes
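Applied to your ranges, the dynamic-shape setup from that guide can be sketched as a single wide-range profile (the input name `"input_ids"`, the 2-D (batch, seq_len) layout, and the kOPT values are assumptions to adapt; note that TensorRT profile minimums must be at least 1, so a batch or length of 0 cannot be built):

```cpp
// Sketch: one wide-range profile covering batch 1..200 and sequence length
// 1..1000 (TensorRT 7 C++ API). Assumes `builder` and `config` exist and the
// network input "input_ids" was declared with dynamic shape (-1, -1).
#include "NvInfer.h"

void addWideRangeProfile(nvinfer1::IBuilder* builder,
                         nvinfer1::IBuilderConfig* config)
{
    using nvinfer1::Dims2;
    using nvinfer1::OptProfileSelector;

    auto* profile = builder->createOptimizationProfile();
    profile->setDimensions("input_ids", OptProfileSelector::kMIN, Dims2{1, 1});
    // kOPT is the shape TensorRT tunes kernels for; pick your most common case.
    profile->setDimensions("input_ids", OptProfileSelector::kOPT, Dims2{32, 128});
    profile->setDimensions("input_ids", OptProfileSelector::kMAX, Dims2{200, 1000});
    config->addOptimizationProfile(profile);
}
```

Before each inference you then bind the actual shape with `IExecutionContext::setBindingDimensions`.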
Thanks
Ok, I will try, thank you!