Dynamic-Shape Inference Slower Than Fixed Shape

I am trying to benchmark BERT on TensorRT (using NVIDIA's demoBERT implementation).
My data consists of texts of varying lengths, with more than 90% containing just a single word.

For this I am using dynamic shapes with an optimization profile that favours short sentences but can still handle longer ones, e.g. min (batch_size, 1), opt (batch_size, 4), max (batch_size, 128).
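For reference, a profile like this can be built with trtexec. This is a sketch: the tensor names (input_ids, segment_ids, input_mask) are what I assume the demoBERT ONNX export uses, and a batch size of 1 is assumed — substitute your model's actual input names and batch size.

```shell
# Build an engine with a dynamic sequence-length profile.
# Tensor names are assumed from demoBERT; adjust to your model.
# Shapes are batch x sequence_length.
trtexec --onnx=bert.onnx \
        --minShapes=input_ids:1x1,segment_ids:1x1,input_mask:1x1 \
        --optShapes=input_ids:1x4,segment_ids:1x4,input_mask:1x4 \
        --maxShapes=input_ids:1x128,segment_ids:1x128,input_mask:1x128 \
        --saveEngine=bert_dynamic.engine
```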

I observe that in some scenarios inference with the dynamic-shape engine is slower than with a fixed-shape engine. Is this expected?


Yes, it is expected to some extent. TensorRT selects the kernels (tactics) that are fastest at the optShapes, and those tactics may not be the ideal ones for other shapes in the profile's range. One way to mitigate this is to build the engine with multiple optimization profiles, each tuned for a different part of your length distribution, and select the appropriate profile at inference time.
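As an illustration of the multiple-profile approach (a sketch, not part of the original thread): suppose the engine is built with one profile specialized for the very short inputs that dominate your traffic and another covering the longer tail. The host-side logic for picking a profile per request could look like this; the profile ranges below are hypothetical examples.

```python
# Sketch: choose which optimization profile to activate for a given
# sequence length. Each profile is a (min_len, opt_len, max_len) tuple;
# the values below are hypothetical, chosen for a distribution where
# >90% of inputs are a single word.
PROFILES = [
    (1, 1, 4),     # profile 0: tuned for very short inputs
    (5, 16, 128),  # profile 1: covers the longer tail
]

def select_profile(seq_len, profiles=PROFILES):
    """Return the index of the first profile whose [min, max] range
    covers seq_len, preferring the more specialized (earlier) one."""
    for idx, (lo, _opt, hi) in enumerate(profiles):
        if lo <= seq_len <= hi:
            return idx
    raise ValueError(f"no profile covers sequence length {seq_len}")
```

At inference time the chosen index would be passed to the execution context (e.g. via `IExecutionContext.set_optimization_profile_async` in the Python API) before setting the input shapes for that request.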

Thank you.
