Replicate 2.2ms inference time on BERT

This is a duplicate of a topic originally posted in the wrong category. The new topic is here: Replicate 2.2ms inference time on BERT - TensorRT - NVIDIA Developer Forums

Please delete this topic if possible.