This is a duplicated topic originally posted in the wrong category. The new topic is here: Replicate 2.2ms inference time on BERT - TensorRT - NVIDIA Developer Forums
Please delete the topic if possible.
| Topic | Replies | Views | Activity |
|---|---|---|---|
| Replicate 2.2ms inference time on BERT | 3 | 853 | October 12, 2021 |
| NVIDIA Slashes BERT Training and Inference Times | 0 | 281 | August 21, 2022 |
| Inference Benchmarks - TensorRT Version ? | 1 | 2044 | October 4, 2018 |
| TensorRT 8.0: What's New | 2 | 1149 | July 20, 2021 |
| Inference time? | 3 | 475 | October 10, 2021 |
| Inference time using TF-TRT is the same as Native Tensorflow for Object Detection Models | 4 | 1015 | March 31, 2022 |
| NVIDIA Announces TensorRT 8 Slashing BERT-Large Inference Down to 1 Millisecond | 0 | 452 | July 20, 2021 |
| TensorRT inference time extremely slow | 1 | 459 | January 31, 2023 |
| Faster inference in tensorrt model | 1 | 399 | April 3, 2023 |
| High inference time while running UNet with INT8 precision | 5 | 998 | February 10, 2021 |