Is there any benchmark comparing the inference speed of a plain ONNX model against the same ONNX model with a TensorRT engine built from it?
ONNX Runtime already applies some graph optimizations at inference time, so I am curious how much additional improvement TensorRT can provide.
A follow-up question: which of these should we expect to give better inference speed?
- PyTorch model optimized with Torch-TensorRT
- PyTorch model → ONNX model, then optimize the ONNX with TensorRT
Assume both setups are tuned with optimal settings.
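In case no published benchmark fits your exact model, one option is to measure it yourself. Below is a minimal timing-harness sketch (not an official benchmark): ONNX Runtime lets you compare the two cases by building two sessions over the same model, one with the `TensorrtExecutionProvider` and one with only the `CUDAExecutionProvider`. The model filename and input name in the comments are hypothetical placeholders; the stand-in workload at the bottom just makes the harness runnable without a GPU.

```python
import time
import statistics

def benchmark(run_once, warmup=10, iters=100):
    """Time a single-inference callable; returns (median_ms, p95_ms)."""
    for _ in range(warmup):       # warmup: let TensorRT finish engine build / caches
        run_once()
    latencies = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_once()                # ort session.run() is synchronous, so wall-clock is fine
        latencies.append((time.perf_counter() - t0) * 1e3)
    latencies.sort()
    return statistics.median(latencies), latencies[int(0.95 * len(latencies))]

# Hypothetical usage with ONNX Runtime ("model.onnx" and "input" are placeholders):
#
#   import onnxruntime as ort
#   sess_trt  = ort.InferenceSession("model.onnx",
#                   providers=["TensorrtExecutionProvider", "CUDAExecutionProvider"])
#   sess_cuda = ort.InferenceSession("model.onnx",
#                   providers=["CUDAExecutionProvider"])
#   feed = {"input": dummy_batch}
#   print("ORT+TRT :", benchmark(lambda: sess_trt.run(None, feed)))
#   print("ORT CUDA:", benchmark(lambda: sess_cuda.run(None, feed)))

# Stand-in CPU workload so the harness itself runs anywhere:
median_ms, p95_ms = benchmark(lambda: sum(i * i for i in range(10_000)),
                              warmup=2, iters=20)
print(f"median={median_ms:.3f} ms  p95={p95_ms:.3f} ms")
```

The long warmup matters especially for the TensorRT provider, since engine building happens lazily on the first run and would otherwise dominate the measurement.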
Which GPU architecture are you interested in? The numbers can vary greatly depending on the architecture.
I am interested in A100 80GB. Thanks.