Description
Hi,
I want to know if there is any benchmark comparing the inference speed of an ONNX model and ONNX + TensorRT (built engine).
We know that ONNX already applies some optimizations to inference speed, so I am curious how much further improvement TensorRT can provide.
Here is a follow-up question: which of the following do we expect to achieve better inference speed?
- PyTorch model optimized with Torch-TensorRT
- PyTorch model → ONNX model, then optimize the ONNX with TensorRT
We assume both of the above are tuned with optimal settings.
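For context, the comparison I have in mind follows the usual warmup-then-measure methodology. Here is a minimal sketch of the harness I would use; the inference callable passed in is a placeholder that would be backed by, e.g., an ONNX Runtime `session.run(...)` or a TensorRT engine execution in the real measurement:

```python
import time
import statistics

def benchmark(infer, n_warmup=10, n_runs=100):
    """Time a zero-argument inference callable: warm up first, then
    record per-call latency in milliseconds over n_runs iterations."""
    for _ in range(n_warmup):
        infer()
    latencies = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        infer()
        latencies.append((time.perf_counter() - t0) * 1e3)  # ms
    return {
        "mean_ms": statistics.mean(latencies),
        "p50_ms": statistics.median(latencies),
    }

# Dummy CPU workload standing in for a real session.run(...) call
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
print(stats)
```

For the TensorRT side specifically, I understand `trtexec --onnx=model.onnx --fp16` also reports throughput and latency directly, which could serve as a cross-check against a harness like the above.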
Hi @foreveronehundred,
Which GPU architecture are you interested in? Numbers can vary greatly depending on arch.
Thank you.
I am interested in the A100 80GB. Thanks.