Run-to-run variation with TensorRT

Description: Run-to-run variation with TensorRT

Environment:
NVIDIA Release: 22.07
NVIDIA TensorRT Version: 8.4.1
NVIDIA Driver Version: 515.43.04
CUDA Version: 11.7
NVIDIA GPU: NVIDIA Tesla T4
Docker Image: nvcr.io/nvidia/tensorrt:22.07-py3

To get inference performance data using trtexec, there are two steps involved (example commands below):

  1. Build a TRT engine from the model
  2. Get inference performance metrics by loading the TRT engine
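For example, the two steps look roughly like this with trtexec (a sketch; the model name and the --fp16 flag are placeholders, not my exact configuration):

    # Step 1: build a serialized TRT engine from an ONNX model
    trtexec --onnx=model.onnx --saveEngine=model.plan --fp16

    # Step 2: load the engine and measure inference performance
    trtexec --loadEngine=model.plan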

I see less than 1% variation when I build the TRT engine once and then run the inference stage multiple times.
If I perform steps 1 and 2 multiple times for the same model with the same configuration, I see up to 3% variation in inference throughput. Is this normal?

If I build a TRT engine for the same model multiple times (with the same configuration), should trtexec generate an engine of the same size?

Hi,

The builder times candidate kernels to find the fastest one, and when the timings of two implementations (or two precisions) are close, timing noise can cause the builder to choose differently on different runs. So a 3% variation in runtime is not necessarily unusual, and some variation in engine size is also possible.
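If build-to-build reproducibility matters, one option (a sketch, assuming the --timingCacheFile option available in trtexec for TensorRT 8.x) is to reuse a timing cache, so that later builds replay the earlier tactic timings instead of re-timing kernels:

    # The first run creates timing.cache; subsequent builds that pass the
    # same cache file reuse the recorded timings, making kernel selection
    # (and hence the resulting engine) more stable across builds
    trtexec --onnx=model.onnx --saveEngine=model.plan --timingCacheFile=timing.cache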

Thank you.