The first inference using the TensorRT model takes far longer than the first inference using the TensorFlow model

Description

When using the TensorRT model, the first inference takes 126073.25 ms; from the second inference onward, each inference takes about 25 ms.
When using the TensorFlow model, the first inference takes 2292.62 ms; from the second inference onward, each inference takes about 35 ms.

I have attached three log files at the link below: tf_log.txt, trt_log.txt, and trt_log_trimmed.txt.
trt_log_trimmed.txt is an excerpt of trt_log.txt; I trimmed it to make the inference times easier to compare with tf_log.txt.

The first inference time seems far too long, so I would like to know how to reduce the first inference time when using the TensorRT model.
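For example, would pre-building the TensorRT engines at conversion time be the recommended approach? Below is a minimal sketch of what I have in mind, assuming the model was converted with TF-TRT's TrtGraphConverterV2; the paths, input shape, and dtype are placeholders rather than the values used in test.py.

    # Hypothetical sketch: pre-build the TF-TRT engines at conversion time so
    # that the first runtime inference does not pay the engine-build cost.
    # Paths, input shape, and dtype are placeholders, not taken from test.py.
    import numpy as np
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir="saved_model_tf")     # placeholder path
    converter.convert()

    def input_fn():
        # Yield one batch with the shape/dtype the model expects at runtime,
        # so the engines are built for that profile during conversion.
        yield (np.zeros((1, 224, 224, 3), dtype=np.float32),)

    converter.build(input_fn=input_fn)               # build engines ahead of time
    converter.save("saved_model_trt_prebuilt")       # placeholder output path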

My settings are as follows:

Environment

TensorRT Version:
libnvinfer-dev: 6.0.1-1+cuda10.1
libnvinfer-plugin6: 6.0.1-1+cuda10.1
libnvinfer5: 5.1.5-1+cuda10.0
libnvinfer6: 6.0.1-1+cuda10.1

GPU Type: GeForce RTX 2080 Ti
Nvidia Driver Version: 455.23.05
CUDA Version: 10.1.243
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 16.04
Python Version (if applicable): 3.8.6
TensorFlow Version (if applicable): tensorflow-gpu==2.3.0rc0
PyTorch Version (if applicable): none
Baremetal or Container (if container which image + tag): none

Relevant Files

See the files at the link below.

Steps To Reproduce

  1. Download the code from tensorrt_test - Google Drive
  2. Using the TensorFlow model: python3 test.py --framework tf
  3. Using the TensorRT model: python3 test.py --framework trt
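Alternatively, would simply adding an untimed warm-up inference before the timed loop be the expected workaround? Below is a rough sketch of what I mean, assuming the TensorRT SavedModel is loaded with tf.saved_model.load; the model path and input shape are placeholders and may not match test.py.

    # Hypothetical warm-up sketch: run one untimed inference so the TensorRT
    # engine build (and any other lazy initialization) happens before the
    # timed measurements start.
    # Model path and input shape are placeholders, not taken from test.py.
    import numpy as np
    import tensorflow as tf

    model = tf.saved_model.load("saved_model_trt")   # placeholder path
    infer = model.signatures["serving_default"]

    dummy = tf.constant(np.zeros((1, 224, 224, 3), dtype=np.float32))
    _ = infer(dummy)   # warm-up call, excluded from timing

    # Timed inferences would start here; the first timed call should no
    # longer include the engine-build cost.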