Description
Inference with the TensorRT engine is extremely slow (~1000 ms per run), far slower than the same model in PyTorch.
Environment
TensorRT Version: 8.2
GPU Type: NVIDIA A10G
Nvidia Driver Version: 515.65.01
CUDA Version: 11.7
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Nsight Systems profile attached: report-nsys (78.0 KB)
The attached Nsight Systems profile shows two operations accounting for ~98% of the inference time: TensorRT:ExecutionContext::enqueue and TensorRT:ConvTranspose_114. End-to-end inference takes about 1000 ms, which is dramatically slower than running the same model in PyTorch.
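For reference, the latency number comes from a host-side timing loop roughly like the sketch below (a minimal sketch, assuming a static-shape, single-stream engine; `ENGINE_PATH`, the buffer setup, and the iteration counts are placeholders rather than the actual script). The loop synchronizes the stream before stopping the clock, so the ~1000 ms figure includes the full GPU execution, not just the enqueue call.

```python
import time

import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Placeholder path -- substitute the real engine file.
ENGINE_PATH = "model.engine"

with open(ENGINE_PATH, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Allocate pinned host buffers and device buffers for every binding.
bindings, host_bufs, dev_bufs = [], [], []
for i in range(engine.num_bindings):
    shape = context.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

stream = cuda.Stream()

def infer():
    # Copy inputs to the device, enqueue, copy outputs back.
    for i in range(engine.num_bindings):
        if engine.binding_is_input(i):
            cuda.memcpy_htod_async(dev_bufs[i], host_bufs[i], stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    for i in range(engine.num_bindings):
        if not engine.binding_is_input(i):
            cuda.memcpy_dtoh_async(host_bufs[i], dev_bufs[i], stream)
    stream.synchronize()  # wait for the GPU before stopping the clock

# Warm up, then time.
for _ in range(10):
    infer()
start = time.perf_counter()
for _ in range(100):
    infer()
print("mean latency: %.1f ms" % ((time.perf_counter() - start) * 1000 / 100))
```

The warm-up iterations are excluded from the measurement so one-time CUDA/cuDNN initialization does not inflate the average.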