TensorRT inference time extremely slow

Description

The inference time of the TRT engine is extremely high compared to the original PyTorch model.

Environment

TensorRT Version: TensorRT v8.2
GPU Type: NVIDIA A10G
Nvidia Driver Version: 515.65.01
CUDA Version: 11.7
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Attachment: report-nsys (78.0 KB)

Profiling the model with Nsight Systems (report attached above) shows that two operations account for ~98% of the time: TensorRT:ExecutionContext::enqueue and TensorRT:ConvTranspose_114. The total inference time is about 1000 ms, which is far higher than the PyTorch inference time.
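
For reference, a minimal timing sketch along these lines can rule out measurement artifacts ("model.trt" is a placeholder engine file, and static binding shapes are assumed). Because enqueue is asynchronous, the stream must be synchronized before stopping the timer, otherwise the measured time can be misleading:

```python
import time
import numpy as np
import pycuda.autoinit  # initializes a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("model.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one device buffer per binding (assumes static shapes).
bindings = []
for i in range(engine.num_bindings):
    dtype = trt.nptype(engine.get_binding_dtype(i))
    nbytes = int(np.prod(engine.get_binding_shape(i))) * np.dtype(dtype).itemsize
    bindings.append(int(cuda.mem_alloc(nbytes)))

stream = cuda.Stream()

# Warm up so lazy initialization (kernel selection, handle creation, ...)
# is not counted in the measurement.
for _ in range(10):
    context.execute_async_v2(bindings, stream.handle)
stream.synchronize()

n_iters = 100
start = time.perf_counter()
for _ in range(n_iters):
    context.execute_async_v2(bindings, stream.handle)
stream.synchronize()  # wait for all enqueued work before reading the clock
print(f"mean latency: {(time.perf_counter() - start) / n_iters * 1000:.2f} ms")
```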

Hi,

Could you please try the latest TensorRT version, 8.5.2, and let us know if you still face this issue?
Please also share a minimal repro script and the model with us for better debugging.
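
In the meantime, a per-layer profile can help confirm whether ConvTranspose_114 really dominates. Below is a hedged sketch using the TensorRT Python API's IProfiler (again with "model.trt" as a placeholder and static binding shapes assumed); the synchronous execute_v2 path is the simplest way to collect per-layer timings:

```python
import numpy as np
import pycuda.autoinit  # initializes a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class LayerTimer(trt.IProfiler):
    """Accumulates the per-layer execution times reported by TensorRT."""
    def __init__(self):
        super().__init__()
        self.times = {}  # layer name -> accumulated milliseconds

    def report_layer_time(self, layer_name, ms):
        self.times[layer_name] = self.times.get(layer_name, 0.0) + ms

logger = trt.Logger(trt.Logger.WARNING)
with open("model.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
context.profiler = LayerTimer()

# Allocate one device buffer per binding.
bindings = []
for i in range(engine.num_bindings):
    dtype = trt.nptype(engine.get_binding_dtype(i))
    nbytes = int(np.prod(engine.get_binding_shape(i))) * np.dtype(dtype).itemsize
    bindings.append(int(cuda.mem_alloc(nbytes)))

# Run a few synchronous inferences so the profiler collects samples.
for _ in range(10):
    context.execute_v2(bindings)

# Print the ten most expensive layers.
for name, ms in sorted(context.profiler.times.items(),
                       key=lambda kv: -kv[1])[:10]:
    print(f"{name}: {ms:.2f} ms total")
```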

Thank you.