Dear friend,
I am using TensorRT to run face detection and feature-extraction models in a 1:1 face comparison service.
I ran a stress test with one million image pairs to check the stability of the 1:1 face service.
The test takes more than an hour, and I found a problem: most of the time, model inference completes in under 100 ms.
But occasionally, the TensorRT 1:1 face comparison takes more than 5 seconds. This happens about 100 times per million pairs.
Could you kindly help? Is this normal performance for TensorRT? What could cause such a large delay?
Thanks a lot.
TensorRT Version: 8.2
GPU Type: V100
CUDA Version: CUDA 11
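For context, my stress test collects per-pair latency roughly like the sketch below. The `run_inference` function is only a placeholder for the real TensorRT compare call, not the actual service code; the point is how the tail (p99.9 and max) is separated from the average.

```python
import time

def run_inference(pair_id):
    # Placeholder for the real TensorRT 1:1 compare call.
    # In the actual test this runs detection + feature extraction + compare.
    return sum(range(100))

latencies_ms = []
for i in range(10_000):  # the real test used ~1 million pairs
    start = time.perf_counter()
    run_inference(i)
    latencies_ms.append((time.perf_counter() - start) * 1000.0)

latencies_ms.sort()
p50 = latencies_ms[len(latencies_ms) // 2]
p999 = latencies_ms[int(len(latencies_ms) * 0.999)]
print(f"p50={p50:.3f} ms  p99.9={p999:.3f} ms  max={latencies_ms[-1]:.3f} ms")
```

With per-pair timestamps logged like this, the average stays under 100 ms while the max shows the rare multi-second spikes, which is how I noticed the roughly 100-in-a-million outliers.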
Hi, please refer to the links below to perform inference in INT8.
Dear friend, I checked your documents. They report the average performance of TensorRT, and the results are good.
In my test, the average time cost of TensorRT inference matches your documentation.
But my question is about the sudden large delays. They happen randomly, and I don't know how to fix them.
We recommend trying the latest TensorRT version, 8.4. If you still face this issue, please share a minimal ONNX model and scripts that reproduce it, so we can debug it on our end.
Dear friends,
I rebooted my GPU server, and the issue is resolved now. Thanks a lot!