TensorRT Inference is Slower Than Other Frameworks

orcdnz · November 26, 2019, 5:02pm

I just created a TensorRT YoloV3 engine, but it does not improve inference time in any significant way. When I run the same model with PyTorch I get 20 FPS, whereas TensorRT inference only yields around 10 FPS. I am working on my notebook’s GTX 1050 Max-Q with CUDA 10. The only warning I got end to end (from converting YOLO to an engine, through to inference) is:

[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.4.2

I wasn’t sure what other information to include, so that’s all for now. I’m happy to add more on request.
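The warning above reports two cuDNN versions: the one TensorRT was built against and the one actually loaded at runtime. As a minimal sketch, the two can be pulled out of the log line and compared. The helper name `parse_cudnn_mismatch` is hypothetical, and the pattern assumes the exact wording quoted in this thread, which may differ in other TensorRT releases.

```python
import re

# Pattern for the warning text quoted in this thread (an assumption: other
# TensorRT releases may word the message differently).
_CUDNN_WARNING = re.compile(
    r"linked against cuDNN (\d+(?:\.\d+)*) but loaded cuDNN (\d+(?:\.\d+)*)"
)


def parse_cudnn_mismatch(log_line):
    """Return (linked, loaded) version tuples, or None if the line does not match."""
    match = _CUDNN_WARNING.search(log_line)
    if match is None:
        return None
    # Turn "7.6.3" into (7, 6, 3) so versions compare numerically.
    linked, loaded = (tuple(int(p) for p in v.split(".")) for v in match.groups())
    return linked, loaded


line = "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.4.2"
linked, loaded = parse_cudnn_mismatch(line)
print("linked:", linked, "loaded:", loaded, "loaded is older:", loaded < linked)
```

Comparing tuples of integers avoids the string-comparison trap where "7.10" would sort before "7.4".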

SunilJB · November 27, 2019, 10:46am

Hi,

Can you provide the following information so we can better help?
Provide details on the platforms you are using:
o Linux distro and version
o GPU type
o Nvidia driver version
o CUDA version
o CUDNN version
o Python version [if using python]
o Tensorflow and PyTorch version
o TensorRT version

Also, if possible please share the script & model file to reproduce the issue.

Thanks

orcdnz · November 27, 2019, 11:05am

Ubuntu 18.04.3 LTS
GeForce GTX 1050 with Max-Q Design/PCIe/SSE2 Driver Version 440.26
CUDA 10
CUDNN 7.4.2
Python 3.6.8
Tensorflow-GPU 1.14.0
Torch 1.3.1
Torchvision 0.4.2
TensorRT 6.0.1.5

Model is from https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg

The scripts I used are from the GitHub repo penolove/yolov3-tensorrt.
The repo might not work right away when you clone it, but minor edits that do not touch its core will get it working.

SunilJB · November 29, 2019, 4:20am

Hi,
A couple of recommendations:

o The warning seems to be due to an older version of cuDNN. Could you please try upgrading cuDNN to 7.6.5? Please refer to the support matrix below:
  https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-601/tensorrt-support-matrix/index.html
o The ONNX parser isn’t currently compatible with ONNX models exported from PyTorch 1.3. If you downgrade to PyTorch 1.2, this issue should go away.

Thanks

Kuonangzhe · December 8, 2019, 2:33am

Hi, I just ran into a similar problem. I used the TensorRT Python samples for yolov3, with ONNX parser version 1.4.1. The sample program runs successfully, and I get 17 FPS on pure inference time (data → GPU + model inference) without post-processing. Then I switched to MXNet gluoncv’s version of yolov3 (the darknet53_coco variant, so the same network), and I also get around 17 FPS for network inference alone. The post-processing in the TRT sample is too slow in Python and drags its FPS down to 3, so I didn’t include it.

Here are my versions:
PC: Intel® Core™ i7-7700HQ CPU @ 2.80GHz × 8 + GeForce GTX 1060/PCIe/SSE2
Ubuntu 18.04
CUDA 10.1
Cudnn: 7.5.0
MXNet: 1.5.1
TensorRT: 5.1.5.0
yolov3 input size: 608 x 608
python: 3.6.8
data type: float32

Are there any suggestions? Or is it normal to have similar FPS? I already got over a 2x speedup on simple resnet18 and resnet50 models when testing with TRT in FP32.
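For comparing "pure inference" FPS across frameworks as described above, it helps to time both with the same harness. The following is a minimal sketch, not the code used by anyone in this thread: `measure_fps` and `fake_infer` are hypothetical names, and the real forward pass (TensorRT, MXNet, or PyTorch) would replace the stand-in.

```python
import time


def measure_fps(infer, n_warmup=10, n_runs=50):
    """Average FPS of infer() over n_runs calls, after n_warmup untimed calls.

    `infer` is a hypothetical zero-argument callable. It must block until the
    result is ready: GPU work is launched asynchronously, so a real callable
    has to synchronize before returning or the timing will be too optimistic.
    """
    for _ in range(n_warmup):
        infer()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


def fake_infer():
    # Stand-in for a real forward pass; sleeps for about 5 ms.
    time.sleep(0.005)


fps = measure_fps(fake_infer, n_warmup=2, n_runs=20)
print(f"{fps:.1f} FPS")
```

Warm-up iterations are kept out of the timed loop because the first few calls often include one-time costs such as memory allocation or kernel selection.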

SunilJB · December 9, 2019, 4:46am

Hi,

TensorRT is a high performance neural network inference optimizer and runtime engine.
The pre and post-processing steps depend strongly on the particular application.

Please refer to the link below for optimizing Python performance.
https://docs.nvidia.com/deeplearning/sdk/tensorrt-best-practices/index.html#optimize-python

Thanks
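Earlier in the thread, Kuonangzhe reports about 17 FPS for inference alone and about 3 FPS once the sample's Python post-processing runs. As a rough worked example, those figures can be converted into a per-frame time budget. This assumes the 3 FPS figure covers inference and post-processing run back to back for each frame, which the post does not state explicitly.

```python
def frame_time_ms(fps):
    """Convert a frame rate into milliseconds spent per frame."""
    return 1000.0 / fps


# Figures reported in this thread.
inference_ms = frame_time_ms(17.0)  # inference only
total_ms = frame_time_ms(3.0)       # inference plus Python post-processing (assumed)

# Whatever is not inference is attributed to post-processing and other overhead.
other_ms = total_ms - inference_ms
share = other_ms / total_ms

print(f"inference: {inference_ms:.1f} ms per frame")
print(f"everything else: {other_ms:.1f} ms per frame ({share:.0%} of the total)")
```

Under that assumption, inference takes roughly 59 ms of a 333 ms frame, so most of the per-frame time sits outside the part that TensorRT accelerates.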

Kuonangzhe · December 9, 2019, 6:25am

Thanks for the fast reply, but what I am asking is: is it normal to have similar FPS?

Please be aware that I didn’t consider pre- or post-processing AT ALL in the time estimate. Also, TRT works well for simple ResNet inference, with a good speedup.

SunilJB · December 9, 2019, 9:23am

Hi,

Can you share the model file and script to reproduce this issue so we can better help?
Meanwhile, please try to use the latest supported TRT version.

Thanks
