Building an engine takes too long

Description

Hi! I am trying to build YOLOv7 by compiling it and saving the serialized TRT engine.
However, the process is too slow; it takes one hour at 256×256 resolution.
Is there any way to speed it up?

Environment

TensorRT Version: 8.2.4.2
GPU Type: RTX3080 12GB
Nvidia Driver Version: 515.48
CUDA Version: 11.4
CUDNN Version: 8.2.4
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.12.0
Baremetal or Container (if container which image + tag):

Relevant Files

here is my onnx file.
https://drive.google.com/file/d/1RnAMCGl-Grft9TDgkn5lcPNvYulIRaPI/view?usp=sharing

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:

  1. validating your model with the below snippet

check_model.py

import onnx

filename = "yourONNXmodel"  # path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.
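For instance, a trtexec invocation along these lines builds an engine and captures a verbose log (the file names here are placeholders, not from this thread):

```shell
# Build a serialized engine from the ONNX model and save a verbose log.
# "yolov7.onnx" / "yolov7.engine" are placeholder paths; adjust to your files.
trtexec --onnx=yolov7.onnx \
        --saveEngine=yolov7.engine \
        --verbose > trtexec_verbose.log 2>&1
```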

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

Hi,

We could not observe similar behavior.

You can try increasing GPU memory utilization using the --workspace option; please refer to the Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
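As a sketch, the --workspace option (value in MiB; 4096 is an assumed figure, not taken from this thread) can be passed like this:

```shell
# Give the builder more workspace memory for tactic selection.
# --workspace is the TensorRT 8.x trtexec flag (size in MiB).
trtexec --onnx=yolov7.onnx --workspace=4096 --saveEngine=yolov7.engine
```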

Also, we recommend trying the latest TensorRT version, 8.4 GA Update 1. If you still face this issue, please share verbose logs and the command with us so we can try from our end for better debugging.
https://developer.nvidia.com/nvidia-tensorrt-8x-download

Thank you.

Here is my log.

https://drive.google.com/file/d/1TXY86Rbk6Fr4CbNkN1ySMrK7mahkYD7t/view?usp=sharing

It starts at 15:51:28 and ends at 16:10:53.
I also set the workspace large enough.
What is the problem?

Hi,

It looks like you’re not using trtexec. Could you please share the issue repro and, if possible, trtexec --verbose logs?
Also, we recommend trying the official TensorRT samples for better performance.

Thank you.

Yes, I’m experiencing the same thing, running trtexec on YOLOv7, and it just takes absolutely forever. In fact, it’s been running and hasn’t finished after more than an hour. This is on a Jetson Xavier dev kit.

Hmm, it looks like stopping and reloading the trtexec process worked for me, and it rendered a 1280×768 engine in about 20-30 minutes. It still feels a lot slower than older versions; in the previous JetPack release (4.4), it would take maybe 1-2 minutes to create an engine.

Here are my trtexec logs.
It’s still slow…

Could you possibly share your trtexec log?
I want to compare it to mine.

Hi,

We were trying on a different platform. As you’re using Jetson Xavier, we are moving this post to the Jetson Xavier forum to get better help.

Thank you.

No. I am using the x86 platform.
It is also recorded in my trtexec log.

I’ve been having much better luck now exporting YOLOv7 models to TensorRT.

Here is a sample log; this one was created using the Python API on an RTX 3090:
https://gist.githubusercontent.com/jakepoz/123eb801fa6ae92126403a59ad73eccd/raw/38f21a7f5e129dfee8b8fd668d3a38d9b5ae9259/trtlog.txt

I’m not 100% sure what changed, though; after I ran it a few times, the jobs started completing.
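For reference, here is a minimal sketch of the kind of ONNX-to-engine build I did with the TensorRT 8.x Python API. The paths, workspace size, and logger verbosity are assumptions for illustration, not values from the log above:

```python
# Sketch: build a serialized TensorRT engine from an ONNX file.
# Assumes TensorRT 8.x; "yolov7.onnx" and "yolov7.engine" are placeholder paths.
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
# YOLOv7 ONNX exports use explicit batch dimensions.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("yolov7.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 4 << 30  # 4 GiB workspace (assumed value)

# Build and save the serialized engine plan.
engine_bytes = builder.build_serialized_network(network, config)
with open("yolov7.engine", "wb") as f:
    f.write(engine_bytes)
```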

Hello, I’ve met a similar problem. I’m using TensorRT version 8.5 to convert a YOLOv4 ONNX model to an engine. It still takes 42 minutes, and the time does not decline.
So I want to know: how did you decrease the model conversion time?

I haven’t solved it yet.
What is certain is that the build speed was faster on the GTX 1060.
After replacing the GPU (1060 -> 3080), it became several times slower.