Hi! I am trying to build yolov7 by compiling it and saving the serialzed trt engine.
However, the process is too slow. Takes 1hour for 256*256 resolution.
Is there anyway to speed up?
TensorRT Version: 22.214.171.124
GPU Type: RTX3080 12GB
Nvidia Driver Version: 515.48
CUDA Version: 11.4
CUDNN Version: 8.2.4
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.12.0
Baremetal or Container (if container which image + tag):
here is my onnx file.
Steps To Reproduce
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered
Request you to share the ONNX model and the script if not shared already so that we can assist you better.
Alongside you can try few things:
- validating your model with the below snippet
filename = yourONNXmodel
model = onnx.load(filename)
2) Try running your model with trtexec command.
In case you are still facing issue, request you to share the trtexec “”–verbose"" log for further debugging
We Could not observe similar behavior.
You can try increasing the GPU memory utilization using
--workspace option and please refer Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
Also, we recommend you to please try on the latest TensorRT version 8.4 GA Update 1 and if you still face this issue share with us verbose logs and command to try from our end for better debugging.