How to accelerate engine build time?

Description

Our customers use a variety of devices, such as the 1080TI, 2080TI, 3080, Jetson AGX…, and I want to know how to shorten the engine build time.
I’m upgrading TensorRT from version 7 to 8 and found that it has a new feature called the timing cache. Does this mean I can share the cache across different devices, for example generating it on a 2080TI and using it on a 3080? If not, what is the best way to shorten the engine build time?

Environment

TensorRT Version: 8.4.1.5
GPU Type: 1080TI, 2080TI, 3080, Jetson AGX
Nvidia Driver Version: 516.40
CUDA Version: 11.1
CUDNN Version: 8
Operating System + Version: Windows 10 + Ubuntu 20.04 + Jetson
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Steps To Reproduce

Hi,
We recommend checking the supported features at the link below.

You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.

Thanks!

Hi,
My model is supported on all of these devices, such as the 1080, 2080, and 3080. As I recall, the same plan file cannot be shared across different compute capabilities, right?
We give customers our ONNX models, and they generate the plan files on their own devices. They now find that generating a plan file takes a long time, and I found that the timing cache might solve this. Is that right?
If not, do you have any other ideas for solving this problem?

Thanks!

Hi,

Yes, TensorRT currently provides a timing cache. Please also try increasing the workspace size.

Thank you.
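
For readers following along, here is a minimal sketch of how the timing cache and workspace limit might be wired up with the TensorRT 8.4 Python API. It is an illustration under assumptions, not the responder's code: `timing_cache_path` and `build_engine` are hypothetical helper names, and the file naming scheme is made up. The thread does not settle whether a cache generated on one GPU model can be reused on another, so the sketch conservatively keeps one cache file per GPU model and TensorRT version and leaves `ignore_mismatch` off.

```python
import os


def timing_cache_path(cache_dir, gpu_name, trt_version):
    """Return a per-GPU, per-TensorRT-version cache file path.

    The naming scheme is hypothetical; any stable key would do.
    """
    safe = "".join(c if c.isalnum() else "_" for c in gpu_name).strip("_")
    return os.path.join(cache_dir, f"{safe}_trt{trt_version}.timing.cache")


def build_engine(onnx_path, cache_path, workspace_bytes, fp16=True):
    """Build a serialized engine, reusing and then updating a timing cache."""
    # Imported lazily so the path helper above works without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    # Workspace limit (TensorRT 8.4 API; older releases used max_workspace_size).
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    # Load a previous cache if one exists; an empty buffer starts a fresh cache.
    blob = b""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            blob = f.read()
    cache = config.create_timing_cache(blob)
    config.set_timing_cache(cache, ignore_mismatch=False)

    plan = builder.build_serialized_network(network, config)
    if plan is None:
        raise RuntimeError("engine build failed")

    # Persist the (possibly extended) cache so the next build can reuse it.
    serialized = config.get_timing_cache().serialize()
    with open(cache_path, "wb") as f:
        f.write(memoryview(serialized))
    return plan
```

A customer-side script could call `build_engine("model.onnx", timing_cache_path("cache", gpu_name, trt.__version__), workspace_bytes)`, where `model.onnx` and `cache` are placeholder names. The first build on a device pays the full tactic-timing cost; later builds with the same configuration can read timings from the cache file.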

Hi @spolisetty,
If I want to shorten the engine build time for FP16, I just need to enable the timing cache and increase the workspace, is that right?
If so, how large a workspace would you recommend?

Thanks.

Yes. You can allocate the workspace based on the GPU memory you have available.
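
One possible way to turn "based on available GPU memory" into a number is sketched below. The helper name `pick_workspace_bytes`, the 50% fraction, and the 256 MiB floor are all assumptions chosen for illustration, not values recommended in this thread. The free-memory figure is passed in by the caller, for example from the CUDA runtime's `cudaMemGetInfo`.

```python
def pick_workspace_bytes(free_bytes, fraction=0.5, floor_bytes=256 << 20):
    """Pick a workspace size from the currently free GPU memory.

    fraction and floor_bytes are illustrative defaults, not official guidance.
    The result never exceeds free_bytes.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    budget = int(free_bytes * fraction)
    # Use at least floor_bytes when possible, but never more than what is free.
    return min(free_bytes, max(budget, floor_bytes))
```

The returned value can be passed to `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)` in TensorRT 8.4. Leaving some headroom matters on devices such as Jetson, where the GPU shares memory with the rest of the system.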