TensorRT model build time and deployment

Hello again,
this question is not related to model inference but rather to the build time of a TensorRT engine.

For example, if we consider a YOLOv5 model with TensorRT, buildEngineWithConfig takes a long time to compile the engine. It doesn’t matter whether the ONNX or Caffe parser is used; the build time will be similar.
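To make it concrete, here is a rough sketch of the step I mean, assuming the standard explicit-batch ONNX workflow (yolov5s.onnx is just a placeholder path); the buildEngineWithConfig call is where almost all of the time goes:

```cpp
#include <chrono>
#include <iostream>
#include <NvInfer.h>
#include <NvOnnxParser.h>

// Minimal logger required by the TensorRT builder and parser.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main() {
    auto builder = nvinfer1::createInferBuilder(gLogger);
    const auto flags = 1U << static_cast<uint32_t>(
        nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = builder->createNetworkV2(flags);
    auto parser  = nvonnxparser::createParser(*network, gLogger);
    parser->parseFromFile("yolov5s.onnx",   // placeholder model path
        static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

    auto config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1ULL << 30);  // 1 GiB workspace

    auto t0 = std::chrono::steady_clock::now();
    auto engine = builder->buildEngineWithConfig(*network, *config);  // the slow step
    auto t1 = std::chrono::steady_clock::now();
    std::cout << "Engine build took "
              << std::chrono::duration_cast<std::chrono::seconds>(t1 - t0).count()
              << " s" << std::endl;
    // ... serialize or use the engine; cleanup (destroy()/delete) omitted for brevity ...
    return 0;
}
```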
Honestly, the topic of long build times has been discussed many times, for example in this post or here.

Therefore, this question is oriented towards finding a solution, or at least some options, to avoid (or reduce) the model build time. As I previously stated, some applications may require multiple deep learning solutions. Consider an application that employs 10 completely different deep learning models, all powered by TensorRT. The build time may vary from roughly 1 to 5 minutes per model depending on the architecture. If customers install the application with those models, it may take a very long time to build all of them on their machine, and they might avoid using the application because they see this as an obstacle.
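For completeness: serializing each engine after the first build and deserializing it on later launches avoids rebuilding on every start, but since plan files are specific to the GPU and TensorRT version, this does not remove the initial build on the customer’s machine, which is the part I would like to avoid. A rough sketch of such a cache (cacheEngine/loadCachedEngine are just illustrative helper names):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <NvInfer.h>

// Write the serialized plan to disk after the first (slow) build on this machine.
void cacheEngine(nvinfer1::ICudaEngine& engine, const std::string& path) {
    nvinfer1::IHostMemory* plan = engine.serialize();
    std::ofstream out(path, std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());
    plan->destroy();
}

// Reload a previously cached plan; returns nullptr if no cache exists yet,
// in which case the caller falls back to buildEngineWithConfig.
// The runtime comes from nvinfer1::createInferRuntime(logger).
nvinfer1::ICudaEngine* loadCachedEngine(nvinfer1::IRuntime& runtime,
                                        const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return nullptr;
    std::vector<char> plan((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return runtime.deserializeCudaEngine(plan.data(), plan.size());
}
```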

So what would be the optimal solution to this issue?

Currently, I don’t see a solution or any other options; rather, I expect the build time to increase with the new TensorRT version 8.0.1, where it’s stated:

Engine build times for TensorRT 8.0 may be slower than TensorRT 7.2 due to the engine optimizer being more aggressive.

Best regards,
Andrej