When to update a TensorRT engine file?

Description

(1) We have computers with a GPU and a model that we have converted to TensorRT and cached on disk. Suppose we want to update the version of CUDA, cuDNN, or TensorRT. Must we re-convert the model after any of these changes? Are there other changes that would require us to re-convert it? Under what conditions would re-converting be a good idea even if it is not required?

(2) I believe it is recommended to convert models to TensorRT on the GPU architecture they will run on. E.g. if we are deploying on a Tegra, we should do the conversion on that specific type of Tegra. Is that correct? If we deploy to multiple GPU architectures, it seems it would behoove us to ship a generic model format (e.g. ONNX) and then convert it on the device. Some of our models are large and take tens of seconds to convert. In that case, it seems our system would be forced to have downtime whenever we upgrade a model. What is the recommended way to avoid downtime?

Thanks!

Hi,
We recommend checking the supported features at the link below.

You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.

Thanks!

Thanks for your quick answer. I think you are saying:

  1. It doesn’t matter where we convert to TensorRT files. It could be on a machine with the target GPU or a different GPU.
  2. Newer TensorRT versions are backward compatible with older TensorRT files. The only reason to re-convert would be if a newer version of TensorRT failed to read an older file, which should not happen.

Are the above two points correct?

However, the Developer Guide :: NVIDIA Deep Learning TensorRT Documentation says:
Engines created by TensorRT are specific to both the TensorRT version with which they were created and the GPU on which they were created.

So the two points above look incorrect to me. It looks like we should re-convert when we upgrade to a newer version of TensorRT, and that we should do a separate conversion for each GPU.

Hi,

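Yes, you are right. Currently, TensorRT does not support the features in your two points (reusing an engine on a different GPU or with a different TensorRT version). We need to re-build the TensorRT engine file.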

Thank you.
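
Since an engine is tied to both the TensorRT version and the GPU it was built on, one way to decide when a cached engine must be rebuilt is to encode those facts in the cache file name. The following is only a minimal sketch under assumptions: `engine_cache_name` and its parameters are hypothetical names, and the caller is assumed to supply the version and GPU strings (for example, the version from `tensorrt.__version__`).

```python
import hashlib
import re


def engine_cache_name(model_bytes: bytes, trt_version: str, gpu_name: str,
                      compute_capability: str) -> str:
    """Return a cache file name that changes whenever the model, the
    TensorRT version, or the GPU changes (hypothetical helper)."""
    # A short hash of the source model (e.g. the ONNX file contents).
    model_hash = hashlib.sha256(model_bytes).hexdigest()[:16]
    # Make the GPU name safe to use inside a file name.
    gpu_slug = re.sub(r"[^A-Za-z0-9]+", "-", gpu_name).strip("-").lower()
    return f"{model_hash}_trt{trt_version}_{gpu_slug}_sm{compute_capability}.engine"
```

If the file with this name is missing from the cache, the engine is rebuilt. A more conservative variant could also append the CUDA and cuDNN versions to the name, at the cost of some unnecessary rebuilds.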

Thank you spolisetty.

Given what you wrote:

(1) How does one minimize downtime when uploading a new network, given that it takes a while to convert to TensorRT?

(2) I would guess one would download pre-converted TensorRT files to the board. If that is true, I need to know which Tegra I am running on. What are the best practices for figuring out which board you are running on?

Thanks so much!
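
On the question of identifying the board: on Linux systems that boot with a device tree, which includes Jetson/Tegra boards as far as I know, the board model is usually exposed as a string in `/proc/device-tree/model`. The sketch below reads it. The helper name `read_board_model` is hypothetical, and the path should be treated as an assumption to verify on the boards in question.

```python
from pathlib import Path
from typing import Optional


def read_board_model(path: str = "/proc/device-tree/model") -> Optional[str]:
    """Return the board model string, or None if the file is not present."""
    p = Path(path)
    if not p.is_file():
        return None
    # Device-tree strings are NUL-terminated, so strip the trailing NUL.
    raw = p.read_bytes().rstrip(b"\x00")
    return raw.decode("utf-8", errors="replace").strip()
```

The returned string could then be mapped to the matching pre-built engine, or used as part of a cache key.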

Do you mean engine build time? We can increase the workspace for faster conversion.

You need to rebuild the TRT engine on the same machine/board you’re planning to deploy.

Thank you.
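
If the engine has to be rebuilt on the deployment device, one possible way to limit downtime is to keep serving the old engine while the new one is built, and to swap the file only once the build has finished. This is a rough sketch and not an NVIDIA recommendation: `install_engine_atomically` and `build_fn` are hypothetical names, and `build_fn` is assumed to wrap the actual TensorRT build and return the serialized engine bytes.

```python
import os
import tempfile
from typing import Callable


def install_engine_atomically(build_fn: Callable[[], bytes], engine_path: str) -> None:
    """Build a new engine, then swap it into place in one step.

    The old file at engine_path stays readable until os.replace runs,
    so a serving process can keep using it while build_fn works.
    """
    directory = os.path.dirname(os.path.abspath(engine_path))
    # Temp file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(build_fn())  # slow step: the old engine is still in place
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, engine_path)  # atomic rename on POSIX filesystems
    except BaseException:
        # On any failure, remove the partial file and leave the old engine untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Running this in a background thread or a separate process keeps the build off the serving path. The serving process still has to load the new file and switch over to it, which is not shown here.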