How can we build a TensorRT model just once and run it on different GPUs?

Description

We have GPUs with different compute capabilities (RTX 2080 Ti, V100, GTX 1080 Ti, M40, etc.). How can we build a TensorRT engine just once and then run it on all of those GPUs?

Thanks.

Environment

TensorRT Version: 5.0.2.6
GPU Type: RTX 2080 Ti, V100, GTX 1080 Ti, M40, etc.
Nvidia Driver Version: 430.40
CUDA Version: 10.0
CUDNN Version: 7.3.1
Operating System + Version: Ubuntu 16.04.5 LTS
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Serialized engines are not portable across platforms or TensorRT versions, and they are specific to the exact GPU model they were built on. So you will need to build a separate engine on (or for) each GPU model you deploy to.
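A common way to live with this restriction is to build the engine once per GPU model at deployment time and cache the serialized bytes on disk, keyed by the GPU name and TensorRT version. Below is a minimal sketch of that caching logic; `build_engine_bytes` is a hypothetical placeholder for your actual build step (e.g. the TensorRT builder API), and the file-naming scheme is an assumption, not something TensorRT prescribes.

```python
# Sketch: cache one serialized TensorRT engine per exact GPU model.
# Serialized engines are not portable across GPU models or TensorRT
# versions, so the cache key includes both.
import os

def engine_cache_path(cache_dir, gpu_name, trt_version):
    """Return a cache filename unique to this GPU model and TRT version."""
    safe = gpu_name.replace(" ", "_")
    return os.path.join(cache_dir, f"model_{safe}_trt{trt_version}.engine")

def load_or_build_engine(cache_dir, gpu_name, trt_version, build_engine_bytes):
    """Reuse a cached engine for this exact GPU; otherwise build and cache it."""
    os.makedirs(cache_dir, exist_ok=True)
    path = engine_cache_path(cache_dir, gpu_name, trt_version)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = build_engine_bytes()  # expensive: runs once per GPU model
    with open(path, "wb") as f:
        f.write(data)
    return data
```

With this, the first run on each GPU model pays the build cost and later runs just deserialize the cached engine for that model.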
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-710-ea/tensorrt-developer-guide/index.html#serial_model_c

Thanks

Thank you very much for your reply @SunilJB

By the way, can we say that serialized engines are compatible across GPUs that share the same compute capability?

Thanks.

No. Engines are specific to the exact GPU model they were built on, not just the compute capability.

Thanks