Doubts on reusing serialized engines on different platforms or TensorRT versions

Hi experts,

I found some places have below note:
Note: Serialized engines are not portable across platforms or TensorRT versions.

like here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#platform-matrix

And I found the explanation is:
TensorRT includes import methods to help you express your trained deep learning model for TensorRT to optimize and run. It is an optimization tool that applies graph optimization and layer fusion and finds the fastest implementation of that model leveraging a diverse collection of highly optimized kernels, and a runtime that you can use to execute this network in an inference context.

My questions are:

  1. Can I serialize an engine on one machine and reuse it on other machines (both on the same platform and on different platforms)? Will it work?
  2. If it does work, how much accuracy loss or extra time cost would it incur?

Any comments will be appreciated.
Thanks!

Hi microlj,

I’ve asked basically this question on this forum before, without a complete answer: https://devtalk.nvidia.com/default/topic/1046137/tensorrt/serialized-engine-validity/post/5308506/#5308506

In our applications, we’ve assumed that a different (device-specific) CUDA Compute Capability or a different TensorRT version (including the minor version) implies an incompatible PLAN. Whether an equal (ComputeCapability, TRTVersion) pair implies compatible PLANs in general, I am not sure. As mentioned in the thread I linked, the driver version might matter as well.
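To make that assumption concrete: in our code we key each cached PLAN by the build environment, so a PLAN built under one (ComputeCapability, TRTVersion) pair is simply never handed to a runtime with a different pair. A rough sketch of that scheme in plain Python (the helper names are my own, not part of the TensorRT API):

```python
import os

def plan_key(compute_capability: str, trt_version: str) -> str:
    """Cache key so PLANs built under one (CC, TRT) pair are never
    loaded under another."""
    return f"cc{compute_capability}_trt{trt_version}"

def save_plan(directory: str, key: str, plan_bytes: bytes) -> str:
    """Write the serialized engine bytes under a key-specific filename."""
    path = os.path.join(directory, f"engine_{key}.plan")
    with open(path, "wb") as f:
        f.write(plan_bytes)
    return path

def load_plan(directory: str, key: str):
    """Return serialized engine bytes only if one was built under the
    same (compute capability, TensorRT version); otherwise None, which
    signals the caller to rebuild the engine from the original model."""
    path = os.path.join(directory, f"engine_{key}.plan")
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return f.read()
```

A cache miss on a new machine then costs one engine rebuild, which is slow but correct; a false hit could mean silent misbehavior, which is worse.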

Good luck,
Tom

@tom.peters

Many thanks for your response. Your answer resolved my doubts.

So the PLAN file cannot be reused across different Compute Capabilities, TensorRT versions, or even driver versions.

So the PLAN file cannot be reused across different Compute Capabilities, TensorRT versions, or even driver versions.

I don’t know this for sure, but it’s my operating assumption. NVIDIA claimed (in the thread I linked) that driver version doesn’t matter, but my colleague reported to me that it did affect him.
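Given that uncertainty, the defensive approach is to record the full build environment (including the driver version) next to the PLAN and refuse to load on any mismatch, so an incompatibility fails loudly instead of misbehaving. A sketch in plain Python (the sidecar-file convention and helper names are mine, not TensorRT's):

```python
import json

def write_plan_with_meta(path, plan_bytes, cc, trt_version, driver_version):
    """Store the build environment in a JSON sidecar next to the PLAN
    so a mismatch is detected explicitly at load time."""
    with open(path, "wb") as f:
        f.write(plan_bytes)
    meta = {"cc": cc, "trt": trt_version, "driver": driver_version}
    with open(path + ".json", "w") as f:
        json.dump(meta, f)

def read_plan_checked(path, cc, trt_version, driver_version):
    """Return the PLAN bytes only if the recorded build environment
    matches the current one; otherwise raise, prompting a rebuild."""
    with open(path + ".json") as f:
        meta = json.load(f)
    current = {"cc": cc, "trt": trt_version, "driver": driver_version}
    if meta != current:
        raise RuntimeError(
            f"PLAN built under {meta}, running under {current}; "
            "rebuild the engine from the original model"
        )
    with open(path, "rb") as f:
        return f.read()
```

If NVIDIA is right that the driver doesn't matter, the only cost of this check is an occasional unnecessary rebuild; if my colleague is right, it catches a real incompatibility.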

Good luck,
Tom