Linux trained model to Windows TensorRT engine

What is the proposed workflow for converting a model (SavedModel,…) that was trained under Linux to a TensorRT engine that can be deployed under Windows?

As far as I understand, TensorFlow with TensorRT (TF-TRT) is not supported under Windows. Running this optimization scheme under Linux and then trying to use the serialized engine/plan under Windows is not possible, as engines/plans are not portable across OSs or different hardware. See the support matrix.

So I guess the only possibility is to export to an intermediate format (UFF, ONNX) under Linux and then use the TRT UFF/ONNX parsers under Windows to generate the engine? However, this seems like a rather cumbersome approach that may also suffer from incompatibilities or missing features between the different model representations. Are there alternative workflows or strategies?

Hi @jawacma,

You are correct. To convert the model on Windows you’d need to convert it from some intermediate format, preferably ONNX. See tf2onnx for converting your TF model to ONNX. Then try the ONNX parser.
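As a rough sketch, the Linux-side export and the Windows-side engine build could look like the following (the file names `saved_model_dir`, `model.onnx`, and `model.engine` are placeholders; `trtexec` ships with the TensorRT installation on both OSs):

```shell
# On Linux (or anywhere with TensorFlow installed): export the SavedModel to ONNX
python -m tf2onnx.convert --saved-model saved_model_dir --output model.onnx

# On the Windows deployment machine: build the engine from the ONNX model.
# The engine must be built on the target GPU, since plans are not portable.
trtexec --onnx=model.onnx --saveEngine=model.engine
```

The `trtexec` route avoids writing any parser code; the Python API shown in the TensorRT samples gives the same result with more control over builder settings.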

A potential issue: if not all ops in the model are supported by TensorRT, TF-TRT would fall back to the TF runtime for the unsupported ones. That fallback won't be possible on Windows, as you mentioned, since TF-TRT isn't supported there. However, if all ops in the model are supported by TensorRT, then you can create a complete TensorRT engine from the ONNX model, and this shouldn't be a problem.
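If you prefer the Python API over `trtexec`, a minimal engine-build sketch for the Windows side might look like this (assuming TensorRT 7.x with the `tensorrt` Python bindings installed; `model.onnx` and `model.engine` are placeholder names). Parsing will fail with a listed error for any op TensorRT doesn't support, which is a quick way to check whether your model is fully convertible:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path):
    builder = trt.Builder(TRT_LOGGER)
    # ONNX models require an explicit-batch network
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # Unsupported ops and other parse problems show up here
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space
    return builder.build_engine(network, config)

engine = build_engine("model.onnx")
if engine is not None:
    # Serialize the plan; it is only valid on this GPU / TRT version
    with open("model.engine", "wb") as f:
        f.write(engine.serialize())
```

Note that the resulting `model.engine` is tied to the GPU and TensorRT version it was built with, so the build step has to run on the deployment machine.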

Here’s a recent thread where a Windows user converted a TF model -> ONNX -> TensorRT, in case it helps as a reference: https://github.com/NVIDIA/TensorRT/issues/428