TF2-TRT: steps to convert to a standalone TRT plan are missing

In the TF_TRT user guide https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html

Section 2.10 describes how to create a stand-alone TensorRT plan
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#tensorrt-plan

It gives the following procedure:

for n in trt_graph.node:
  if n.op == "TRTEngineOp":
    print("Node: %s, %s" % (n.op, n.name.replace("/", "_")))
    with tf.gfile.GFile("%s.plan" % (n.name.replace("/", "_")), 'wb') as f:
      f.write(n.attr["serialized_segment"].s)
  else:
    print("Exclude Node: %s, %s" % (n.op, n.name.replace("/", "_")))

“where trt_graph is the output of create_inference_graph”

create_inference_graph has been removed in TensorFlow >= 2.0.
How do you suggest we convert our frozen function to a .plan in TensorFlow >= 2.0?

Hi,

Please refer to below links:
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#worflow-with-savedmodel
https://github.com/tensorflow/tensorrt/blob/master/tftrt/examples/image-classification/image_classification.py#L158
https://github.com/tensorflow/tensorrt/tree/master/tftrt/examples
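
For reference, the SavedModel workflow in the first link looks roughly like the following (a minimal sketch, assuming the TF2 TrtGraphConverterV2 API; the directory names are placeholders):

# Minimal sketch of the TF2 SavedModel workflow with TrtGraphConverterV2.
# Directory names are placeholders; precision and other options can be set
# via conversion_params.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(input_saved_model_dir="my_saved_model")
converter.convert()                   # replaces supported subgraphs with TRTEngineOp nodes
converter.save("my_trt_saved_model")  # writes a converted SavedModel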

Thanks

Hello,

convert_variables_to_constants_v2 returns a graph function (a ConcreteFunction), not a GraphDef, so it has no .node attribute.
There is a vital step missing from your instructions.
Also, the GitHub examples you link to do not serialize the frozen graph function to a .plan.

Please update the instructions so that we can go from a frozen graph function to a .plan.

For example:

frozen_graph_def = frozen_func.graph.as_graph_def()

However, when you do this, your TRTEngineOp is actually empty when you print it.
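
To be concrete, here is roughly what I am doing end to end (a sketch; the SavedModel directory and signature key are placeholders, and I am assuming the model was converted and saved with TrtGraphConverterV2 as in the links above):

# Load the TF-TRT converted SavedModel, freeze it, and adapt the guide's
# loop to the resulting GraphDef. Paths and the signature key are placeholders.
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

saved_model = tf.saved_model.load("my_trt_saved_model")
func = saved_model.signatures["serving_default"]
frozen_func = convert_variables_to_constants_v2(func)
frozen_graph_def = frozen_func.graph.as_graph_def()

for n in frozen_graph_def.node:
    if n.op == "TRTEngineOp":
        name = n.name.replace("/", "_")
        segment = n.attr["serialized_segment"].s
        print("Node: %s, %s, %d bytes" % (n.op, name, len(segment)))
        if segment:  # in my case this is empty
            with tf.io.gfile.GFile("%s.plan" % name, "wb") as f:
                f.write(segment)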

Hello,

After getting the frozen_graph_def, how do you suggest we serialize it further to a .plan? The ‘serialized_segment’ attr of the ‘TRTEngineOp’ nodes is actually empty.

Hi,

I completely ditched the serialization to a .plan file since the documentation is hard to follow/incomplete.
Right now, the most straightforward way is to go from TF2 → ONNX and then parse your ONNX model with TensorRT C++ code. It is still quite some work, but at least it's understandable and it actually works.
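
In case it helps, the workaround looks roughly like this (a sketch using tf2onnx for the export and the TensorRT Python API instead of C++; file names are placeholders and the exact builder calls depend on your TensorRT version):

# Sketch of the TF2 -> ONNX -> TensorRT plan workaround.
# File names are placeholders.
from tf2onnx import convert as tf2onnx_convert
import tensorrt as trt

# Export the TF2 SavedModel to ONNX (the tf2onnx command-line tool does the same).
tf2onnx_convert.from_saved_model("my_saved_model", output_path="model.onnx")

# Parse the ONNX model and build a serialized TensorRT engine (.plan).
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30             # deprecated in newer TensorRT releases
engine = builder.build_engine(network, config)  # build_serialized_network on newer TensorRT
with open("model.plan", "wb") as f:
    f.write(engine.serialize())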

I stumbled upon the exact same problem: for the preceding steps of the pipeline the documentation nicely distinguishes between TF1 and TF2. However, in section 2.10 it does not even mention that create_inference_graph is only usable in TF1, and it does not explain how to create a .plan file with TF2.

Is the recommended way for TF2 really to create ONNX first and then create the .plan from there?

Is there any update on this by now? I have had this problem for quite a while, and I am using the TF2->ONNX, ONNX->TensorRT approach now, but the inference time is a lot worse than when using TF-TRT.

I get an inference time of 25ms with TF-TRT, and 700ms with the TensorRT plan.