Workflow for adding a plugin layer to a TensorFlow model


I’m optimizing my TensorFlow model using TensorRT (not TF-TRT, because of its higher memory usage). I have a .pb model and I’m thinking of using the TF -> ONNX -> engine flow for optimization, but I have a couple of custom layers for which I need to create plugin layers. I want to use the Python API.
According to this, I need to create the plugin in C++, “package the layer using pybind11 in Python” and load it in the Python application. Here are my questions:

  • I chose the TF -> ONNX -> TRT flow because I read in a forum post that it was the recommended flow. Is that true? Is there a reason this TensorFlow custom layers section talks about the UFF flow?
  • What does packaging the layer using pybind entail? Is there documentation or an example for it?
  • Does “creating the plugin in C++” here mean creating the IPluginV2DynamicExt subclass and registering a plugin creator with a specific name and version? I ask because the Python API does have field collection functionality and the create_plugin function.
  • In my optimization workflow I’m using an ONNX parser to parse the model and build an engine file. At what point in that workflow do I add the plugin layers? After creating the network and before building the engine?
  • How do I specify the point at which I want to add a plugin layer in a network definition?

Please refer to the links below on custom plugin implementation and the related sample:



Thanks for the links.

The documentation page that you sent didn’t really go into the details of implementing the actual plugin in C++. Is there any documentation for that? Something that describes exactly which functions need to be implemented in a plugin class and a plugin creator class?

Also, this section says that to run TensorFlow networks in TensorRT you need to convert them to UFF. Is this up to date? I remember reading that the TF -> ONNX -> TensorRT workflow is the new recommended flow over the UFF flow.

My model has many common layers like LeakyRelu, BatchNorm, Transpose, Relu, Maxpool, Reshape, Add, Exp, Div and Mul that are not listed here. I don’t see them in the open source list of plugins in the TRT GitHub repo either. Does this mean I have to write a C++ plugin for each of these layers, register it, create the plugin and stitch it into my model?
These layers are very common, and I find it hard to believe that people go through this long pipeline for all of them.
Is there something I’m missing? Surely there is a faster workflow for this?

Hi @infinityp913,

Please find answers to the queries in your description below:

  1. Yes, we recommend the ONNX workflow over UFF (deprecated) and TF-TRT when deploying TensorFlow models.
  2. For ONNX models with custom ops, this blog talks about using the onnx-graphsurgeon tool to insert a custom op (with the correct attributes) so that the TensorRT ONNX parser can offload the op to the corresponding plugin.
  3. See this python sample for reference: TensorRT/samples/python/onnx_packnet at master · NVIDIA/TensorRT · GitHub
  4. The plugins used in your ONNX model must be registered with the Plugin registry before you parse it.
  5. Unless you are constructing the network definition by hand, you do not place plugin layers yourself: the ONNX parser has a built-in fallback to plugins for ops that are not part of the standard ONNX op specification. You just have to ensure the name of the plugin matches the name of the custom ONNX op. FYI onnx-tensorrt/builtin_op_importers.cpp at dc22bb323ece3c65419717be8a0d3d0f318a61fa · onnx/onnx-tensorrt · GitHub
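
To make item 2 a bit more concrete, here is a minimal sketch (not taken from the blog or the sample) of how an op in the ONNX graph might be renamed so that the parser's plugin fallback from item 5 picks it up. The names `MyCustomOp`, `MyCustomPlugin` and the `alpha` attribute are hypothetical placeholders. The sketch assumes the node attributes are what the parser hands to the plugin creator as fields, so they should match what your creator expects.

```python
from types import SimpleNamespace


def retarget_to_plugin(nodes, source_op, plugin_name, plugin_attrs):
    """Rename every node whose op is `source_op` to `plugin_name`.

    Works on any objects exposing `.op` and a dict-like `.attrs`,
    which is what onnx-graphsurgeon nodes provide.
    Returns the number of nodes that were changed.
    """
    changed = 0
    for node in nodes:
        if node.op == source_op:
            node.op = plugin_name            # should equal the plugin creator's name
            node.attrs.update(plugin_attrs)  # assumed to map onto the plugin's fields
            changed += 1
    return changed


def rewrite_model(in_path, out_path, source_op, plugin_name, plugin_attrs):
    """Load an ONNX file, retarget the custom op, and save the result."""
    # Imported lazily so the helper above can be tried without these packages.
    import onnx
    import onnx_graphsurgeon as gs

    graph = gs.import_onnx(onnx.load(in_path))
    count = retarget_to_plugin(graph.nodes, source_op, plugin_name, plugin_attrs)
    graph.cleanup().toposort()
    onnx.save(gs.export_onnx(graph), out_path)
    return count


# Stand-in nodes, only to show the renaming without needing a real model.
demo_nodes = [
    SimpleNamespace(op="MyCustomOp", attrs={}),
    SimpleNamespace(op="Relu", attrs={}),
]
changed = retarget_to_plugin(demo_nodes, "MyCustomOp", "MyCustomPlugin", {"alpha": 0.1})
```

On a real model you would call something like `rewrite_model("model.onnx", "model_plugin.onnx", "MyCustomOp", "MyCustomPlugin", {"alpha": 0.1})` (file names are placeholders) and then feed the rewritten file to the ONNX parser.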

Layers that are not present in the support matrix need to be implemented as custom layers (plugins).
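
For an op that does need a plugin, the order described in items 4 and 5 could be sketched roughly as below. This is only an illustration under some assumptions: TensorRT 8 or later, a plugin compiled into a shared library that registers its creator when loaded (for example via `REGISTER_TENSORRT_PLUGIN` in the C++ source), and hypothetical names such as `libmyplugin.so` and `MyCustomPlugin`. The function names `missing_creators` and `build_engine` are made up for this example.

```python
import ctypes


def missing_creators(required, registered):
    """Return the required plugin names that have no registered creator."""
    return sorted(set(required) - set(registered))


def build_engine(onnx_path, plugin_lib, required_plugins):
    """Register plugins first, then parse the ONNX model and build an engine."""
    # Imported lazily; this part needs a TensorRT install and a GPU to run.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)

    # Loading the shared library is assumed to run its static plugin registration.
    ctypes.CDLL(plugin_lib)
    # Also register the plugins that ship with TensorRT.
    trt.init_libnvinfer_plugins(logger, "")

    # Check the registry BEFORE parsing, as item 4 requires.
    registered = [c.name for c in trt.get_plugin_registry().plugin_creator_list]
    missing = missing_creators(required_plugins, registered)
    if missing:
        raise RuntimeError(f"No plugin creator registered for: {missing}")

    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    # The parser falls back to a registered plugin when the op name matches.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    return builder.build_serialized_network(network, config)


# Pure-Python check of the helper with hypothetical plugin names.
still_missing = missing_creators(["MyCustomPlugin", "OtherPlugin"], ["OtherPlugin"])
```

With this ordering there is no separate "insert the plugin into the network" step: the plugin layer is created by the parser at the position of the matching ONNX node.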

Thank you.