I’m optimizing my TensorFlow model using TensorRT (not TF-TRT, because of its higher memory usage). I have a .pb model and I’m planning to use the TF -> ONNX -> engine flow for optimization, but I have a couple of custom layers for which I need to create plugin layers. I want to use the Python API.
According to this I need to create the plugin in C++, “package the layer using pybind11 in Python” and load it in the Python application. Here are my questions:
I chose the TF -> ONNX -> TensorRT flow because I read in a forum post that it is the recommended flow. Is that true? Is there a reason this TensorFlow custom layers section talks about the UFF flow?
What does packaging the layer using pybind entail? Is there documentation or an example for it?
Does “creating the plugin in C++” here mean creating the IPluginV2DynamicExt subclass and registering a plugin creator with a specific name and version? I ask because the Python API does have field collection functionality and the create_plugin function.
In my optimization workflow I’m using an ONNX parser to parse the model into a network and then build an engine file. At what point in that workflow do I add the plugin layers? After creating the network and before building the engine?
How do I specify the point at which I want to add a plugin layer in a network definition?
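To make the two workflow questions above concrete, here is a minimal sketch of the ONNX-parser flow I mean. It assumes a TensorRT 8.x-style Python API (in particular `build_serialized_network`; older releases use a different build call), and `build_engine_from_onnx` and the file paths are hypothetical names used only for illustration. The comment in the middle marks the step my question is about.

```python
import os


def build_engine_from_onnx(onnx_path, engine_path):
    """Parse an ONNX file and write a serialized TensorRT engine (sketch only)."""
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(onnx_path)

    # Imported here so this module still loads where TensorRT is not installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # The ONNX parser works on an explicit-batch network definition.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    # The step I am asking about: the network definition exists here,
    # but the engine has not been built yet.

    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")

    with open(engine_path, "wb") as f:
        f.write(serialized)
    return engine_path
```

It would be called as `build_engine_from_onnx("model.onnx", "model.engine")`, where both paths are placeholders.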
The documentation page that you sent didn’t really go into the details of implementing the actual plugin in C++. Is there any documentation for that? Something that describes exactly which functions need to be implemented in a plugin class and a plugin creator class?
Also, this section says that to run TensorFlow networks in TensorRT you need to convert them to UFF. Is this up to date? I remember reading that the TF -> ONNX -> TensorRT workflow is the new recommended flow over the UFF flow.
My model has many common layers like LeakyRelu, BatchNorm, Transpose, Relu, Maxpool, Reshape, Add, Exp, Div and Mul that are not listed here. I don’t see them in the open source list of plugins in the TensorRT GitHub repo either. Does this mean I have to write a C++ plugin for each of these layers, register it, create the plugin and stitch it into my model?
These layers are very common, and I find it hard to believe that people go through this long pipeline for all of them.
Is there something I’m missing? Surely there should be a faster workflow for this?