I’m optimizing my TensorFlow model with TensorRT (not TF-TRT, because of its higher memory usage). I have a frozen .pb model and I’m planning to use the TF -> ONNX -> TensorRT engine flow, but the model has a couple of custom layers for which I need to write plugin layers. I want to use the Python API.
According to this, I need to create the plugin in C++, “package the layer using pybind11 in Python”, and then load it in the Python application. Here are my questions:
- I chose the TF -> ONNX -> TensorRT flow because I read in a forum post that it was the recommended flow. Is that true? If so, is there a reason this TensorFlow custom-layers section talks about the UFF flow?
- What does packaging the layer using pybind11 entail? Is there documentation or an example for it?
- Does “creating the plugin in C++” here mean subclassing IPluginV2DynamicExt and registering a plugin creator under a specific name and version? I ask because the Python API also exposes plugin field collections and the create_plugin function.
- In my optimization workflow I use the ONNX parser to parse the model and then build the engine file. At what point in that workflow do I add the plugin layers? After creating the network and before building the engine?
- How do I specify the point in the network definition at which a plugin layer should be inserted?