Is there a plan to enable custom layers in GIE (TensorRT)?

The current API of GIE seems to support only a set of pre-defined layers, such as FC, CONV, etc., which is quite limiting. I am wondering whether there is a plan to add an API that lets us define custom layers in the future?

If by “some API” you mean something like a Caffe Python layer, probably not any time soon. Otherwise, you might want to give some examples of what kinds of custom layers you would like to be able to define.

It’s certainly part of the plan that TensorRT capabilities will expand in the future.

If you have specific ideas, consider filing an RFE (request for enhancement). Go to developer.nvidia.com, sign up as a registered developer if you are not one already, and file a bug. Include “RFE” in the bug description, describe the characteristics of the custom layer you would like to be able to use or specify, and say what kind of API you would use to specify it.

@txbob Thanks for your reply! We want to do something like this: define a layer class, implement a method such as ‘forward’, and then feed the class into GIE to generate an optimised inference network.
Something like this:

//  provided by GIE (hypothetical interface)
class GIECustomLayer {
public:
    virtual ~GIECustomLayer() = default;
    virtual void forward(...) = 0;
};

//  custom subclass implementing the layer's computation
class AdditionLayer : public GIECustomLayer {
public:
    void forward(...) override {
        ...
    }
};

//  register the custom layer with GIE when building the network
auto layer = new AdditionLayer(...);
network->addCustomLayer(layer, ...);

I’m personally not optimistic that such a general layer-definition method could be incorporated any time soon. TensorRT’s purpose is to convert a network definition into something that will run efficiently on a GPU. An arbitrary definition like the one above gives no clues about how to do that; it is akin to the aforementioned Caffe Python layer (IMO), and it effectively asks for a solution to the question “how do I parallelize this code?” for any arbitrary code.

There are no general solutions to that problem in computer science today that I am aware of, which is why I think your request is a tall order.

Nevertheless, feel free to file the aforementioned RFE.

If you can come up with a more constrained layer definition that imposes a regularized structure for which parallelization might be simplified, you may have better luck. As an example, think about how cuDNN specifies networks (or how a network is specified to cuDNN, if you prefer). If you can provide a CUDA kernel (instead of a bare “forward” function) that takes cuDNN-described tensors as its input and output, you may be much closer to something that is feasible for a “custom” layer.
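As a rough illustration of that more constrained approach, here is a minimal sketch, not an existing TensorRT/GIE API: the launchAdditionLayer() wrapper is hypothetical and stands in for whatever hook the framework might expose, while the kernel and the cuDNN descriptor query are ordinary CUDA/cuDNN usage. It assumes a packed NCHW layout and ignores the strides.

#include <cudnn.h>
#include <cuda_runtime.h>

// Element-wise addition kernel, mirroring the AdditionLayer example above.
__global__ void additionKernel(const float* a, const float* b, float* out, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        out[i] = a[i] + b[i];
    }
}

// Hypothetical wrapper: the framework hands the layer device pointers plus a
// cuDNN tensor descriptor describing their common shape, and the layer
// launches its kernel on the supplied stream.
void launchAdditionLayer(const float* dA, const float* dB, float* dOut,
                         cudnnTensorDescriptor_t desc, cudaStream_t stream)
{
    cudnnDataType_t dataType;
    int n, c, h, w, nStride, cStride, hStride, wStride;
    cudnnGetTensor4dDescriptor(desc, &dataType, &n, &c, &h, &w,
                               &nStride, &cStride, &hStride, &wStride);

    // Assumes a packed NCHW tensor, so the element count is simply n*c*h*w.
    int count = n * c * h * w;
    int block = 256;
    int grid = (count + block - 1) / block;
    additionKernel<<<grid, block, 0, stream>>>(dA, dB, dOut, count);
}

The point is that a per-element kernel plus the shape information carried by the descriptor gives the optimizer something concrete to schedule, unlike an opaque forward() method.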

@txbob, thanks! I completely understand your point and believe it is true. In fact, I am working on R&D for autopilot systems. I will try to convert our model to GIE, identify some concrete issues and requests, and file them with NVIDIA :)