Problems with NVIDIA SSDDetectionOutputPlugin


I am currently trying to implement the S^3FD face detector in TensorRT.
It is based on the SSD detector, but I don’t want to parse the Caffe prototxt file. Instead I am importing the weights and building the network natively in TensorRT.
Now I have a problem with the last layer, detection_out.

Creating the network works, but when I enqueue a request on the execution context, TensorRT dies without any error message, and the Jupyter notebook kernel dies with it.

I want to use Nvidia plugin layer created by

TENSORRTAPI INvPlugin * createSSDDetectionOutputPlugin(DetectionOutputParameters param)

I created the param structure like this:

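const char* createDetectionOutputPlugin(const char* name_c) {
        // Field order (as declared in NvInferPlugin.h for the TensorRT version assumed here):
        // shareLocation, varianceEncodedInTarget, backgroundLabelId, numClasses,
        // topK, keepTopK, confidenceThreshold, nmsThreshold, codeType,
        // inputOrder, confSigmoid, isNormalized
        DetectionOutputParameters param = {
            true, false, 0, 2, 5000, 750, float(0.05), float(0.3), CodeTypeSSD::CENTER_SIZE,
            {0, 1, 2}, true, false
        };
        IPlugin *ptr = createSSDDetectionOutputPlugin(param);
        return insertPlugin(name_c, ptr);
}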

The inputs are mbox_loc, mbox_conf_flat and mbox_priorbox, in that order.
Their output shapes (for a 1024 * 768 input image) are:

mbox_loc: (263340, 1, 1)
mbox_conf_flat: (131670, 1, 1)
mbox_priorbox: (2, 263340, 1)

These shapes exclude the batch dimension N and are in DimsCHW format. I am not sure whether the detection output plugin accepts them.
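A quick sanity check on the element counts above can rule out a size mismatch before looking at dimension order. The sketch below is only an illustration: SsdShapes and consistentPriorCount are hypothetical helper names, not TensorRT API, and it assumes the usual SSD convention of 4 box coordinates per prior with shared locations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the flattened sizes quoted in the post.
struct SsdShapes {
    std::int64_t locSize;    // elements in mbox_loc (4 per prior, assumed)
    std::int64_t confSize;   // elements in mbox_conf_flat (numClasses per prior)
    std::int64_t priorSize;  // second dimension of mbox_priorbox (4 per prior)
};

// Returns the number of priors if all three sizes agree, or -1 otherwise.
std::int64_t consistentPriorCount(const SsdShapes& s, int numClasses) {
    assert(numClasses > 0);
    if (s.locSize % 4 != 0 || s.confSize % numClasses != 0) {
        return -1;
    }
    const std::int64_t fromLoc = s.locSize / 4;
    const std::int64_t fromConf = s.confSize / numClasses;
    if (fromLoc != fromConf || s.priorSize != 4 * fromLoc) {
        return -1;
    }
    return fromLoc;
}
```

With the sizes from this post (263340, 131670, 263340) and 2 classes, the three counts agree on 65835 priors, so the element counts themselves are consistent.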

PS1. In Caffe the output shapes of these three layers are (1, 263340), (1, 131670) and (1, 2, 263340), so at least the element counts match. My guess is that the problem is the order of the dimensions.

PS2. The problem should be in the DetectionOutput plugin, because when I remove only this layer everything runs fine.

PS3. I have checked the samples. In sampleUffSSD the network was converted from TensorFlow and parsed from UFF. I create the plugin in the same way, but it doesn’t work.

Any advice will be appreciated!

I’m almost sure that this should be a bug…

According to
My understanding of the DetectionOutput plugin should be correct.
I also checked the values of the input tensors and they are all OK.
Additionally, I tried changing the tensor shapes, but every change resulted in an error message from TensorRT.

Problem solved!

The problem is related to the topK parameter of the DetectionOutput layer.
A value of 5000 is so large that TensorRT crashed at runtime.
After I reduced it to below 2500, it runs fine.
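One way to guard against this is to cap topK before the plugin is created. The sketch below is a hypothetical helper, not part of TensorRT. The cap is an assumption taken from this thread (values below 2500 worked here), not a documented limit, so the caller passes it in explicitly.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical guard: never request more candidates than there are priors,
// and never exceed a cap that was found to work empirically.
int safeTopK(int requestedTopK, int numPriors, int maxTopK) {
    assert(requestedTopK > 0 && numPriors > 0 && maxTopK > 0);
    return std::min(requestedTopK, std::min(numPriors, maxTopK));
}
```

For example, safeTopK(5000, 65835, 2400) yields 2400, which could then be written into the topK field of the param structure. The value 2400 is just an arbitrary choice below the 2500 reported above.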

Hi, @raobystorm,

Can I ask how you solved the score prediction with multiple background labels in S^3FD? I see that you set the background label to 0. How did you deal with the other background labels? Did you process them before NMS or after it?


Hi, @liuyoungshop
I assume that by multiple background labels you mean the 2-class predictions that come from convolutional layers of different sizes. Please refer to the original Caffe implementation of S^3FD from SFZhang15:

You can find their deploy.prototxt in the downloads on GoogleDrive or BaiduYun.

They flatten, permute and concatenate all background labels into one tensor before sending it to NMS.
That tensor corresponds to my label 0 input for the DetectionOutput plugin.
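To illustrate the permute, flatten and concatenate pattern described above, here is a minimal CPU sketch. It is not the S^3FD authors' code. FeatureMap, permuteAndFlatten and concatConf are hypothetical names, and it assumes each confidence layer outputs scores in CHW layout, which are permuted to HWC as in the usual Caffe SSD heads so that the class scores of each prior end up next to each other.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder for one confidence layer's output in CHW layout.
struct FeatureMap {
    int channels;
    int height;
    int width;
    std::vector<float> data;  // size == channels * height * width
};

// Permute CHW -> HWC and return the result as a flat vector.
std::vector<float> permuteAndFlatten(const FeatureMap& fm) {
    assert(fm.data.size() ==
           static_cast<std::size_t>(fm.channels) * fm.height * fm.width);
    std::vector<float> out(fm.data.size());
    for (int c = 0; c < fm.channels; ++c) {
        for (int h = 0; h < fm.height; ++h) {
            for (int w = 0; w < fm.width; ++w) {
                const std::size_t src =
                    (static_cast<std::size_t>(c) * fm.height + h) * fm.width + w;
                const std::size_t dst =
                    (static_cast<std::size_t>(h) * fm.width + w) * fm.channels + c;
                out[dst] = fm.data[src];
            }
        }
    }
    return out;
}

// Concatenate the flattened scores of all layers into one tensor.
std::vector<float> concatConf(const std::vector<FeatureMap>& maps) {
    std::vector<float> all;
    for (const FeatureMap& fm : maps) {
        const std::vector<float> flat = permuteAndFlatten(fm);
        all.insert(all.end(), flat.begin(), flat.end());
    }
    return all;
}
```

After this step, every prior contributes a contiguous (background, face) pair to the single confidence tensor, so one backgroundLabelId of 0 covers the predictions from all layers.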