The SSDDetectionOutputPlugin's output is wrong on my own dataset

I used TensorRT’s built-in createSSDDetectionOutputPlugin function to implement my SSD model and, after fixing some bugs, finished my first version. The forward pass takes 8-9 ms per image on a 1080 Ti, not counting data preprocessing or host-device memory transfers.

After that, I replaced the official SSD model with my own model trained on my own dataset. It detects some objects, but some clearly visible targets are not detected!

Below are the softmax and detection_output results of the official SSD (running my own model and data) and of TensorRT 4.0 SSD. I filtered the softmax output to print only one specific class with a confidence of at least 0.9.
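For reference, the filtering step can be sketched roughly as follows. This is a minimal illustration rather than the exact code I ran: the function name `filter_by_class` and the anchor id 9999 row are hypothetical, and the other two rows are copied from the softmax dumps in this post.

```python
def filter_by_class(rows, class_id, min_conf=0.9):
    """Return (anchor_id, confidence) pairs whose score for class_id is >= min_conf."""
    return [
        (anchor_id, scores[class_id])
        for anchor_id, scores in rows
        if scores[class_id] >= min_conf
    ]


# Each row: (anchor box id, softmax scores for background + 5 classes).
rows = [
    (6371, [0.049193, 0.000025, 0.000018, 0.000181, 0.950499, 0.000084]),
    (6485, [0.000009, 0.000002, 0.000001, 0.000007, 0.999956, 0.000024]),
    # Hypothetical anchor dominated by background; it should be filtered out.
    (9999, [0.900000, 0.020000, 0.020000, 0.020000, 0.020000, 0.020000]),
]

hits = filter_by_class(rows, class_id=4)
for anchor_id, conf in hits:
    print(f"[{anchor_id}]: {conf:.6f}")
```

Running the same filter over the softmax buffers of both frameworks makes it easy to diff them anchor by anchor.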

Official SSD’s softmax output (my own model and data, 5 classes + 1 background; only class 4 is printed, and the number in brackets is the anchor box’s id)

[6371]: 0.049193, 0.000025, 0.000018, 0.000181, 0.950499, 0.000084,
[6479]: 0.022842, 0.000004, 0.000002, 0.000037, 0.977074, 0.000042,
[6485]: 0.000009, 0.000002, 0.000001, 0.000007, 0.999956, 0.000024,
[6491]: 0.000482, 0.000008, 0.000002, 0.000016, 0.999474, 0.000017,
[6515]: 0.051670, 0.000033, 0.000031, 0.000425, 0.947679, 0.000162,
[6599]: 0.000084, 0.000002, 0.000002, 0.000011, 0.999850, 0.000052,
[6605]: 0.021234, 0.000008, 0.000005, 0.000018, 0.978697, 0.000039,
[6623]: 0.010059, 0.000010, 0.000004, 0.000075, 0.989802, 0.000049,
[6629]: 0.000005, 0.000005, 0.000003, 0.000022, 0.999889, 0.000077,
[6635]: 0.000034, 0.000010, 0.000008, 0.000036, 0.999834, 0.000078,
[6743]: 0.000840, 0.000009, 0.000008, 0.000023, 0.999012, 0.000109,
[6749]: 0.014012, 0.000016, 0.000024, 0.000031, 0.985813, 0.000104,
[8134]: 0.089027, 0.000092, 0.000021, 0.000179, 0.910595, 0.000087,
[8146]: 0.043592, 0.000238, 0.000042, 0.000333, 0.955646, 0.000149,

and its detection output (image id, class label, confidence, and 4 coordinate values)
[0]: 0, 4, 0.999956, 0.102853, 0.232255, 0.394229, 0.490054,
[1]: 0, 4, 0.999889, 0.364039, 0.280085, 0.678515, 0.539498,
[2]: 0, 4, 0.619877, 0.206382, 0.252745, 0.499788, 0.506061,
[3]: 0, 1, 0.027311, 0.645953, -0.011268, 0.779389, 0.039354,

TensorRT4.0-SSD’s softmax output
[6371]: 0.049193, 0.000025, 0.000018, 0.000181, 0.950499, 0.000084,
[6479]: 0.022842, 0.000004, 0.000002, 0.000037, 0.977074, 0.000042,
[6485]: 0.000009, 0.000002, 0.000001, 0.000007, 0.999956, 0.000024,
[6491]: 0.000482, 0.000008, 0.000002, 0.000016, 0.999474, 0.000017,
[6515]: 0.051669, 0.000033, 0.000031, 0.000425, 0.947679, 0.000162,
[6599]: 0.000084, 0.000002, 0.000002, 0.000011, 0.999850, 0.000052,
[6605]: 0.021234, 0.000008, 0.000005, 0.000018, 0.978697, 0.000039,
[6623]: 0.010059, 0.000010, 0.000004, 0.000075, 0.989802, 0.000049,
[6629]: 0.000005, 0.000005, 0.000003, 0.000022, 0.999889, 0.000077,
[6635]: 0.000034, 0.000010, 0.000008, 0.000036, 0.999834, 0.000078,
[6743]: 0.000840, 0.000009, 0.000008, 0.000023, 0.999012, 0.000109,
[6749]: 0.014012, 0.000016, 0.000024, 0.000031, 0.985813, 0.000104,
[8134]: 0.089027, 0.000092, 0.000021, 0.000179, 0.910594, 0.000087,
[8146]: 0.043592, 0.000238, 0.000042, 0.000333, 0.955646, 0.000149,

and its detection output
[0]: 0, 4, 0.999956, 0.102853, 0.232255, 0.394229, 0.490054
[1]: 0, 1, 0.0273107, 0.645953, 0, 0.779389, 0.0393541

TensorRT5.0-SSD’s softmax output
… SoftmaxOUT …
[6371]: 0.049193, 0.000025, 0.000018, 0.000181, 0.950499, 0.000084,
[6479]: 0.022842, 0.000004, 0.000002, 0.000037, 0.977074, 0.000042,
[6485]: 0.000009, 0.000002, 0.000001, 0.000007, 0.999956, 0.000024,
[6491]: 0.000482, 0.000008, 0.000002, 0.000016, 0.999474, 0.000017,
[6515]: 0.051669, 0.000033, 0.000031, 0.000425, 0.947679, 0.000162,
[6599]: 0.000084, 0.000002, 0.000002, 0.000011, 0.999850, 0.000052,
[6605]: 0.021234, 0.000008, 0.000005, 0.000018, 0.978697, 0.000039,
[6623]: 0.010059, 0.000010, 0.000004, 0.000075, 0.989802, 0.000049,
[6629]: 0.000005, 0.000005, 0.000003, 0.000022, 0.999889, 0.000077,
[6635]: 0.000034, 0.000010, 0.000008, 0.000036, 0.999834, 0.000078,
[6743]: 0.000840, 0.000009, 0.000008, 0.000023, 0.999012, 0.000109,
[6749]: 0.014012, 0.000016, 0.000024, 0.000031, 0.985813, 0.000104,
[8134]: 0.089027, 0.000092, 0.000021, 0.000179, 0.910594, 0.000087,
[8146]: 0.043592, 0.000238, 0.000042, 0.000333, 0.955646, 0.000149,

and its detection output
[0]: 0, 4, 0.999956, 0.102853, 0.232255, 0.394229, 0.490054
[1]: 0, 4, 0.999889, 0.364039, 0.280085, 0.678515, 0.539498
[2]: 0, 4, 0.61988, 0.206382, 0.252745, 0.499788, 0.506061
[3]: 0, 1, 0.0273107, 0.645953, 0, 0.779389, 0.0393541

The NMS threshold and confidence threshold are the same in all cases: 0.45 and 0.01, respectively.
Even when I reduce the NMS threshold to 0.01, it has no effect.
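To check what these thresholds should do, here is a minimal sketch of greedy per-class NMS as it is commonly described for SSD. It is a reference illustration only, not the plugin's actual implementation; the helper names `iou` and `nms` are my own, and the boxes are the three class-4 detections from the official SSD output above.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, conf_thresh=0.01, nms_thresh=0.45):
    """Greedy NMS over (confidence, box) pairs belonging to one class."""
    # Drop low-confidence candidates, then visit the rest from best to worst.
    cands = sorted(
        (d for d in dets if d[0] >= conf_thresh),
        key=lambda d: d[0],
        reverse=True,
    )
    kept = []
    for d in cands:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(d[1], k[1]) <= nms_thresh for k in kept):
            kept.append(d)
    return kept


# Class-4 detections reported by the official SSD (confidence, box).
class4 = [
    (0.999956, (0.102853, 0.232255, 0.394229, 0.490054)),
    (0.999889, (0.364039, 0.280085, 0.678515, 0.539498)),
    (0.619877, (0.206382, 0.252745, 0.499788, 0.506061)),
]

kept = nms(class4)
print(len(kept), "boxes kept at nms_thresh=0.45")
```

Under this sketch the pairwise overlaps of the three boxes are all below 0.45, so a reference NMS with these thresholds has no reason to drop the second class-4 object.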

The image I used contains two objects of the same class, whose class index is 4. So the detection outputs of TensorRT 5.0 SSD and the official SSD are at least correct, but TensorRT 4.0 SSD’s detection output misses one object. All three produce the same softmax output, so there is reason to suspect that something is wrong with the detection plugin in TensorRT 4.0.
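The comparison between two detection outputs can also be automated. The following is a rough sketch, assuming each detection is reduced to (label, confidence, box) and that two boxes count as the same detection when they share a label and overlap with IoU of at least 0.5; the function name `missing_detections` and the 0.5 matching threshold are my own assumptions. The data is copied from the official SSD and TensorRT 4.0 outputs above.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def missing_detections(reference, candidate, min_iou=0.5):
    """Indices of reference detections that have no same-label match in candidate."""
    missing = []
    for i, (label, _conf, box) in enumerate(reference):
        matched = any(
            label == c_label and iou(box, c_box) >= min_iou
            for c_label, _c_conf, c_box in candidate
        )
        if not matched:
            missing.append(i)
    return missing


# (label, confidence, box) from the official SSD detection output.
official = [
    (4, 0.999956, (0.102853, 0.232255, 0.394229, 0.490054)),
    (4, 0.999889, (0.364039, 0.280085, 0.678515, 0.539498)),
    (4, 0.619877, (0.206382, 0.252745, 0.499788, 0.506061)),
    (1, 0.027311, (0.645953, -0.011268, 0.779389, 0.039354)),
]

# (label, confidence, box) from the TensorRT 4.0 detection output.
trt4 = [
    (4, 0.999956, (0.102853, 0.232255, 0.394229, 0.490054)),
    (1, 0.0273107, (0.645953, 0.0, 0.779389, 0.0393541)),
]

missing = missing_detections(official, trt4)
print("official detections missing from TensorRT 4.0:", missing)
```

Matching by overlap rather than exact coordinates matters here because the TensorRT output reports ymin as 0 where the official SSD reports -0.011268 for the class-1 box, so an exact comparison would flag that box as missing too.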

I also compared TensorRT 4.0 SSD and the official SSD on the Pascal VOC dataset. The outputs of the softmax and earlier layers are almost the same between the two models, but there is a big difference after the detection_output layer.

With the same input picture, the official SSD’s detection_output layer outputs 55 bounding boxes at a confidence threshold of 0.01, although most of those boxes have a confidence below 0.1.

But TensorRT 4.0 SSD’s detection_output outputs only 8 bounding boxes.

Only two or three of the objects are real detections, so this issue doesn’t affect the final detection results on Pascal VOC.

But when I switch to my own dataset, this issue seriously affects the final detection results, as described above.

I also ran the same test on TensorRT 5.0 SSD. Strangely, its softmax output differs slightly from the official SSD’s, but the detection_output results are very similar; e.g. TensorRT 5.0 SSD outputs 51 bounding boxes.

And on my own dataset, TensorRT 5.0 SSD’s result is OK; there are no missed detections.

After some exploration, I found that this behavior appears only in TensorRT 4.0.

So I am wondering: is the source code of createSSDDetectionOutputPlugin in TensorRT 5.0 different from that in TensorRT 4.0?

For my project, I ultimately need to deploy the model to my Jetson TX2, so I have to use TensorRT 4.0, because TensorRT 5.0 is not yet available for the Jetson TX2. If I can’t solve this problem, the only option is to rewrite the detection plugin.

Hello,

Many improvements and fixes have been committed between TensorRT 4.0 and 5.0. If you’d like to preview TRT 5.0 on the Jetson platform, please consider trying the JetPack 4.1.1 Developer Preview, which contains TRT 5.x.

But the release notes say the JetPack 4.1.1 Developer Preview supports only the NVIDIA Jetson AGX Xavier Developer Kit; they don’t mention the Jetson TX2.

And during installation, at the step for selecting the development environment, there is only one option: Jetson AGX Xavier Developer Kit. Or does it support all Jetson platforms, including the TX2?

I’m not sure whether it’s possible to flash the OS and CUDA/cuDNN etc. directly to the Jetson TX2.

Can I extract only TensorRT 5.0 and put it on my Jetson TX2? I don’t want to upgrade the whole system just to get TensorRT 5.0.

If I can’t flash the OS and TensorRT 5.0 through JetPack 4.1, my current system will be broken, and I will have to reflash the OS and TensorRT 4.0 through JetPack 3.3. These unnecessary operations are exactly what I want to avoid; they take too much time.