Hello, I’m working on converting an ONNX model to a TensorRT engine for use with DeepStream. The model is an SSD with a custom NMS layer for post-processing.
The first time, the original SSD produced a location output with dims [batch_size, num_default_boxes, 4] and a confidence output with dims [batch_size, num_default_boxes, num_classes]. These outputs were parsed by a custom NMS implementation on the CPU.
For example, on one test frame, after filtering out the background class (keeping the max confidence per box), 64 boxes remained valid.
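For reference, the CPU filtering step I described looks roughly like this (a minimal NumPy sketch under my assumptions: class 0 is the background class, the confidence threshold is a hypothetical value, and the NMS itself is omitted):

```python
import numpy as np

def filter_background(confidences, conf_threshold=0.4, background_class=0):
    """Keep boxes whose best non-background score passes the threshold.

    confidences: array of shape [num_default_boxes, num_classes]
    Returns indices of valid boxes, their best class ids, and their scores.
    Assumes the background class occupies column `background_class` (0 here).
    """
    # Drop the background column, then take the max over the remaining classes.
    fg_scores = confidences[:, background_class + 1:]
    best_scores = fg_scores.max(axis=1)
    best_classes = fg_scores.argmax(axis=1) + 1  # shift back to original class ids
    keep = best_scores >= conf_threshold
    return np.nonzero(keep)[0], best_classes[keep], best_scores[keep]
```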
The second time, I implemented the NMS layer as a custom TensorRT plugin and wanted to add it to the engine. Before adding this NMS layer, I had to add an argmax (TopK layer) to the network, with the confidence output as its input, to get the per-box scores and class indices.
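In NumPy terms, the TopK (k=1) step I added computes per-box scores and class indices like this (a sketch of what I expect the layer to compute, not the TensorRT API itself):

```python
import numpy as np

def topk1_over_classes(confidences):
    """Equivalent of a TopK layer with k=1 over the class axis.

    confidences: array of shape [num_default_boxes, num_classes]
    Returns (scores, class_indices), each of shape [num_default_boxes].
    Note: this max runs over ALL classes, including the background column.
    """
    scores = confidences.max(axis=1)
    indices = confidences.argmax(axis=1)
    return scores, indices
```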
The problem I ran into is that, after filtering the background class, the number of valid boxes is now 44 for the same frame I tested before.
That means that somehow, when I add layers to the network during the ONNX-to-engine conversion, something changes. I thought the network weights had changed, or that something else lowered the confidence output of the network.
I want to know why this happens, and any ideas for fixing it.
TensorRT Version: 6.0
GPU Type: GeForce GTX 1080 (D-GPU)
Nvidia Driver Version: 440.59
CUDA Version: 10.2
CUDNN Version: 10.1.243
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.5
Baremetal or Container (if container which image + tag):