Hello, I’m working on converting an ONNX model to a TensorRT engine for use with DeepStream. The model is an SSD with a custom NMS layer for post-processing.
The first time, the original SSD produced a location output with dims [batch_size, num_default_boxes, 4] and a confidence output with dims [batch_size, num_default_boxes, num_classes]. These outputs were parsed by a custom NMS implementation on the CPU.
For example, on one test frame, after filtering out the background class (keeping the max confidence per box), 64 boxes remained valid.
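For reference, the CPU filtering step I described looks roughly like this (a minimal NumPy sketch under my assumptions: class 0 is the background class, the confidence threshold is a hypothetical value, and the NMS itself is omitted):

```python
import numpy as np

def filter_background(confidences, conf_threshold=0.4, background_class=0):
    """Keep boxes whose best non-background score passes the threshold.

    confidences: array of shape [num_default_boxes, num_classes]
    Returns indices of valid boxes, their best class ids, and their scores.
    Assumes the background class occupies column `background_class` (0 here).
    """
    # Drop the background column, then take the max over the remaining classes.
    fg_scores = confidences[:, background_class + 1:]
    best_scores = fg_scores.max(axis=1)
    best_classes = fg_scores.argmax(axis=1) + 1  # shift back to original class ids
    keep = best_scores >= conf_threshold
    return np.nonzero(keep)[0], best_classes[keep], best_scores[keep]
```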
The second time, I implemented the NMS layer as a custom TensorRT plugin and wanted to add it to the engine. Before adding this NMS layer, I had to add an argmax (TopK layer) to the network, with the confidence output as its input, to get the per-box scores and class indices.
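In NumPy terms, the TopK (k=1) step I added computes per-box scores and class indices like this (a sketch of what I expect the layer to compute, not the TensorRT API itself):

```python
import numpy as np

def topk1_over_classes(confidences):
    """Equivalent of a TopK layer with k=1 over the class axis.

    confidences: array of shape [num_default_boxes, num_classes]
    Returns (scores, class_indices), each of shape [num_default_boxes].
    Note: this max runs over ALL classes, including the background column.
    """
    scores = confidences.max(axis=1)
    indices = confidences.argmax(axis=1)
    return scores, indices
```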
The problem I ran into is that, after filtering the background class, the number of valid boxes is now 44 for the same frame I tested before.
That means that somehow, when I add layers to the network during the ONNX-to-engine conversion, something changes. I thought the network weights had changed, or that something else lowered the confidence output of the network.
I want to know why this happens, and any ideas for fixing it.
TensorRT Version: 6.0
GPU Type: GeForce GTX 1080 (D-GPU)
Nvidia Driver Version: 440.59
CUDA Version: 10.2
CUDNN Version: 10.1.243
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.5
Baremetal or Container (if container which image + tag):