Thanks a lot!
I think I’ve got some good news this time. With RetinaNet, the TRT engine finally produces results comparable to those of the TLT model, with only minor differences.
However, I observed something odd. According to the repo of the NMS plugin that RetinaNet uses, there should be two output tensors, and the second one should contain the count of NMSed boxes. In my case, the second output tensor always returns a tiny float value, like 1.2e-44. The first tensor, which contains the bbox information, is generated as described in the plugin repo, so I get correct predictions. Still, I’m wondering why I get those odd numbers in the second tensor (even though it doesn’t hurt my detection results)?
Here’s the section describing the NMS output in the NMS plugin repo:
After decoding, the decoded boxes will proceed to the non maximum suppression step, which performs the same action as batchedNMSPlugin. The only difference is that instead of generating four outputs:
nmsed box count (1 value)
nmsed box locations (4 values)
nmsed box scores (1 value)
nmsed box class IDs (1 value)
nmsPlugin generates an output of shape [batchSize, 1, keepTopK, 7] which contains the same information as the outputs nmsed box locations, nmsed box scores, and nmsed box class IDs from batchedNMSPlugin, and an another output of shape [batchSize, 1, 1, 1] which contains the same information as the output nmsed box count from batchedNMSPlugin.
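One thing that might be worth checking (this is only a guess, not something confirmed in this thread or in the plugin docs quoted above): a value as small as 1.2e-44 is a denormal float32, which is what a small integer looks like when its 4 bytes are read back as a float. If the plugin writes the box count as an int32 but the output buffer is read as float32, a count of 9 would show up as roughly 1.26e-44. The sketch below uses only Python's standard `struct` module to show the effect. The helper names are hypothetical and chosen for illustration.

```python
import struct


def int32_bits_as_float32(count: int) -> float:
    """Read the 4 bytes of a signed 32-bit integer as if they were a float32."""
    return struct.unpack("<f", struct.pack("<i", count))[0]


def float32_bits_as_int32(value: float) -> int:
    """Read the 4 bytes of a float32 as if they were a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<f", value))[0]


if __name__ == "__main__":
    # Hypothetical box counts; the real count depends on the image and keepTopK.
    for count in (1, 9, 100):
        misread = int32_bits_as_float32(count)
        recovered = float32_bits_as_int32(misread)
        print(f"count={count} read as float32={misread!r} recovered={recovered}")
```

If the second output is held in a NumPy float32 array, reinterpreting it with `array.view(numpy.int32)` performs the same byte-level reinterpretation and would be a quick way to test this assumption.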
Glad to hear that you can get good results.
Regarding “the second output tensor always returns a tiny float value, like 1.2e-44” that you mentioned: is it 100% reproducible? If yes, could you download the default RetinaNet model from the NVIDIA-AI-IOT/deepstream_tao_apps GitHub repo (sample apps that demonstrate how to deploy models trained with TAO on DeepStream) and test its TRT engine too? If the issue can be reproduced there, please attach the steps to reproduce. Thanks a lot.
cd deepstream_tlt_apps/
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip
unzip models.zip
rm models.zip
In my case, it reproduces 100% of the time. I’ll try with the default model and get back to you later.
Could you please share the inference script which you used?