Embed EfficientNMS_TRT in Detectnet_v2?

Hello, I was wondering if there is a way to include an EfficientNMS_TRT node in a Detectnet_v2 model.

Thanks in advance,

A.R.

Do you mean adding the EfficientNMS plugin to a detectnet_v2 model?

The detectnet_v2 network does not need the EfficientNMS plugin. You can refer to the ONNX file shared on NGC: PeopleNet | NVIDIA NGC (PeopleNet is based on detectnet_v2).

Thanks for your answer. I have read the Overview page and it says:

The raw normalized bounding-box and confidence detections needs to be post-processed by a clustering algorithm such as DBSCAN or NMS to produce final bounding-box coordinates and category labels.

I also ran inference on the same image without NMS and got this:

[Info] ClassIdx : 0 BBox : 217.083,92.2635,274.658,244.922,0.710876
[Info] ClassIdx : 0 BBox : 437.166,94.5434,481.46,241.757,0.682586
[Info] ClassIdx : 0 BBox : 935.308,117.561,959,189.226,0.698606
[Info] ClassIdx : 0 BBox : 216.717,92.9694,274.271,245.104,0.972943
[Info] ClassIdx : 0 BBox : 284.686,100.842,336.643,242.641,0.962959
[Info] ClassIdx : 0 BBox : 437.209,94.7442,481.274,242.46,0.927103
[Info] ClassIdx : 0 BBox : 886.661,126.729,915.335,204.351,0.751277
[Info] ClassIdx : 0 BBox : 935.964,118.592,959,189.246,0.69253
[Info] ClassIdx : 0 BBox : 216.71,92.9623,274.199,245.414,0.979063
[Info] ClassIdx : 0 BBox : 284.467,101.73,336.37,242.749,0.984146
[Info] ClassIdx : 0 BBox : 437.538,95.3918,481.424,242.929,0.913765
[Info] ClassIdx : 0 BBox : 886.697,126.382,915.404,204.691,0.769886
[Info] ClassIdx : 0 BBox : 936.239,118.674,959,188.475,0.643953
[Info] ClassIdx : 0 BBox : 216.944,92.6469,274.08,245.129,0.971306
[Info] ClassIdx : 0 BBox : 284.965,101.808,336.538,242.232,0.980957
[Info] ClassIdx : 0 BBox : 437.916,95.0066,481.629,242.432,0.814889
[Info] ClassIdx : 0 BBox : 576.82,20.3457,695.445,466.057,0.69949
[Info] ClassIdx : 0 BBox : 709.294,26.8829,871.178,442.878,0.893714
[Info] ClassIdx : 0 BBox : 886.88,126.596,915.511,204.009,0.673661
[Info] ClassIdx : 0 BBox : 217.409,91.9538,274.233,245.056,0.867004
[Info] ClassIdx : 0 BBox : 285.033,101.376,336.64,242.366,0.922016
[Info] ClassIdx : 0 BBox : 335.203,69.9186,466.903,442.236,0.811863
[Info] ClassIdx : 0 BBox : 334.684,69.5818,467.329,442.252,0.871875
[Info] ClassIdx : 0 BBox : 576.74,20.412,695.889,460.088,0.950435
[Info] ClassIdx : 0 BBox : 576.007,20.0761,695.799,458.282,0.852363
[Info] ClassIdx : 0 BBox : 710.064,26.7809,871.013,442.158,0.853983
[Info] ClassIdx : 0 BBox : 709.167,26.9706,871.673,442.603,0.975539
[Info] ClassIdx : 0 BBox : 709.235,26.3547,872.555,442.348,0.810509
[Info] ClassIdx : 0 BBox : 335.595,69.6403,466.66,442.379,0.963616
[Info] ClassIdx : 0 BBox : 334.733,69.3518,467.173,442.026,0.979517
[Info] ClassIdx : 0 BBox : 577.167,23.2131,696.551,459.51,0.954419
[Info] ClassIdx : 0 BBox : 576.336,23.258,696.098,459.189,0.925915
[Info] ClassIdx : 0 BBox : 710.432,30.6075,871.02,440.251,0.916114
[Info] ClassIdx : 0 BBox : 709.744,30.4306,871.515,440.465,0.986125
[Info] ClassIdx : 0 BBox : 710.706,29.6628,872.202,441.676,0.939766
[Info] ClassIdx : 0 BBox : 335.973,68.4424,467.153,442.925,0.981306
[Info] ClassIdx : 0 BBox : 335.206,68.3646,466.832,442.848,0.989305
[Info] ClassIdx : 0 BBox : 577.497,21.8929,696.423,459.92,0.980338
[Info] ClassIdx : 0 BBox : 576.503,22.3794,696.378,460.048,0.947262
[Info] ClassIdx : 0 BBox : 710.743,30.1496,871.392,443.76,0.963037
[Info] ClassIdx : 0 BBox : 710.133,29.5173,871.616,443.914,0.99346
[Info] ClassIdx : 0 BBox : 710.922,28.5739,871.755,444.542,0.971339
[Info] ClassIdx : 0 BBox : 336.39,69.5195,467.066,441.93,0.975489
[Info] ClassIdx : 0 BBox : 335.221,69.7361,466.984,442.406,0.985309
[Info] ClassIdx : 0 BBox : 577.744,21.9864,695.73,460.786,0.965658
[Info] ClassIdx : 0 BBox : 577.288,22.8156,695.865,460.846,0.92344
[Info] ClassIdx : 0 BBox : 711.216,30.0447,871.36,442.671,0.942679
[Info] ClassIdx : 0 BBox : 710.489,29.9431,871.209,442.918,0.989985
[Info] ClassIdx : 0 BBox : 710.948,28.712,871.609,444.301,0.956846
[Info] ClassIdx : 0 BBox : 336.47,71.8915,466.686,441.471,0.975637
[Info] ClassIdx : 0 BBox : 335.112,71.971,466.704,441.605,0.981413
[Info] ClassIdx : 0 BBox : 577.997,21.74,695.25,460.731,0.970773
[Info] ClassIdx : 0 BBox : 577.05,22.7061,695.225,461.1,0.891574
[Info] ClassIdx : 0 BBox : 711.332,27.9592,871.115,444.329,0.907406
[Info] ClassIdx : 0 BBox : 710.772,28.1383,870.794,444.294,0.981254
[Info] ClassIdx : 0 BBox : 710.573,27.7406,871.625,445.26,0.934614
[Info] ClassIdx : 0 BBox : 336.148,71.3104,466.834,440.799,0.970667
[Info] ClassIdx : 0 BBox : 335.074,71.1953,466.976,441.567,0.983484
[Info] ClassIdx : 0 BBox : 577.469,21.7754,694.571,458.194,0.962961
[Info] ClassIdx : 0 BBox : 576.827,21.8949,694.697,459.62,0.824045
[Info] ClassIdx : 0 BBox : 711.448,26.2714,870.059,443.797,0.832528
[Info] ClassIdx : 0 BBox : 710.408,27.3515,870.27,444.083,0.960979
[Info] ClassIdx : 0 BBox : 709.828,26.688,870.845,444.415,0.822641
[Info] ClassIdx : 0 BBox : 335.997,69.7976,466.767,442.938,0.960353
[Info] ClassIdx : 0 BBox : 335.157,69.871,467.141,442.717,0.978603
[Info] ClassIdx : 0 BBox : 576.891,21.1273,694.17,458.704,0.949254
[Info] ClassIdx : 0 BBox : 576.485,22.8823,694.402,459.158,0.610501
[Info] ClassIdx : 0 BBox : 711.202,26.1608,869.419,446.592,0.602616
[Info] ClassIdx : 0 BBox : 710.528,26.1878,869.661,446.7,0.910665
[Info] ClassIdx : 0 BBox : 335.372,69.9096,467.44,442.535,0.907313
[Info] ClassIdx : 0 BBox : 334.894,70.1276,467.6,442.089,0.915124
[Info] ClassIdx : 0 BBox : 576.981,20.9806,693.461,460.821,0.660124
[Info] ClassIdx : 0 BBox : 710.677,27.3999,869.129,446.711,0.615867

This is why I was wondering if I could use efficientNMSPlugin.
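For reference, here is a minimal sketch of what greedy NMS would do to the duplicates in the log above. It is plain Python for illustration, not the EfficientNMS_TRT plugin or the TAO post-processor. It assumes each logged row is x1,y1,x2,y2,score in pixels, which the log itself does not state. The sample rows are copied from the output above, and the names `iou`, `nms`, and the 0.5 IoU threshold are illustrative choices.

```python
from typing import List, Tuple

# Assumed row layout (not confirmed by the log): x1, y1, x2, y2, score
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5,
        score_threshold: float = 0.0) -> List[Box]:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    candidates = sorted(
        (b for b in boxes if b[4] >= score_threshold),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for cand in candidates:
        # Keep a candidate only if it does not overlap any kept box too much.
        if all(iou(cand, k) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept


# A few rows copied from the log above (two near-duplicates per person).
detections: List[Box] = [
    (216.71, 92.9623, 274.199, 245.414, 0.979063),
    (217.083, 92.2635, 274.658, 244.922, 0.710876),
    (284.467, 101.73, 336.37, 242.749, 0.984146),
    (284.686, 100.842, 336.643, 242.641, 0.962959),
    (335.206, 68.3646, 466.832, 442.848, 0.989305),
    (335.973, 68.4424, 467.153, 442.925, 0.981306),
    (710.133, 29.5173, 871.616, 443.914, 0.99346),
    (709.167, 26.9706, 871.673, 442.603, 0.975539),
]

kept = nms(detections, iou_threshold=0.5)
for box in kept:
    print(box)
```

With these eight sample rows the near-duplicate pairs collapse, leaving four boxes ordered by score. The loop is quadratic in the number of candidates, which is fine for a few dozen boxes but is the part a GPU plugin would replace.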

You can run inference with a spec file like tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2/specs/detectnet_v2_inference_kitti_tlt.txt in the NVIDIA/tao_tutorials repository on GitHub.
The user guide (DetectNet_v2 - NVIDIA Docs) describes the nms_confidence_threshold and dbscan_confidence_threshold parameters.
You can also reuse the code in tao_tensorflow1_backend/nvidia_tao_tf1/cv/detectnet_v2/postprocessor/utilities.py (commit c7a3926ddddf3911842e057620bceb45bb5303cc) in the NVIDIA/tao_tensorflow1_backend repository on GitHub.

That is done on the CPU. I wrote a CUDA post-processing kernel to feed OpenCV's NMSBoxes function, but for some reason it takes 8 ms to fetch the output and 6 ns to run the kernel on an AGX Orin.

I will investigate whether I can add some nodes to the ONNX model and include the NMS plugin there.