Description
I am using a TFLite file of the SSDLite MobileNet v2 object detection model. The steps I followed:
- Converted the TFLite file to ONNX with tf2onnx, as shown below:
python -m tf2onnx.convert --opset 11 --tflite model.tflite --output tflite_model.onnx
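The tensor names used in the script in the next step ("concat", "convert_scores", and the four "TFLite_Detection_PostProcess" outputs) can be confirmed by inspecting the converted model, for example with a small onnx-graphsurgeon snippet like the one below (just a sketch; a viewer such as Netron works equally well):

import onnx
import onnx_graphsurgeon as gs

# List the graph inputs and outputs of the converted model so the
# tensor names rewired in the next step can be verified.
graph = gs.import_onnx(onnx.load("tflite_model.onnx"))
print("inputs :", [t.name for t in graph.inputs])
print("outputs:", [t.name for t in graph.outputs])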
- Replaced the NMS layer with the BatchedNMSDynamic_TRT plugin using the following onnx-graphsurgeon script:
import onnx_graphsurgeon as gs
import onnx
import numpy as np
input_model_path = "tflite_model.onnx"
output_model_path = "nms_plugin.onnx"
@gs.Graph.register()
def trt_batched_nms(self, boxes_input, scores_input, nms_output,
                    share_location, num_classes):
    # Disconnect the box/score tensors from their original consumers
    boxes_input.outputs.clear()
    scores_input.outputs.clear()
    attrs = {
        "shareLocation": share_location,
        "numClasses": num_classes,
        "backgroundLabelId": 0,
        "topK": 100,
        "keepTopK": 100,
        "scoreThreshold": 0.3,
        "iouThreshold": 0.6,
        "isNormalized": True,
        "clipBoxes": True,
        "scoreBits": 16
    }
    return self.layer(op="BatchedNMSDynamic_TRT", attrs=attrs,
                      inputs=[boxes_input, scores_input],
                      outputs=nms_output)
# Load the graph
graph = gs.import_onnx(onnx.load(input_model_path))
graph.inputs[0].shape = [1, 300, 300, 3]
print(graph.inputs[0].shape)

tmap = graph.tensors()
outArray = ["TFLite_Detection_PostProcess", "TFLite_Detection_PostProcess:1",
            "TFLite_Detection_PostProcess:2", "TFLite_Detection_PostProcess:3"]

# Disconnect the original TFLite post-processing outputs from their producers
for i in range(len(outArray)):
    nms_out_test = tmap[outArray[i]]
    nms_out_test.inputs.clear()

# Reuse the same tensors as outputs of the new NMS plugin node
nms_out = []
for i in range(len(outArray)):
    nms_out.append(tmap[outArray[i]])
# Can also get attributes from the original graph instead of hard-coding
graph.trt_batched_nms(tmap["concat"], tmap["convert_scores"],
                      nms_out, share_location=False,
                      num_classes=90)
# num_detections (the first plugin output) is int32
graph.outputs[0].dtype = np.int32
# clean the graph
graph.cleanup().toposort()
# save the onnx model
onnx.save_model(gs.export_onnx(graph), output_model_path)
print("Saving the ONNX model to {}".format(output_model_path))
- Converted the ONNX model to a TensorRT engine with trtexec:
trtexec --onnx=nms_plugin.onnx --saveEngine=TRT_Engine.trt --explicitBatch --verbose
Output (the engine build fails with an assertion):
[08/05/2021-17:31:20] [V] [TRT] Tactic: 1002 time 0.121252
[08/05/2021-17:31:20] [V] [TRT] Tactic: 0 time 0.011772
[08/05/2021-17:31:20] [V] [TRT] Fastest Tactic: 0 Time: 0.011772
[08/05/2021-17:31:20] [V] [TRT] *************** Autotuning format combination: Float(1,4,4,7668), Float(1,91,174447) -> Int32(1,1), Float(1,4,400), Float(1,100), Float(1,100) ***************
[08/05/2021-17:31:20] [V] [TRT] Formats and tactics selection completed in 102.21 seconds.
[08/05/2021-17:31:20] [V] [TRT] After reformat layers: 109 layers
[08/05/2021-17:31:20] [V] [TRT] Block size 16777216
[08/05/2021-17:31:20] [V] [TRT] Block size 8640000
[08/05/2021-17:31:20] [V] [TRT] Block size 3240448
[08/05/2021-17:31:20] [V] [TRT] Block size 1440256
[08/05/2021-17:31:20] [V] [TRT] Block size 394240
[08/05/2021-17:31:20] [V] [TRT] Block size 218624
[08/05/2021-17:31:20] [V] [TRT] Block size 51200
[08/05/2021-17:31:20] [V] [TRT] Block size 19968
[08/05/2021-17:31:20] [V] [TRT] Block size 17408
[08/05/2021-17:31:20] [V] [TRT] Block size 9728
[08/05/2021-17:31:20] [V] [TRT] Block size 9216
[08/05/2021-17:31:20] [V] [TRT] Block size 2560
[08/05/2021-17:31:20] [V] [TRT] Block size 2560
[08/05/2021-17:31:20] [V] [TRT] Block size 1024
[08/05/2021-17:31:20] [V] [TRT] Block size 512
[08/05/2021-17:31:20] [V] [TRT] Total Activation Memory: 30824960
[08/05/2021-17:31:20] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[08/05/2021-17:31:20] [F] [TRT] Assertion failed: in[0].desc.dims.d[2] == numLocClasses
/home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp:321
Aborting...
#####################################
Observations:
The box output of the TFLite graph is actually the 1917 anchor boxes, not the raw bounding boxes accepted by the BatchedNMSDynamic_TRT plugin (TensorRT/plugin/batchedNMSPlugin at master · NVIDIA/TensorRT · GitHub).
So which plugin node needs to be added, and how, to achieve this functionality?
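For what it is worth, the failing assertion in[0].desc.dims.d[2] == numLocClasses concerns the boxes layout: with shareLocation=False the plugin expects boxes shaped [batch, num_boxes, num_classes, 4], while with shareLocation=True it expects [batch, num_boxes, 1, 4], which appears to match the boxes input in the log above (1917 boxes with a single shared location each). A minimal sketch of that one change, reusing the function registered in the script above; note that this only satisfies the plugin's shape check and does not answer the box-decoding question raised here:

# Hypothetical variation of the call in the script above: with boxes laid
# out as [batch, num_boxes, 1, 4], shareLocation must be True for the
# plugin's shape check (d[2] == numLocClasses) to pass.
graph.trt_batched_nms(tmap["concat"], tmap["convert_scores"],
                      nms_out, share_location=True,
                      num_classes=90)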
Environment
TensorRT Version: 7.2.4
CUDA Version: 10
Operating System + Version: Ubuntu 20
Relevant Files
Issue Details: How to add NMS with Tensorflow Model (that was converted to ONNX) · Issue #1379 · NVIDIA/TensorRT · GitHub
Attachments: tflite_model.onnx (17.2 MB), nms_plugin.onnx (17.1 MB), model.tflite (17.1 MB)