Need help creating custom NMSPlugin.cpp

Description

I am trying to optimize SSDMobilenetV2 for inference speed on the Jetson TX2 by pruning anchor boxes that are not used by the classes in my dataset. The default implementation of SSDMobilenetV2 from the TensorFlow object detection model zoo has 1917 anchor boxes; after pruning, 1885 remain. That is not a significant decrease, but I plan to do more optimizations later on, so this information might still be useful.
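
For reference, the default count of 1917 can be reproduced from the feature-map sizes this model uses ([19, 10, 5, 3, 2, 1], as printed in the ONNX conversion log further down). This is only a minimal sketch: the per-cell anchor counts (3 on the first map, 6 on the others) are an assumption based on the usual SSD anchor layout, not something stated in this post.

```python
# Feature-map sizes as printed in the ONNX conversion log in this thread.
FEATURE_MAP_SIZES = [19, 10, 5, 3, 2, 1]

# Assumption: 3 anchors per cell on the first map, 6 on every other map.
ANCHORS_PER_CELL = [3, 6, 6, 6, 6, 6]


def count_anchors(sizes, per_cell):
    """Total anchors = sum over maps of (height * width * anchors per cell)."""
    return sum(size * size * k for size, k in zip(sizes, per_cell))


total_anchors = count_anchors(FEATURE_MAP_SIZES, ANCHORS_PER_CELL)
print(total_anchors)      # 1917
print(total_anchors * 4)  # 7668 box coordinates
```

Under that assumption, the anchor count is fixed by the feature-map sizes and per-cell counts alone, so it would not change unless the anchor generator itself is reconfigured.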

I can successfully convert the default SSDMobilenetV2 model to a TensorRT binary with the plugins provided by TensorRT, but I cannot convert the custom model. The conversion aborts with an assertion error at line 246 of NMSPlugin.cpp, which checks the following:

ASSERT(numPriors * numLocClasses * nbBoxCoordinates == inputDims[param.inputOrder[0]].d[0]);

Printing the values of these variables for the different input orders, I get:

pruned:[021]   numPriors:0    numLocClasses:1 C1:2    C2:5655 C3:1
pruned:[012]   numPriors:0    numLocClasses:1 C1:2    C2:7540 C3:1
pruned:[102]   numPriors:0    numLocClasses:1 C1:7540 C2:2    C3:1
pruned:[120]   numPriors:1917 numLocClasses:1 C1:7540 C2:5655 C3:7668
pruned:[201]   numPriors:0    numLocClasses:1 C1:5655 C2:2    C3:1
pruned:[210]   numPriors:1917 numLocClasses:1 C1:5655 C2:7540 C3:7668

The correct value of numPriors should be 1885 (7540/4), but I get 1917 (7668/4), which is the old number of boxes. Why do I get the old number of anchor boxes here?
I do not understand where inputDims gets its values from. Is this configurable?
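
The printed values can be reproduced outside TensorRT with a small sketch. It rests on two assumptions that are not confirmed anywhere in this post: that numPriors is taken from the second dimension of the third ordered input divided by 4, and that the three plugin inputs have the shapes listed below. Both are consistent with every row of the table above and with the `concat_priorbox -> [2,7668,1]` line in the trtexec log. The names `PRUNED_INPUT_DIMS` and `nms_shape_check` are hypothetical.

```python
NB_BOX_COORDINATES = 4

# Assumption: input shapes inferred from the values printed in the table.
PRUNED_INPUT_DIMS = [
    (2, 7668, 1),  # input 0: concat_priorbox, as logged by the UFF parser
    (7540, 1, 1),  # input 1: box locations, 1885 * 4
    (5655, 1, 1),  # input 2: class confidences
]


def nms_shape_check(input_order, input_dims, share_location=True, num_classes=3):
    """Mimic the asserted check; num_classes=3 is a guess (5655 / 1885)."""
    c1 = input_dims[input_order[0]][0]
    c2 = input_dims[input_order[1]][0]
    c3 = input_dims[input_order[2]][1]
    num_priors = c3 // NB_BOX_COORDINATES
    num_loc_classes = 1 if share_location else num_classes
    passes = num_priors * num_loc_classes * NB_BOX_COORDINATES == c1
    return num_priors, num_loc_classes, c1, c2, c3, passes


# Input order [120] from the table: the check fails with the stale prior count.
print(nms_shape_check((1, 2, 0), PRUNED_INPUT_DIMS))
# (1917, 1, 7540, 5655, 7668, False)

# Same order, but with a prior-box tensor that matches the pruned model.
fixed_dims = [(2, 7540, 1)] + PRUNED_INPUT_DIMS[1:]
print(nms_shape_check((1, 2, 0), fixed_dims))
# (1885, 1, 7540, 5655, 7540, True)
```

If these assumptions hold, the old value of 1917 comes from the prior-box tensor that feeds the plugin, not from the pruned box predictors.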

Environment

TensorRT Version: 7.1.0
GPU Type: PASCAL
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: L4T from JetPack 4.4 DP
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.15.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Steps To Reproduce

Log when using trtexec:

...
[08/09/2020-15:57:13] [V] [TRT] Plugin creator registration succeeded - ::Split
[08/09/2020-15:57:13] [V] [TRT] Plugin creator registration succeeded - ::SpecialSlice_TRT
[08/09/2020-15:57:13] [V] [TRT] Plugin creator registration succeeded - ::InstanceNormalization_TRT
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Parsing MultipleGridAnchorGenerator[Op: _GridAnchor_TRT].
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Parsing concat_priorbox[Op: Concat]. Inputs: MultipleGridAnchorGenerator
[08/09/2020-15:57:14] [V] [TRT] UFFParser: concat_priorbox -> [2,7668,1]
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Applying order forwarding to: concat_priorbox
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Parsing Input[Op: Input].
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Input -> [1,3,300,300]
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Applying order forwarding to: Input
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Parsing FeatureExtractor/MobilenetV2/Conv/weights[Op: Const].
[08/09/2020-15:57:14] [V] [TRT] UFFParser: FeatureExtractor/MobilenetV2/Conv/weights -> [3,3,3,32]
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Applying order forwarding to: FeatureExtractor/MobilenetV2/Conv/weights
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Parsing FeatureExtractor/MobilenetV2/Conv/Conv2D[Op: Conv]. Inputs: Input, FeatureExtractor/MobilenetV2/Conv/weights
[08/09/2020-15:57:14] [V] [TRT] UFFParser: Inserting transposes for FeatureExtractor/MobilenetV2/Conv/Conv2D
[08/09/2020-15:57:14] [E] [TRT] UffParser: Parser error: FeatureExtractor/MobilenetV2/Conv/Conv2D: Order size is not matching the number dimensions of TensorRT
[08/09/2020-15:57:14] [E] Failed to parse uff file
[08/09/2020-15:57:14] [E] Parsing model failed
[08/09/2020-15:57:14] [E] Engine creation failed
[08/09/2020-15:57:14] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --uff=tmp_v2_coco.uff --uffInput=image_tensor:0,3,300,300 --output=NMS:0 --fp16 --verbose --saveEngine=trt.bin

Log using build_engine.py:

python build_engine.py

[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_13/expand/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_13/expand/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_interior_nn_v1
[TensorRT] VERBOSE: BoxPredictor_0/ClassPredictor/Conv2D || BoxPredictor_0/BoxEncodingPredictor/Conv2D (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_13/project/Conv2D (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_14/expand/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_14/expand/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_14/project/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_14/add (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_15/expand/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_15/expand/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_15/project/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_15/add (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_16/expand/Conv2D + FeatureExtractor/MobilenetV2/expanded_conv_16/expand/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/expanded_conv_16/project/Conv2D (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_interior_nn_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/Conv_1/Conv2D + FeatureExtractor/MobilenetV2/Conv_1/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_interior_nn_v1
[TensorRT] VERBOSE: BoxPredictor_1/ClassPredictor/Conv2D || BoxPredictor_1/BoxEncodingPredictor/Conv2D (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TensorRT] VERBOSE: FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/Conv2D + FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/Relu6 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_interior_nn_v1
#assertionnmsPlugin.cpp,246
Aborted (core dumped)

Hi @rms45

This error indicates that there might be a mismatch between the training data and the deployment dimensions.
However, you may find help in this relevant post.

Thanks!

Hi @AakankshaS,
Thank you for the response, but I have already seen this post and I do not think it is relevant in my case. I am neither using a wrong output name such as MarkOutput nor using a batch dimension. Also, as I said earlier, I was able to build the unmodified version of SSD into a TensorRT binary without any problems. The error only appears when I change the number of anchor boxes, and I have not made any other changes to the model. So I think the problem lies elsewhere.

Hi @rms45,
Can you try the ONNX conversion instead of UFF?

Thanks!

Using ONNX results in the following error:
[TensorRT] VERBOSE: Plugin creator already registered - ::GridAnchor_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::NMS_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Reorg_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Region_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Clip_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::LReLU_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::PriorBox_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Normalize_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::RPROI_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::BatchedNMS_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::FlattenConcat_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::CropAndResize
[TensorRT] VERBOSE: Plugin creator already registered - ::DetectionLayer_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Proposal
[TensorRT] VERBOSE: Plugin creator already registered - ::ProposalLayer_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::PyramidROIAlign_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::ResizeNearest_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::Split
[TensorRT] VERBOSE: Plugin creator already registered - ::SpecialSlice_TRT
[TensorRT] VERBOSE: Plugin creator already registered - ::InstanceNormalization_TRT
WARNING:tensorflow:From onnx_from_tf.py:24: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

[19, 10, 5, 3, 2, 1]
WARNING:tensorflow:From /home/vanderlande/.venvs/optimizationofdlmodels/lib/python3.6/site-packages/graphsurgeon/node_manipulation.py:106: The name tf.NodeDef is deprecated. Please use tf.compat.v1.NodeDef instead.

WARNING: To create TensorRT plugin nodes, please use the create_plugin_node function instead.
Traceback (most recent call last):
  File "onnx_from_tf.py", line 36, in <module>
    g = tf.import_graph_def(g2.as_graph_def(), name='')
  File "/home/vanderlande/.venvs/optimizationofdlmodels/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/vanderlande/.venvs/optimizationofdlmodels/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
    producer_op_list=producer_op_list)
  File "/home/vanderlande/.venvs/optimizationofdlmodels/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 501, in _import_graph_def_internal
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'GridAnchor_TRT' in binary running on JetsonTx2. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

This is related to issue https://github.com/onnx/tensorflow-onnx/issues/768. It seems that if tf.contrib is used, the ONNX conversion does not work.

Hi @rms45,
Can you please share your ONNX model?

Thanks!

This error occurs before I can convert the model to ONNX. It seems UFF is the only option for now.

Hi @rms45,

The reason I suggested trying ONNX is that the UFF parser is deprecated from TRT 7 onwards, and we plan to remove support for it in a subsequent major release.
There is a suggestion in the post you shared that might be a possible solution.

Thanks!

Hi @AakankshaS,
Yes, but ONNX does not currently support all the ops for SSD, whereas UFF does, since I was able to use it to convert a generic model. I just wanted to know where the values of numPriors and numLocClasses are passed to the plugin from. Is there a way to visualize the TensorRT graph with the plugins so I can debug this?
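
One way to inspect what reaches the plugin, short of a full graph visualization, might be to list each layer's input tensors and shapes on the parsed network before the engine is built. The helper below is a hypothetical sketch (`list_layer_inputs` is not part of TensorRT). It relies only on attributes that the TensorRT Python API exposes on a network definition, its layers, and its tensors (`num_layers`, `get_layer`, `name`, `num_inputs`, `get_input`, `shape`), and it assumes the script has access to the network object after parsing.

```python
def list_layer_inputs(network, name_filter=""):
    """Return [(layer name, [(input tensor name, shape), ...]), ...].

    `network` is expected to behave like a TensorRT network definition.
    Only layers whose name contains `name_filter` are reported.
    """
    rows = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if name_filter not in layer.name:
            continue
        inputs = []
        for j in range(layer.num_inputs):
            tensor = layer.get_input(j)
            if tensor is not None:  # skip inputs that are not connected
                inputs.append((tensor.name, tuple(tensor.shape)))
        rows.append((layer.name, inputs))
    return rows


def print_layer_inputs(network, name_filter=""):
    """Print one line per matching layer with its input names and shapes."""
    for layer_name, inputs in list_layer_inputs(network, name_filter):
        print(layer_name, inputs)
```

Calling `print_layer_inputs(network, "NMS")` after the UFF parser has run, and before the engine build, should show the shapes the NMS layer receives, assuming the plugin layer's name contains "NMS" as the `--output=NMS` argument in the trtexec command suggests.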