Writing a layer for NonMaxSuppression in the ONNX parser

I am working on writing a layer in the ONNX parser for the NonMaxSuppression op. For this, I am adding a DEFINE_BUILTIN_OP_IMPORTER in builtin_op_importers.cpp in the onnx-tensorrt backend.
TensorRT has the BatchedNMS plugin for this op. However, the input and output parameters of the TensorRT plugin (https://github.com/NVIDIA/TensorRT/tree/master/plugin/batchedNMSPlugin#parameters)
and of the ONNX op (https://github.com/onnx/onnx/blob/master/docs/Operators.md#nonmaxsuppression) do not match.


Hi @roshanchaudhari,
Can you please clarify what your question is?

Hi @shayNV, thanks for responding. As I mentioned above, there is no existing plugin/layer that exactly matches the NonMaxSuppression op. So it seems the only option is modifying the BatchedNMS_TRT plugin to return the indices of the boxes so that it matches the ONNX output (https://github.com/onnx/onnx/blob/master/docs/Operators.md#nonmaxsuppression)? Or is there any other way?
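For reference, the mismatch is that the ONNX op must produce a [num_selected, 3] tensor of (batch_index, class_index, box_index) triples, while BatchedNMS_TRT emits detection counts, boxes, scores, and classes. A minimal single-batch, single-class sketch of the ONNX semantics (plain C++, not TensorRT code, assuming the default corner-format boxes of center_point_box = 0):

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Boxes in [y1, x1, y2, x2] corner format (ONNX default, center_point_box = 0).
struct Box { float y1, x1, y2, x2; };

static float iou(const Box& a, const Box& b)
{
    const float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    const float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    const float inter = std::max(0.f, ix2 - ix1) * std::max(0.f, iy2 - iy1);
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Single-batch, single-class NMS. Returns (batch_index, class_index, box_index)
// triples, which is the selected_indices layout the ONNX op produces.
std::vector<std::array<int64_t, 3>> nonMaxSuppression(
    const std::vector<Box>& boxes, const std::vector<float>& scores,
    float iouThreshold, float scoreThreshold)
{
    // Visit boxes in descending score order.
    std::vector<int> order(boxes.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = static_cast<int>(i);
    std::sort(order.begin(), order.end(),
        [&](int a, int b) { return scores[a] > scores[b]; });

    std::vector<std::array<int64_t, 3>> selected;
    std::vector<int> kept;
    for (int idx : order)
    {
        if (scores[idx] <= scoreThreshold) continue;
        bool suppressed = false;
        for (int k : kept)
            if (iou(boxes[idx], boxes[k]) > iouThreshold) { suppressed = true; break; }
        if (!suppressed)
        {
            kept.push_back(idx);
            selected.push_back({0, 0, idx}); // batch 0, class 0
        }
    }
    return selected;
}
```

Any plugin-based implementation has to produce this index layout, not the decoded boxes BatchedNMS_TRT returns.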

So I assumed there is no other way to do this and tried writing a layer in the onnx-tensorrt backend. In builtin_op_importers.cpp I wrote an importer:

DEFINE_BUILTIN_OP_IMPORTER(NonMaxSuppression)
{
    // NonMaxSuppression is not supported opset below 10.
    ASSERT(ctx->getOpsetVersion() >= 10, ErrorCode::kUNSUPPORTED_NODE);

    nvinfer1::ITensor* boxes_tensor = &convertToTensor(inputs.at(0), ctx);
    nvinfer1::ITensor* scores_tensor = &convertToTensor(inputs.at(1), ctx);
    const int numInputs = inputs.size();
    LOG_ERROR("no of inputs are " << numInputs);
    LOG_ERROR("node outsize and op type are " << node.output().size() << " type " << node.op_type());

    const auto scores_dims = scores_tensor->getDimensions();
    const auto boxes_dims = boxes_tensor->getDimensions();
    LOG_ERROR("boxes dims " << boxes_dims.nbDims << " dim3 has size " << boxes_dims.d[2]);

    const std::string pluginName = "BatchedNMS_TRT";
    const std::string pluginVersion = "1";
    std::vector<nvinfer1::PluginField> f;

    bool share_location = true;
    const bool is_normalized = true;
    const bool clip_boxes = true;
    int backgroundLabelId = 0;
    // Initialize.
    f.emplace_back("shareLocation", &share_location, nvinfer1::PluginFieldType::kINT8, 1);
    f.emplace_back("isNormalized", &is_normalized, nvinfer1::PluginFieldType::kINT8, 1);
    f.emplace_back("clipBoxes", &clip_boxes, nvinfer1::PluginFieldType::kINT8, 1);
    f.emplace_back("backgroundLabelId", &backgroundLabelId, nvinfer1::PluginFieldType::kINT32, 1);

    // Create plugin from registry
    nvinfer1::IPluginV2* plugin = importPluginFromRegistry(ctx, pluginName, pluginVersion, node.name(), f);

    ASSERT(plugin != nullptr && "NonMaxSuppression plugin was not found in the plugin registry!",
        ErrorCode::kUNSUPPORTED_NODE);

    std::vector<nvinfer1::ITensor*> nms_inputs = {boxes_tensor, scores_tensor};
    RETURN_FIRST_OUTPUT(ctx->network()->addPluginV2(nms_inputs.data(), nms_inputs.size(), *plugin));
}

However, when I try to run the above code, it crashes at:

nvinfer1::plugin::BatchedNMSPlugin::getOutputDimensions(), where it fails on ASSERT(inputs[0].nbDims == 3). However, in my DEFINE_BUILTIN_OP_IMPORTER(NonMaxSuppression) function above, it prints inputs[0].nbDims = 3. Why does the assertion fail in getOutputDimensions()?

I tried to trace it but the call is coming from libinfer runtime library:

#0  nvinfer1::plugin::BatchedNMSPlugin::getOutputDimensions (this=0x5555654b44c0, index=2, inputs=0x5555654b4800, nbInputDims=2)
at trt_src/TensorRT/TensorRT/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp:70
#1  0x00007fffe9e735fd in nvinfer1::PluginV2Layer::getOutputForm(int, std::vector<nvinfer1::TensorForm, std::allocator<nvinfer1::TensorForm> > const&) const ()
   from /usr/lib/x86_64-linux-gnu/libnvinfer.so.7
#2  0x00007fffe9f277ef in nvinfer1::Network::updateTensor(nvinfer1::NetworkTensor const*) const () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.7
#3  0x00007fffe9f27d0a in nvinfer1::NetworkTensor::getDimensions() const () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.7
#4  0x00005555555a3bd2 in onnx2trt::TensorOrWeights::shape (this=0x5555654b22c0) at /onnx-tensorrt/TensorOrWeights.hpp:96
#5  onnx2trt::parseGraph (ctx=ctx@entry=0x55555628e110, graph=..., deserializingINetwork=<optimized out>, currentNode=currentNode@entry=0x55555628e3c0)
at ModelImporter.cpp:187
#6  0x00005555555a6a7f in onnx2trt::ModelImporter::importModel (this=0x55555628e0d0, model=..., weight_count=<optimized out>, weight_descriptors=<optimized out>)
at ModelImporter.cpp:521

Upon further debugging I found that the dimensions are not correct inside getOutputDimensions(). If I create the plugin with wildcard dimensions and then check the values in getOutputDimensions(), they are correct. However, my input tensor does not have any wildcard dimensions.
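One likely explanation, which is my own reading and not confirmed anywhere in this thread: BatchedNMS_TRT is an IPluginV2Ext-style plugin, so TensorRT hands getOutputDimensions() each input's dimensions without the leading batch dimension, and the plugin's documented boxes layout is 4-D at network level ([batch, num_boxes, num_classes or 1, 4]). The importer prints the full network dims (nbDims = 3 for the ONNX [batch, num_boxes, 4] layout), but inside the plugin the same tensor appears with nbDims = 2, which fails ASSERT(inputs[0].nbDims == 3). A sketch of that bookkeeping with a stand-in Dims type (not the real nvinfer1 API):

```cpp
#include <vector>

// Stand-in for nvinfer1::Dims, purely for illustration.
struct Dims
{
    std::vector<int> d;
    int nbDims() const { return static_cast<int>(d.size()); }
};

// Under implicit-batch semantics, a non-dynamic plugin is handed each input's
// dimensions WITHOUT the leading batch dimension.
Dims dimsSeenByPlugin(const Dims& networkDims)
{
    return Dims{std::vector<int>(networkDims.d.begin() + 1, networkDims.d.end())};
}
```

If this reading is right, reshaping the ONNX boxes to [batch, num_boxes, 1, 4] before feeding the plugin would make the in-plugin check pass.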

Hi @roshanchaudhari,
Can you clarify your comment: did you solve the issue you were facing, or do you have another question?

thanks

Yeah, I did find a workaround. Let's close this one.

However, I still have a few questions about TensorRT:

  1. If I am writing a new layer plugin deriving from IPluginV2DynamicExt, is it necessary to define the extent of each dimension of the output?
     For example, if I am writing a plugin for some op and cannot define a dimension even in terms of the input dims, is it okay to use the constant -1 for that dimension?
  2. How is the output type decided when writing a plugin, and when using an existing layer?

Dear @roshanchaudhari
I cannot define dimension even in terms of input dims, is that okay to use -1 constant for dimension ?

Does that mean you want to add -1 to plugin’s input buffers dims?

How is output type decided if writing a plugin and using existing layer?

You mean how to write getOutputDataType() for your plugin? What do you mean by using an existing layer? Can you elaborate?

#1. Please see the snippet below.

DimsExprs MyPlugin::getOutputDimensions(
    int outputIndex, const nvinfer1::DimsExprs* inputs, int nbInputs, nvinfer1::IExprBuilder& exprBuilder)
{
    nvinfer1::DimsExprs output;
    output.nbDims = 2;
    output.d[0] = exprBuilder.constant(inputs[0].nbDims);

    // Is this allowed, if I cannot define the extent of any of the output dims?
    output.d[1] = exprBuilder.constant(-1);
    return output;
}
#2. Solved.

Dear @roshanchaudhari,
Is the issue solved?

No. I am still waiting for the answer.

Dear @roshanchaudhari,
It is not allowed. The output dimensions must be computable from the input dimensions and parameters within the layer. Dimensions that depend on tensor data at runtime are not allowed.

May I know why you are looking at this use case?

As a use case: if we want to write a plugin for the NonZero op, which returns the indices of the non-zero elements of a tensor, we cannot define the output dimensions.

Dear @roshanchaudhari,
TRT has a two-phase execution model. The first phase runs on the CPU and computes the values of shape tensors. The second phase is streamed on the GPU and computes the execution tensors. Information can only flow from phase 1 to phase 2.

One workaround is to store the buffer size in buffer[0], or to have two outputs, one indicating the element count and the other storing the non-zero indices. The problem is that none of TRT's dimensioning infrastructure can read that count back from the GPU. So you would have to end the engine after this operation, or implement all subsequent layers as plugin layers that also keep their dimensions on the GPU. Generally this won't be the case for a DNN. Please file a bug for this use case; our engineering team will prioritize it.

Hi everyone, I've been following this thread because I have the same issue: I basically need to register a NonMaxSuppression operation in onnx-tensorrt. So, as advised by @roshanchaudhari, I wrote the following in builtin_op_importers.cpp:

DEFINE_BUILTIN_OP_IMPORTER(NonMaxSuppression)
{
        // NonMaxSuppression is not supported opset below 10.
        ASSERT(ctx->getOpsetVersion() >= 10, ErrorCode::kUNSUPPORTED_NODE);

        nvinfer1::ITensor* boxes_tensor = &convertToTensor(inputs.at(0), ctx);
        nvinfer1::ITensor* scores_tensor = &convertToTensor(inputs.at(1), ctx);
        const int numInputs = inputs.size();
        LOG_ERROR("no of inputs are "<<numInputs);
        LOG_ERROR("node outsize and op type are "<<node.output().size()<< " type " << node.op_type());

        const auto scores_dims = scores_tensor->getDimensions();
        const auto boxes_dims = boxes_tensor->getDimensions();
        LOG_ERROR("boxes dims "<< boxes_dims.nbDims << " dim3 has size "<<boxes_dims.d[2]);
        const std::string pluginName = "BatchedNMS_TRT";
        const std::string pluginVersion = "1";
        std::vector<nvinfer1::PluginField> f;

        bool share_location = true;
        const bool is_normalized = true;
        const bool clip_boxes = true;
        int backgroundLabelId = 0;
        // Initialize.
        f.emplace_back("shareLocation", &share_location, nvinfer1::PluginFieldType::kINT8, 1);
        f.emplace_back("isNormalized", &is_normalized, nvinfer1::PluginFieldType::kINT8, 1);
        f.emplace_back("clipBoxes", &clip_boxes, nvinfer1::PluginFieldType::kINT8, 1);
        f.emplace_back("backgroundLabelId", &backgroundLabelId, nvinfer1::PluginFieldType::kINT32, 1);
        // Create plugin from registry
        // nvinfer1::IPluginV2* plugin = importPluginFromRegistry(ctx, pluginName, pluginVersion, node.name(), f);
        nvinfer1::IPluginV2* plugin = createPlugin(node.name(), importPluginCreator(pluginName, pluginVersion), f);

        ASSERT(plugin != nullptr && "NonMaxSuppression plugin was not found in the plugin registry!",
                   ErrorCode::kUNSUPPORTED_NODE);

        std::vector<nvinfer1::ITensor*> nms_inputs ={boxes_tensor, scores_tensor};
        RETURN_FIRST_OUTPUT(ctx->network()->addPluginV2(nms_inputs.data(), nms_inputs.size(), *plugin));
}

However, I got the following error:

[2020-10-27 16:19:27   ERROR] /opt/onnx-tensorrt/builtin_op_importers.cpp:122: no of inputs are 5
[2020-10-27 16:19:27   ERROR] /opt/onnx-tensorrt/builtin_op_importers.cpp:123: node outsize and op type are 1 type NonMaxSuppression
[2020-10-27 16:19:27   ERROR] /opt/onnx-tensorrt/builtin_op_importers.cpp:127: boxes dims 3 dim3 has size 4
#assertion/home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp,77
Aborted (core dumped)

Does anyone have solid advice for solving this issue?