Running BatchedNMSDynamic_TRT plugin in C++ gives all-zero output

Description

As the title suggests, I’m trying to run a detection engine with an attached BatchedNMSDynamic_TRT plugin from C++.
The model's outputs are always 0.

I tested the same engine file in Python and it worked flawlessly. Still, here is the code I used to generate the engine file.

  • Translating YOLOv5 output into the correct format mentioned here
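import torch
import torch.nn as nn


class ProcModel(nn.Module):
    """Wraps a YOLOv5 model so its output matches the two inputs of the NMS plugin."""

    def __init__(self, model, class_num):
        super().__init__()
        self.model = model
        self.num_classes = class_num

    def forward(self, x):
        # Expected layout: [batch, num_boxes, 5 + num_classes] = (cx, cy, w, h, obj, class scores...)
        out = self.model(x)[0]
        # Add a singleton axis: [batch, num_boxes, 1, 4]
        bbox_out = torch.unsqueeze(out[:,:,:4], 2)

        # Convert (cx, cy, w, h) to corner coordinates (x1, y1, x2, y2)
        x1 = bbox_out[:,:,:,0] - bbox_out[:,:,:,2] / 2
        y1 = bbox_out[:,:,:,1] - bbox_out[:,:,:,3] / 2
        x2 = bbox_out[:,:,:,0] + bbox_out[:,:,:,2] / 2
        y2 = bbox_out[:,:,:,1] + bbox_out[:,:,:,3] / 2

        bbox_out = torch.stack((x1, y1, x2, y2), dim=3)
        conf_out = out[:,:,4:5]
        # Scale the class scores by objectness: [batch, num_boxes, num_classes]
        class_out = out[:,:,5:] * conf_out

        return [bbox_out, class_out]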
  • Converting to ONNX:
    torch.onnx.export(
        model.cpu(), 
        im.cpu(),
        f,
        export_params=True,
        verbose=False,
        opset_version=opset,
        do_constant_folding=True, 
        input_names=['images'],
        output_names=output_names,
        dynamic_axes=dynamic)
  • Creating and attaching the BatchedNMSDynamic_TRT node to the ONNX graph:
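import numpy as np
import onnx
import onnx_graphsurgeon as gs


def create_attrs(input_h, input_w, topK, keepTopK):
    # input_h and input_w are accepted but not used by any attribute below.
    attrs = {}
    attrs["shareLocation"] = 1
    attrs["backgroundLabelId"] = -1
    attrs["numClasses"] = 80  # hard-coded; must match the model's class count
    attrs["topK"] = topK
    attrs["keepTopK"] = keepTopK
    attrs["scoreThreshold"] = 0.25
    attrs["iouThreshold"] = 0.6
    attrs["isNormalized"] = False
    attrs["clipBoxes"] = False
    attrs["plugin_version"] = "1"

    return attrs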


def add_nmsplugin_to_onnx(model_file, output_names=('output0-bbox', 'output0-class'), topk=200, keepTopK=100):
    graph = gs.import_onnx(onnx.load(model_file))  # load the ONNX model
    batch_size = graph.inputs[0].shape[0]
    input_h = graph.inputs[0].shape[2]
    input_w = graph.inputs[0].shape[3]
    tensors = graph.tensors()
    boxes_tensor = tensors[output_names[0]]  # must match the ONNX model's box output name
    confs_tensor = tensors[output_names[1]]  # must match the ONNX model's score output name
    # The four plugin outputs; num_detections is int32, the others are float32.
    num_detections = gs.Variable(name="num_detections", dtype=np.int32, shape=[batch_size, 1])
    nmsed_boxes = gs.Variable(name="nmsed_boxes", dtype=np.float32, shape=[batch_size, keepTopK, 4])
    nmsed_scores = gs.Variable(name="nmsed_scores", dtype=np.float32, shape=[batch_size, keepTopK])
    nmsed_classes = gs.Variable(name="nmsed_classes", dtype=np.float32, shape=[batch_size, keepTopK])
    new_outputs = [num_detections, nmsed_boxes, nmsed_scores, nmsed_classes]  # do not change the order
    nms_node = gs.Node(  # define the NMS plugin node
        op="BatchedNMSDynamic_TRT",  # must match the plugin name
        attrs=create_attrs(input_h, input_w, topk, keepTopK),  # set attributes for the NMS plugin
        inputs=[boxes_tensor, confs_tensor],
        outputs=new_outputs
    )
    graph.nodes.append(nms_node)  # add the NMS plugin node
    graph.outputs = new_outputs  # the plugin outputs replace the original graph outputs
    graph = graph.cleanup().toposort()

    onnx.save(gs.export_onnx(graph), model_file)  # overwrite the input file with the new graph
    return model_file
  • I have tested my C++ code with other models and it worked fine as well.
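
The box conversion in ProcModel.forward above can be checked by hand on a single prediction. This is only a minimal sketch in plain Python, assuming a YOLOv5-style row laid out as (cx, cy, w, h, objectness, class scores...); the helper names and the sample values are made up for illustration.

```python
def xywh_to_xyxy(box):
    """Convert one (cx, cy, w, h) box to (x1, y1, x2, y2) corners."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def split_prediction(row):
    """Split one row [cx, cy, w, h, obj, cls0, cls1, ...] into a corner
    box and per-class scores scaled by objectness."""
    box = xywh_to_xyxy(row[:4])
    objectness = row[4]
    scores = [c * objectness for c in row[5:]]
    return box, scores


# Hypothetical prediction: a 20x10 box centred at (50, 40), two classes.
box, scores = split_prediction([50.0, 40.0, 20.0, 10.0, 0.5, 0.5, 0.25])
print(box)     # (40.0, 35.0, 60.0, 45.0)
print(scores)  # [0.25, 0.125]
```

Comparing a few rows like this against the tensors fed to the plugin is a quick way to confirm the boxes reach it in corner format.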

So I see two possible causes:

  1. I didn’t initialize the plugin library correctly. Specifically, I call
initLibNvInferPlugins(&m_logger, "");

before deserializing the engine file.
  2. There is a bug in this TensorRT plugin.
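
When comparing the C++ and Python results, it may also help to write down what the four outputs declared in add_nmsplugin_to_onnx imply for the reader's buffers. The following is only a sketch in plain Python, assuming a fixed batch size of 1 and the default keepTopK of 100; the helper name nms_output_layout is hypothetical.

```python
from math import prod

# Element sizes in bytes for the dtypes used by the plugin outputs.
ITEMSIZE = {"int32": 4, "float32": 4}


def nms_output_layout(batch_size, keep_top_k):
    """Return (name, dtype, shape, nbytes) for each plugin output,
    in the order they appear in new_outputs."""
    specs = [
        ("num_detections", "int32", (batch_size, 1)),
        ("nmsed_boxes", "float32", (batch_size, keep_top_k, 4)),
        ("nmsed_scores", "float32", (batch_size, keep_top_k)),
        ("nmsed_classes", "float32", (batch_size, keep_top_k)),
    ]
    return [(n, d, s, prod(s) * ITEMSIZE[d]) for n, d, s in specs]


layout = nms_output_layout(batch_size=1, keep_top_k=100)
for name, dtype, shape, nbytes in layout:
    print(name, dtype, shape, nbytes)
```

Note that num_detections is int32 while the other three outputs are float32, so code that reads every output buffer as float would misinterpret it. Whether that applies here is only a guess; it is one more thing to rule out alongside the two points above.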

Environment

TensorRT Version: 8.4.3.1
Installed with TensorRT-8.4.3.1.Linux.x86_64-gnu.cuda-11.6.cudnn8.4.tar.gz
GPU Type: RTX 3090
Nvidia Driver Version: 470.239.06
CUDA Version: 11.4.4
CUDNN Version: 8.2.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.10.9
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/cuda:11.4.3-cudnn8-devel-ubuntu20.04

Relevant Files

My C++ code is similar to this: GitHub - cyrusbehr/tensorrt-cpp-api: TensorRT C++ API Tutorial

Hi @ttungnguyen2205,
Could you please share your model so we can debug this further?

Thanks
