Questions about a custom plugin?

Hi,
I have implemented an Einsum plugin (only for torch.einsum('nctkv,kvw->nctw')) and can successfully convert my PyTorch model to ONNX and then to TRT with it.
I set all elements of input[0] to 1, but the values received in enqueue() are not all 1. I have run the program several times; sometimes they are all 1, but sometimes only two of them are 1. I don't know what is wrong.
For example, I get one of the following two results:

right input[0]:
1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000     1.000
wrong input[0]:
1.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     1.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000     0.000

Can you please assist me in resolving this error?

Environment:
CUDA: 11.0
TensorRT: 7.1.3
cuDNN: 8.0.1
PyTorch: 1.6

Here is my code:

You can create a simple network to reproduce this problem as follows:

import numpy as np
import torch
import torch.nn as nn

class model(nn.Module):
    def __init__(self, in_channel=1):
        super().__init__()
        # constant second operand of the einsum
        self.A = np.ones(shape=(3, 15, 15))
        self.A = torch.tensor(self.A, dtype=torch.float32, requires_grad=False)

    def forward(self, x: torch.Tensor):
        x = x.permute(0, 2, 3, 1, 4).contiguous()
        x = torch.einsum('nctkv,kvw->nctw', x, self.A)
        return x

input_tensor = torch.ones(size=[1, 3, 1, 1, 15])
Model = model()
Model.cuda()
out = Model(input_tensor)

input_name = ['input']
output_name = ['output'] 
torch.onnx.export(Model,
                  input_tensor,
                  './gcn.onnx',
                  input_names=input_name, output_names=output_name,
                  verbose=True,
                  opset_version=12
                  )
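For reference, here is a small sketch (my own, using the same shapes as the script above) of what the Einsum plugin should receive and produce in this setup:

```python
import torch

x = torch.ones(1, 3, 1, 1, 15)                   # network input, all ones
A = torch.ones(3, 15, 15)                        # the constant A matrix

x_perm = x.permute(0, 2, 3, 1, 4).contiguous()   # -> shape (1, 1, 1, 3, 15)

# The plugin's input[0] is this permuted tensor: 1*1*1*3*15 = 45
# elements, all 1.0 -- the "right input[0]" shown above.
flat = x_perm.flatten()
print(flat.numel())                              # 45

# 'nctkv,kvw->nctw' contracts over k and v, so every output element
# is a sum of 3*15 = 45 ones.
out = torch.einsum('nctkv,kvw->nctw', x_perm, A)
print(out.shape)                                 # torch.Size([1, 1, 1, 15])
```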

Then convert gcn.onnx to TRT with this Einsum plugin, and you will reproduce the results above.

Hi,
Please share the ONNX model and the script if not already shared, so that we can assist you better.
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx
filename = yourONNXmodel
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

Hi @NVES,

Here is all my code.

You can reproduce my issue by following README.md.
Please let me know if you have any questions.
Thank you very much.

Hi @15033523736,

Could you please share the ONNX model here or via Google Drive for better assistance? We are facing some issues when we try to run the above repo.

Thank you.

Hi @spolisetty ,
thanks

Here is a visualisation of my ONNX model. It is very simple: only Transpose and Einsum layers, for testing the custom Einsum plugin.

It is already in my repo, so you can download it directly or generate it by running python generate_onnx.py. The link is: gcn.onnx

Hi @15033523736,

Sorry for not making it clear in my previous reply. Please share detailed (verbose) error logs and results for better assistance.

Thank you.

Hi @spolisetty ,

I implemented the Einsum plugin and was able to successfully convert the ONNX model to a TRT model. Inference also works fine and no errors are reported, so I have no error log to provide.

However, as the README.md in my repository says, the input I give to the Einsum plugin is an all-ones matrix (as in Figure 1), but what the plugin's enqueue function prints shows that the received matrix is not all ones (as in Figure 2), and I don't know how to troubleshoot this problem.

So I have put all my code in the repo so that you can reproduce my problem and understand what I am talking about. Which step went wrong: compiling the Einsum plugin, converting ONNX to TensorRT, running the sample code, or not getting the same results as in README.md?

Hi @15033523736,

At the step of compiling the Einsum plugin. Please allow us some time to work on this issue; meanwhile, we recommend you try the latest TensorRT 8.0 version and let us know if you still face this issue.

Thank you.

Thanks @spolisetty ,

I tried the latest TensorRT 8.0 version, but the problem still persists. I don't know what's wrong 😭

I have solved this problem by editing supportsFormatCombination to return true only when the input's format is nvinfer1::PluginFormat::kLINEAR. Here is my completed EinsumPlugin: GitHub - xn1997/TensorRT-EinsumPlugin.
But I don't know why this is happening. 😂
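One plausible explanation (this is my assumption, not something I have verified against TensorRT internals): if supportsFormatCombination also returns true for a vectorized format such as kCHW32, TensorRT is free to hand enqueue() a buffer where the channel dimension (here C = 1) is padded up to 32. Reading that buffer as if it were a dense linear layout would then produce exactly the "wrong input[0]" pattern above, with a 1.0 at offsets 0 and 32 and zeros in between. A small NumPy sketch of that misread:

```python
import numpy as np

C, SPATIAL = 1, 45                       # plugin input viewed as (C, spatial)
linear = np.ones((C, SPATIAL), dtype=np.float32)

# Hypothetical CHW32-style packing: one 32-wide channel vector per
# spatial position, channels C..31 filled with padding zeros.
packed = np.zeros((SPATIAL, 32), dtype=np.float32)
packed[:, :C] = linear.T

# Misreading the packed buffer as a dense linear one reproduces the
# observed pattern: ones at offsets 0 and 32, zeros elsewhere in the
# first 45 values.
misread = packed.ravel()[:SPATIAL]
print(misread[:5])
```

Restricting the plugin to kLINEAR forces TensorRT to deliver the dense row-major layout the kernel expects, which would explain why the fix works.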

Thank you @15033523736 for letting us know.
