Different engines give different inference results for the same ONNX model and the same input

Description

My ONNX model contains some custom operators, and I implemented TensorRT plugins for them. When I parse this ONNX model with TensorRT and build the engine several times, the resulting engines differ: for the same input, some engines give correct inference results while others give incorrect results.
The ONNX file was exported from PyTorch.
In this case, which aspect of the problem should I look into first? Thanks a lot.

Environment

TensorRT Version: 8.2 GA
GPU Type: RTX 3060
Nvidia Driver Version: 31.0.15.1702
CUDA Version: 11.4
CUDNN Version: 11.4
Operating System + Version: Windows 11
PyTorch Version (if applicable): 1.9.0+cu111

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below.

check_model.py

import onnx

filename = "your_model.onnx"  # path to your ONNX model
model = onnx.load(filename)
# raises an exception if the model does not conform to the ONNX spec
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
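For reference, if you build the engine from Python rather than with trtexec, a minimal sketch along the following lines (the model path and workspace size are placeholders) captures the same kind of verbose parser and builder output:

import tensorrt as trt

# verbose logger so parser and builder messages are printed
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
# register the standard TensorRT plugins; custom plugins must also be
# registered (e.g. via REGISTER_TENSORRT_PLUGIN in the plugin library)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB, TensorRT 8.x API
engine = builder.build_engine(network, config)

Any parser errors on the custom-operator nodes will show up in the printed messages.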
Thanks!


I have an ONNX model with a total of three operators, of which operator 1 and operator 2 are my custom plugins. When the model is parsed in TensorRT, the values of output1 and output3 are both correct, but output2 is incorrect. However, when I copy the output data back from device memory with cudaMemcpy inside enqueue() of operator 2, the printed values are correct.
How can I debug this problem? Thanks.

My ONNX model contains custom operators. When I run onnx.checker.check_model(model), it reports an error saying the custom operator is not registered. But when I use print(model) to inspect the graph, the network is correct.
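As a side note, a small sketch like the following (the model path is a placeholder) at least lists the op types that have no registered ONNX schema, which makes it easy to confirm the checker is only complaining about the custom operators:

import onnx
import onnx.defs

model = onnx.load("model.onnx")  # placeholder path
for node in model.graph.node:
    # ops without a registered ONNX schema are the ones the checker rejects
    # and that TensorRT has to map to custom plugins
    if not onnx.defs.has(node.op_type):
        print("custom op:", node.op_type, "domain:", node.domain or "ai.onnx")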

Hi @wang.xiangru ,
Apologies for the delay.
There are many ONNX plugins under $TRT_SOURCE/parsers/onnx/, such as Split.hpp, ResizeNearest.hpp, etc.
They are registered with the system automatically through REGISTER_TENSORRT_PLUGIN, so that your ONNX model can be parsed directly at runtime.
You may try that approach.
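To verify that a plugin (built-in or your own) has actually been registered, a minimal sketch along these lines can dump the creators currently in the plugin registry:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
# loads the standard TensorRT plugin library and triggers its
# REGISTER_TENSORRT_PLUGIN registrations; your own plugin library must be
# loaded into the process as well for its creators to appear here
trt.init_libnvinfer_plugins(logger, "")

registry = trt.get_plugin_registry()
for creator in registry.plugin_creator_list:
    print(creator.name, creator.plugin_version, creator.plugin_namespace)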
Thanks