Hi Nvidia,
Version info: TensorRT 7.1.0, Xavier AGX.
issue:
I’m trying to convert an ONNX model into an FP32 engine using trtexec, but the inference result of the generated engine is very different from the original model’s. In addition, when we generate engines repeatedly with exactly the same code and model, we get a different result every time, and the differences look too large to be normal floating-point precision variation.
Here is a partial result from the PyTorch model, with all inputs set to 0:
-7.89227366e-01
-6.38025475e+00
-5.79495907e+00
-5.96910524e+00
-5.54781723e+00
-4.90355968e+00
-4.21461630e+00
-3.69260979e+00
-3.38714862e+00
-3.21675205e+00
And here are the inference results of the generated engines: two different engines built with exactly the same code, plugin, and model.
So you can see the difference between the PyTorch result, the first engine’s inference, and the second generated engine’s inference.
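For reference, this is roughly how we run a saved engine with all-zero inputs and dump/compare the first output values. It is only a minimal sketch, not our full pipeline: it assumes static input shapes, a pycuda-managed CUDA context, and a second engine file named trt_mono_fp32_2.engine produced by rerunning the same trtexec command with a different --saveEngine name.

import ctypes
import numpy as np
import pycuda.autoinit  # creates a default CUDA context
import pycuda.driver as cuda
import tensorrt as trt

ctypes.CDLL("./libTest_TensorRT.so")     # load the DCNv2 plugin library first
logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")

def run_zero_input(engine_path):
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()
    bindings, outputs, keep_alive = [], [], []
    for i in range(engine.num_bindings):
        shape = engine.get_binding_shape(i)               # assumes static shapes
        dtype = trt.nptype(engine.get_binding_dtype(i))
        host = np.zeros(trt.volume(shape), dtype=dtype)   # all-zero host buffer
        dev = cuda.mem_alloc(host.nbytes)
        cuda.memcpy_htod(dev, host)                       # inputs are all zeros
        keep_alive.append(dev)                            # keep allocations alive
        bindings.append(int(dev))
        if not engine.binding_is_input(i):
            outputs.append((host, dev))
    context.execute_v2(bindings)
    for host, dev in outputs:
        cuda.memcpy_dtoh(host, dev)
    return [host for host, _ in outputs]

out_a = run_zero_input("trt_mono_fp32.engine")
out_b = run_zero_input("trt_mono_fp32_2.engine")   # second build from the same ONNX
print(out_a[0][:10])                               # compare against the PyTorch values above
print(np.max(np.abs(out_a[0] - out_b[0])))         # build-to-build difference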
How to reproduce the issue?
We use a custom plugin called DCNv2 (deformable convolution), so first build the plugin .so with cmake & make from the code I attached below. After compiling, you will get libTest_Tensorrt.so.
Then you can run trtexec:
./trtexec --onnx=monoflex.onnx --plugins=libTest_TensorRT.so --workspace=3000 --saveEngine=trt_mono_fp32.engine
P.S. Some versions of TensorRT seem to always look up the plugin version as “1” instead of “001”; in that case you can change line 40 of DCNv2_nvi.cpp from:
static const char* DCNV2_VERSION{"001"};
to
static const char* DCNV2_VERSION{"1"};
So, could anyone please explain why this happens and how we can make sure our engine produces the correct result in this case?
monoflex onnx: monoflex.onnx - Google Drive
plugin_staff.zip (265.8 KB)