TensorRT fails on second execution

Description

I’m trying to run an ONNX model using onnxruntime with the TensorRT execution provider. The issue shows up in onnxruntime, but I think the root cause is TensorRT. The nature of our problem requires dynamic output shapes, so I exported the model from PyTorch with the dynamic_axes option. When I run repeatedly with the same input it does not fail, so I suspect the dynamic output size is the cause.
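For reference, here is the kind of export call I mean (a minimal sketch with a placeholder module, input, and axis names, not my actual script):

import torch
import torch.nn as nn

# Placeholder stand-ins for the real model and example input (hypothetical).
model = nn.Linear(16, 16)
dummy_input = torch.randn(1, 8, 16)

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    opset_version=13,
    input_names=["input"],
    output_names=["output"],
    # Mark batch and sequence dimensions as dynamic on both input and output.
    dynamic_axes={
        "input": {0: "batch", 1: "sequence"},
        "output": {0: "batch", 1: "sequence"},
    },
)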

Environment

TensorRT Version: 8.0.3.4
GPU Type: RTX 3070
Nvidia Driver Version: 470.86
CUDA Version: 11.4
CUDNN Version: 8
Operating System + Version: Pop!_OS 21
Python Version (if applicable): 3.8.10
PyTorch Version (if applicable): 1.8.0
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:21.10-py3 docker

  • Full traceback of errors encountered
    2021-12-31 10:01:41.809326312 [I:onnxruntime:, sequential_executor.cc:155 Execute] Begin execution
    2021-12-31 10:01:43.149471475 [W:onnxruntime:Default, tensorrt_execution_provider.h:53 log] [2021-12-31 10:01:43 WARNING] Detected invalid timing cache, setup a local cache instead
    2021-12-31 10:03:09.086521516 [I:onnxruntime:, sequential_executor.cc:155 Execute] Begin execution
    (succeeds here)
    2021-12-31 10:03:09.086521516 [I:onnxruntime:, sequential_executor.cc:155 Execute] Begin execution
    2021-12-31 10:03:09.094996243 [V:onnxruntime:, execution_frame.cc:529 AllocateMLValueTensorSelfOwnBufferHelper] For ort_value with index: 138, block in memory pattern size is: 512 but the actually size is: 768, fall back to default allocation behavior
    2021-12-31 10:03:09.101589372 [W:onnxruntime:Default, tensorrt_execution_provider.h:53 log] [2021-12-31 10:03:09 WARNING] Detected invalid timing cache, setup a local cache instead
    (fails here)
    2021-12-31 10:03:10.108307205 [E:onnxruntime:Default, tensorrt_execution_provider.h:51 log] [2021-12-31 10:03:10 ERROR] 2: [graph.cpp::checkSanity::1300] Error Code 2: Internal Error (Assertion !overlap(t->start, t->extent, f.first, f.second) failed.)
    Segmentation fault (core dumped)

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:
https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#onnx-export

  1. Validate your model with the below snippet:

check_model.py

import sys
import onnx

# Usage: python check_model.py yourONNXmodel
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
In case you are still facing the issue, request you to share the trtexec "--verbose" log for further debugging.
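For example (the model path is a placeholder):

trtexec --onnx=model.onnx --verbose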
Thanks!

Hello, I checked the model with the ONNX checker and it passed. However, it fails with trtexec because the NonZero operation is not supported. onnxruntime can build the engine (I guess it supports a hybrid structure, falling back to other providers for unsupported nodes), but it can’t run inference on inputs of different shapes. I’m currently working on replacing the NonZero nodes with Where nodes using graphsurgeon (a sketch of how I locate them follows the log below). There might be more unsupported operators. Is there any other way to avoid NonZero? Here is the output of trtexec:

[01/06/2022-17:48:12] [E] [TRT] ModelImporter.cpp:720: While parsing node number 570 [NonZero -> "4436"]:
[01/06/2022-17:48:12] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[01/06/2022-17:48:12] [E] [TRT] ModelImporter.cpp:722: input: "4435"
output: "4436"
name: "NonZero_2684"
op_type: "NonZero"

[01/06/2022-17:48:12] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[01/06/2022-17:48:12] [E] [TRT] ModelImporter.cpp:725: ERROR: builtin_op_importers.cpp:4643 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[01/06/2022-17:48:12] [E] Failed to parse onnx file
[01/06/2022-17:48:12] [I] Finish parsing network model
[01/06/2022-17:48:12] [E] Parsing model failed
[01/06/2022-17:48:12] [E] Engine creation failed
[01/06/2022-17:48:12] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8003] # trtexec --onnx=folded.onnx --batch=1 --saveEngine=folded.trt
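Here is roughly how I locate the NonZero nodes with onnx-graphsurgeon before rewriting them (a minimal sketch; folded.onnx is the constant-folded model from the trtexec command above, and the actual Where-based replacement is still a work in progress):

import onnx
import onnx_graphsurgeon as gs

# Load the constant-folded model and list every NonZero node.
graph = gs.import_onnx(onnx.load("folded.onnx"))
nonzero_nodes = [node for node in graph.nodes if node.op == "NonZero"]
for node in nonzero_nodes:
    print(node.name, [inp.name for inp in node.inputs])

# ... rewrite the nodes here ...

# Clean up and re-export the patched graph.
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "folded_patched.onnx")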

And here is the ONNX model; I couldn’t upload it to the forum due to the 100 MB size limit:

And you can find an inference script attached. You will need the espeak phonemizer, the torch library (any version after 1.8.0 is OK; it might work with lower versions), and onnxruntime built with TensorRT. Thanks!
inf.py (5.5 KB)
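For context, the session in the script is created along these lines (a minimal sketch, not the full inf.py; the model path, input name, shapes, and dtype are placeholders):

import numpy as np
import onnxruntime as ort

# Prefer the TensorRT execution provider, falling back to CUDA and CPU.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Two runs with different input shapes; the second run is where the crash occurs.
sess.run(None, {"input": np.random.randn(1, 50).astype(np.float32)})
sess.run(None, {"input": np.random.randn(1, 80).astype(np.float32)})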

Hi,

We currently do not support the NonZero operator, which is why you are seeing this error. We have plans to support this in a future release.

Also, implementing a custom plugin may be difficult.

Thank you.