Error converting Bi-LSTM model from onnx to tensorrt

Description

I get the following error when trying to convert a Bi-LSTM model to TensorRT:

[01/18/2021-21:25:38] [E] [TRT] Parameter check failed at: ../builder/Network.cpp::addConcatenation::517, condition: nbInputs > 0 && nbInputs < MAX_CONCAT_INPUTS
Segmentation fault (core dumped)

Here is the verbose parser log leading up to the error:

[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:183: custom_rnn_scan_Scan__43 [Scan] outputs: [custom_rnn_scan_Scan__43:0 -> (-1, -1)], [custom_rnn_scan_Scan__43:1 -> (-1, -1)], [custom_rnn_scan_Scan__43:2 -> ()], [custom_rnn_scan_Scan__43:3 -> ()], [custom_rnn_scan_Scan__43:4 -> (-1, -1, -1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:107: Parsing node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1 [Transpose]
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: custom_rnn_scan_Scan__43:4
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:129: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1 [Transpose] inputs: [custom_rnn_scan_Scan__43:4 -> (-1, -1, -1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:154: Registering layer: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1 for ONNX node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:120: Registering tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1:0 for ONNX tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:183: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1 [Transpose] outputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose_1:0 -> (-1, -1, -1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:107: Parsing node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152 [Shape]
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: shadow/cnn/fully_connected/BatchNorm/FusedBatchNormV3__141:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:129: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152 [Shape] inputs: [shadow/cnn/fully_connected/BatchNorm/FusedBatchNormV3__141:0 -> (-1, -1, -1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:154: Registering layer: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152 for ONNX node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:120: Registering tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152:0 for ONNX tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:183: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152 [Shape] outputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152:0 -> (3)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:107: Parsing node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164 [Gather]
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: const_starts__205
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:129: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164 [Gather] inputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Shape__152:0 -> (3)], [const_starts__205 -> (1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:150: Registering constant layer: const_starts__205_5 for ONNX initializer: const_starts__205
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:973: Using Gather axis: 0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:154: Registering layer: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164 for ONNX node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:120: Registering tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164:0 for ONNX tensor: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:183: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164 [Gather] outputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164:0 -> (1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:107: Parsing node: Expand__165 [Expand]
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__163:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:129: Expand__165 [Expand] inputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__164:0 -> (1)], [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_Gather__163:0 -> (1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:154: Registering layer: Expand__165 for ONNX node: Expand__165
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ImporterContext.hpp:120: Registering tensor: Expand__165:0 for ONNX tensor: Expand__165:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:183: Expand__165 [Expand] outputs: [Expand__165:0 -> (-1)],
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:107: Parsing node: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_ReverseSequence__166 [ReverseSequence]
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:123: Searching for input: Expand__165:0
[01/18/2021-21:25:38] [V] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:129: shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_ReverseSequence__166 [ReverseSequence] inputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose:0 -> (-1, -1, -1)], [Expand__165:0 -> (-1)],
[01/18/2021-21:25:38] [E] [TRT] Parameter check failed at: ../builder/Network.cpp::addConcatenation::517, condition: nbInputs > 0 && nbInputs < MAX_CONCAT_INPUTS
Segmentation fault (core dumped)

I cannot share the original model. I am using a Docker image built with TensorRT-7.2.1.6.

The ONNX model was generated from TensorFlow 1.15 with opset 11.

Any leads are much appreciated.

Hi, could you please share the ONNX model and the conversion script so that we can assist you better?

In the meantime, you can try validating your model with the snippet below:

check_model.py

import sys
import onnx

# Usage: python check_model.py <path to your ONNX model>
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)

Alternatively, you can try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec

Thanks!

The ONNX checker does not produce any output (no errors reported), so that part checks out.

Also, the error in my original post was the output of trtexec --onnx=my_model.onnx. Sorry for not clarifying that.

I am not in a position to share the ONNX model; any additional pointers on how to debug this would be very helpful.

Hi @aravind.anantha,

Could you please share the complete logs with us for better debugging?
Also, the error says nbInputs for the concatenation layer must be greater than 0 and less than MAX_CONCAT_INPUTS (10000).
Please try to identify, from your end, which Concat node is failing.
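
For example, you could list the Concat nodes and their input counts with a short script like the sketch below (it uses the onnx Python package and a placeholder model path):

import onnx

model = onnx.load("my_model.onnx")  # placeholder path, replace with your model
for node in model.graph.node:
    if node.op_type == "Concat":
        # A Concat with zero inputs (or 10000 or more) would violate the nbInputs check.
        print(node.name, "has", len(node.input), "inputs")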

Thank you.

Hi @spolisetty

Here are the attached complete logs.
concat_fail_logs_rnn.txt (132.6 KB)

I looked at all the operations of type Concat, and all of them have inputs; I do not see any with zero inputs or more than 10000 inputs.

Please let me know if there is anything else I can check, or if there is some way to find the exact place where trtexec crashed.

Hi @aravind.anantha,

We don't see any other errors in the logs. If possible, could you please try out the OSS repo?
Please rebuild a debug version of the sample and the parser, and debug it to see which call in the parser leads to the issue.
We didn't notice any error in the code itself; the call to addConcatenation in the parser could be what leads to the issue.

Thank you.

Hi @spolisetty

I used gdb to get some more information.

[01/25/2021-20:54:11] [V] [TRT] shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_ReverseSequence__166 [ReverseSequence] inputs: [shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose:0 -> (-1, -1, -1)], [Expand__165:0 -> (-1)],
[01/25/2021-20:54:11] [E] [TRT] Parameter check failed at: ../builder/Network.cpp::addConcatenation::517, condition: nbInputs > 0 && nbInputs < MAX_CONCAT_INPUTS

Thread 1 "trtexec_debug" received signal SIGSEGV, Segmentation fault.
0x00007f900752b28a in onnx2trt::(anonymous namespace)::importReverseSequence (ctx=0x55837f7d80f0, node=..., inputs=std::vector of length 2, capacity 2 = {...})
    at /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:2818
2818        concatLayer->setAxis(batch_axis);

The stack trace of the crash:

#0  0x00007fafb9e4328a in onnx2trt::(anonymous namespace)::importReverseSequence (ctx=0x55a0fdb71170, node=..., inputs=std::vector of length 2, capacity 2 = {...})
    at /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:2818
#1  0x00007fafb9e6c4a4 in std::_Function_handler<onnx2trt::ValueOrStatus<std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> > > (onnx2trt::IImporterContext*, onnx2trt_onnx::NodeProto const&, std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> >&), onnx2trt::ValueOrStatus<std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> > > (*)(onnx2trt::IImporterContext*, onnx2trt_onnx::NodeProto const&, std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> >&)>::_M_invoke(std::_Any_data const&, onnx2trt::IImporterContext*&&, onnx2trt_onnx::NodeProto const&, std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> >&) (__functor=..., __args#0=@0x7ffcf6a2ebf8: 0x55a0fdb71170, __args#1=...,
    __args#2=std::vector of length 2, capacity 2 = {...}) at /usr/include/c++/7/bits/std_function.h:302
#2  0x00007fafb9df93b9 in std::function<onnx2trt::ValueOrStatus<std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> > > (onnx2trt::IImporterContext*, onnx2trt_onnx::NodeProto const&, std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> >&)>::operator()(onnx2trt::IImporterContext*, onnx2trt_onnx::NodeProto const&, std::vector<onnx2trt::TensorOrWeights, std::allocator<onnx2trt::TensorOrWeights> >&) const (this=0x55a0fcc9d8b8, __args#0=0x55a0fdb71170, __args#1=..., __args#2=std::vector of length 2, capacity 2 = {...}) at /usr/include/c++/7/bits/std_function.h:706
#3  0x00007fafb9ded0c8 in onnx2trt::parseGraph (ctx=0x55a0fdb71170, graph=..., deserializingINetwork=false, currentNode=0x55a0fdb714c8) at /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:144
#4  0x00007fafb9df19fc in onnx2trt::ModelImporter::importModel (this=0x55a0fdb71130, model=..., weight_count=0, weight_descriptors=0x0) at /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:538
#5  0x00007fafb9df0c96 in onnx2trt::ModelImporter::parseWithWeightDescriptors (this=0x55a0fdb71130, serialized_onnx_model=0x55a10deb89c0, serialized_onnx_model_size=9066625, weight_count=0, weight_descriptors=0x0)
    at /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:453
#6  0x00007fafb9df0e12 in onnx2trt::ModelImporter::parse (this=0x55a0fdb71130, serialized_onnx_model=0x55a10deb89c0, serialized_onnx_model_size=9066625, model_path=0x0)
    at /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:469
#7  0x00007fafb9df4799 in onnx2trt::ModelImporter::parseFromFile (this=0x55a0fdb71130, onnxModelFile=0x55a0fcc9cb70 "/tmp/011421.onnx", verbosity=3) at /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:696
#8  0x000055a0fb7d4b22 in sample::modelToNetwork (model=..., network=..., err=...) at /workspace/TensorRT/samples/common/sampleEngines.cpp:140
#9  0x000055a0fb7d7763 in sample::modelToEngine (model=..., build=..., sys=..., err=...) at /workspace/TensorRT/samples/common/sampleEngines.cpp:607
#10 0x000055a0fb7d805d in sample::getEngine (model=..., build=..., sys=..., err=...) at /workspace/TensorRT/samples/common/sampleEngines.cpp:692
#11 0x000055a0fb81d62d in main (argc=3, argv=0x7ffcf6a30578) at /workspace/TensorRT/samples/opensource/trtexec/trtexec.cpp:149

It looks like the error is in the importer for the ReverseSequence operator.

The function call that fails is:

auto concatLayer = ctx->network()->addConcatenation(tensors.data(), tensors.size());

Printing tensors gives:

(gdb) print tensors
$2 = std::vector of length 0, capacity 0
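
For reference, this is roughly how the corresponding ReverseSequence node can be inspected on the ONNX side (a sketch only, with a placeholder model path); per the ONNX spec, ReverseSequence takes two inputs (data and sequence_lens) and carries batch_axis / time_axis attributes:

import onnx
from onnx import helper

model = onnx.load("my_model.onnx")  # placeholder path
for node in model.graph.node:
    if node.op_type == "ReverseSequence":
        # Collect the node's attributes (expected: batch_axis, time_axis).
        attrs = {a.name: helper.get_attribute_value(a) for a in node.attribute}
        print(node.name, "inputs:", list(node.input), "attrs:", attrs)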

Please let me know. Thanks

Hi @aravind.anantha,

Could you please extract the problematic layer and share an ONNX model that contains only this layer, so that we can reproduce the issue?
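
If it helps, one way to do this is onnx.utils.extract_model (available in recent onnx releases), which cuts a sub-model between given tensor names. A sketch, with placeholder paths; the input tensor names are taken from your verbose log, and the output tensor name is only a guess based on the node-name:0 pattern, so please adjust them to your model:

import onnx.utils

onnx.utils.extract_model(
    "my_model.onnx",               # original model (placeholder path)
    "reverse_sequence_only.onnx",  # extracted sub-model to share
    input_names=[
        "shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/fw/fw/transpose:0",
        "Expand__165:0",
    ],
    output_names=[
        "shadow/LSTMLayers/StackRNN/Layer0/bidirectional_rnn/bw/ReverseV2_ReverseSequence__166:0",
    ],
)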

Thank you.