addRNNv2 error when using dynamic sequence length

I am using TensorRT 7 to build a model that includes an RNN structure. The code for adding the RNN layer is:

auto rnn = network->addRNNv2(
        *inputData,
        mNumLayers,
        mHiddenSize,
        1000,
        nvinfer1::RNNOperation::kLSTM);

the dimensions of inputData are:

nvinfer1::Dims inputDims{3, {-1, -1, 2688}, {nvinfer1::DimensionType::kINDEX, nvinfer1::DimensionType::kSEQUENCE, nvinfer1::DimensionType::kCHANNEL}};

when I compile the code, the following error is reported:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::572, condition: input.getDimensions().d[di.seqLen()] == maxSeqLen

I tried changing the maxSeqLen parameter from 1000 to -1. The code compiles successfully but fails at run time:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::570, condition: maxSeqLen > 0

Could somebody tell me how to solve this problem?
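
Read together, the two messages suggest that no maxSeqLen value can satisfy both checks while the sequence dimension is -1. The sketch below models only the two conditions quoted in the log, assuming they are evaluated exactly as printed. `checkRnnV2Args` is a hypothetical helper, not TensorRT code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two conditions quoted in the error log.
// It models the reported checks only; it is not TensorRT source code.
// seqLenDim is the build-time extent of the sequence axis (-1 means dynamic).
// Returns the failed condition, or an empty string if both checks pass.
std::string checkRnnV2Args(int seqLenDim, int maxSeqLen)
{
    if (maxSeqLen <= 0)
    {
        return "maxSeqLen > 0"; // condition reported at Network.cpp line 570
    }
    if (seqLenDim != maxSeqLen)
    {
        return "d[seqLen] == maxSeqLen"; // condition reported at line 572
    }
    return "";
}
```

With a dynamic sequence axis, `checkRnnV2Args(-1, 1000)` fails the second condition and `checkRnnV2Args(-1, -1)` fails the first, which matches the two errors above. Only a fixed axis such as `checkRnnV2Args(1000, 1000)` passes.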

Hi,

Could you please share the script and model file so we can help better?
Also, can you provide details on the platforms you are using:
  • Linux distro and version
  • GPU type
  • NVIDIA driver version
  • CUDA version
  • cuDNN version
  • Python version [if using Python]
  • TensorFlow and PyTorch version
  • TensorRT version

Thanks

Hi SunilJB, here is the platform information:

  • Ubuntu 16.04.6 LTS (Xenial Xerus)
  • GeForce RTX 2080 Ti
  • 295.41
  • cuda10.0
  • cudnn7.6
  • tensorrt7.0

I have the same problem, but haven't found a solution.

Hi,

Could you please check whether the network definition is created with the explicitBatch flag set?
Also, if possible, please share the script file so we can help better.

Thanks

Yes, I have set the explicitBatch flag:

nvinfer1::ICudaEngine* getBRNNEngine(asrSample::LSTM::ptr brnn)
{
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    builder->setMaxBatchSize(gMaxBatchSize);
    config->setMaxWorkspaceSize(gMaxWorkspaceSize);
    if (gFp16) {
        config->setFlag(nvinfer1::BuilderFlag::kFP16);
        config->setFlag(BuilderFlag::kSTRICT_TYPES);
        builder->setFp16Mode(true);
    }   

    nvinfer1::INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));

    nvinfer1::Dims inputDims{3, {-1, -1, 2688}, {nvinfer1::DimensionType::kINDEX, nvinfer1::DimensionType::kSEQUENCE, nvinfer1::DimensionType::kCHANNEL}};
    nvinfer1::Dims stateDims = brnn->getStateDims();
    nvinfer1::Dims sequenceLengthDims{0, {}, {}};

    auto inputTensor = network->addInput("encoder_brnn_input_data", nvinfer1::DataType::kFLOAT, inputDims);
    auto sequenceLengthTensor = network->addInput("encoder_sequence_length", nvinfer1::DataType::kINT32, sequenceLengthDims);
    auto hiddenStateTensor = network->addInput("encoder_hidden_state", nvinfer1::DataType::kFLOAT, stateDims);
    auto cellStateTensor = network->addInput("encoder_cell_state", nvinfer1::DataType::kFLOAT, stateDims);

    nvinfer1::ITensor *outputState, *lastHiddenState, *linearOutput, *actOutput;

    brnn->addToModel(network, inputTensor, sequenceLengthTensor, hiddenStateTensor, cellStateTensor, &outputState, &lastHiddenState);
    outputState->setName("brnn_output");
    network->markOutput(*outputState);

    auto profile = builder->createOptimizationProfile();
    profile->setDimensions(inputTensor->getName(), OptProfileSelector::kMIN, Dims3{1, 1, 2688});
    profile->setDimensions(inputTensor->getName(), OptProfileSelector::kOPT, Dims3{50, 150, 2688});
    profile->setDimensions(inputTensor->getName(), OptProfileSelector::kMAX, Dims3{100, 300, 2688});
    config->addOptimizationProfile(profile);

    samplesCommon::enableDLA(builder, config, gUseDLACore);
    auto res = builder->buildEngineWithConfig(*network, *config);

    network->destroy();
    builder->destroy();
    config->destroy();

    return res;
}
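
For reference, the optimization profile above only admits runtime shapes that lie between the kMIN and kMAX dimensions on every axis. A minimal sketch of that range check, using plain arrays instead of TensorRT types (`Shape3` and `withinProfile` are hypothetical names introduced for illustration):

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for nvinfer1::Dims3: {batch, sequence, channel}.
using Shape3 = std::array<int, 3>;

// Returns true if every axis of shape lies in [minShape[i], maxShape[i]].
bool withinProfile(const Shape3& shape, const Shape3& minShape, const Shape3& maxShape)
{
    for (std::size_t i = 0; i < shape.size(); ++i)
    {
        if (shape[i] < minShape[i] || shape[i] > maxShape[i])
        {
            return false;
        }
    }
    return true;
}
```

With the kMIN and kMAX values from the profile, a shape of {50, 150, 2688} is inside the range, while {100, 301, 2688} is not, because its sequence length exceeds the kMAX value of 300.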

void LSTM::addToModel(
    nvinfer1::INetworkDefinition* network,
    nvinfer1::ITensor* inputData,
    nvinfer1::ITensor* sequenceLength,
    nvinfer1::ITensor* hiddenState,
    nvinfer1::ITensor* cellState,
    nvinfer1::ITensor** outputState,
    nvinfer1::ITensor** lastHiddenState)
{
    //int maxSeqLen = inputData->getDimensions().d[0];
    int maxSeqLen = -1;
    auto rnn = network->addRNNv2(
        *inputData,
        mNumLayers,
        mHiddenSize,
        maxSeqLen,
        nvinfer1::RNNOperation::kLSTM);
    assert(rnn != nullptr);

    rnn->setInputMode(nvinfer1::RNNInputMode::kLINEAR);
    rnn->setDirection(nvinfer1::RNNDirection::kBIDIRECTION);
    rnn->setSequenceLengths(*sequenceLength);

    std::vector<nvinfer1::RNNGateType> gateOrder({nvinfer1::RNNGateType::kINPUT,
                                                  nvinfer1::RNNGateType::kFORGET,
                                                  nvinfer1::RNNGateType::kCELL,
                                                  nvinfer1::RNNGateType::kOUTPUT});
    for (size_t i = 0; i < mGateKernelWeights.size(); i++)
    {
        bool isW = ((i%8) < 4);
        rnn->setWeightsForGate(i/8, gateOrder[i % 4], isW, mGateKernelWeights[i]);
        rnn->setBiasForGate(i/8, gateOrder[i % 4], isW, mGateBiasWeights[i]);
    }

    rnn->setHiddenState(*hiddenState);
    rnn->setCellState(*cellState);

    *outputState = rnn->getOutput(0);
    *lastHiddenState = rnn->getOutput(1);
}
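
To make the indexing in the weight loop easier to check, here is a small sketch of how a flat index i is mapped to (layer index, gate, isW), assuming the weights are stored eight per layer index: four input-weight entries followed by four recurrent-weight entries, as the loop implies. `Gate`, `GateSlot` and `slotForIndex` are hypothetical names that only mirror the arithmetic above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the gateOrder vector used in addToModel.
enum class Gate { kINPUT, kFORGET, kCELL, kOUTPUT };

struct GateSlot
{
    int layer; // i / 8: eight entries per layer index
    Gate gate; // gateOrder[i % 4]
    bool isW;  // first four of each group of eight are input weights
};

// Reproduces the index arithmetic of the loop, nothing more.
GateSlot slotForIndex(std::size_t i)
{
    static const Gate order[4] = {Gate::kINPUT, Gate::kFORGET, Gate::kCELL, Gate::kOUTPUT};
    return GateSlot{static_cast<int>(i / 8), order[i % 4], (i % 8) < 4};
}
```

For example, index 6 maps to layer index 0, the cell gate, with isW false (a recurrent weight), and index 9 maps to layer index 1, the forget gate, with isW true.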

Also, I find that when using a dynamic batch dimension, memory usage increases quickly, and it is very easy to run out of GPU memory.

Hi,

Try the ILoop interface introduced in TensorRT 7 instead of our legacy RNN interfaces.
Please refer to the sample below for an ILoop implementation:
https://github.com/NVIDIA/TensorRT/blob/572d54f91791448c015e74a4f1d6923b77b79795/samples/opensource/sampleCharRNN/sampleCharRNN.cpp

Thanks

Hi,

I would like to use the addRNNv2 interface because it is simple to use, while the ILoop interface is a little complicated. However, I will try the ILoop interface, and I hope that addRNNv2 will support dynamic shapes in the future.

Thanks

Hi SunilJB,

I have some questions about the sample you sent me:
https://github.com/NVIDIA/TensorRT/blob/572d54f91791448c015e74a4f1d6923b77b79795/samples/opensource/sampleCharRNN/sampleCharRNN.cpp

This sample uses ILoop, and the network definition is created with the explicitBatch flag set, but the dimensions of the input tensor have no explicit batch dimension, and I cannot see where the batch size is set:

nvinfer1::ITensor* data = network->addInput(mParams.bindingNames.INPUT_BLOB_NAME, nvinfer1::DataType::kFLOAT,nvinfer1::Dims2(mParams.seqSize, mParams.dataSize));

Thanks

In addition, the ILoop interface also needs a max sequence length tensor as input, which should be a constant value, so the problem is the same as with the addRNNv2 interface:

nvinfer1::ITensor* maxSequenceSize = network->addConstant(nvinfer1::Dims{}, Weights{DataType::kINT32, &mParams.seqSize, 1})->getOutput(0);

Hi SunilJB,

After reading DEFINE_BUILTIN_OP_IMPORTER(LSTM), I now understand how to use the ILoop interface to support dynamic sequence lengths:
https://github.com/onnx/onnx-tensorrt/blob/master/builtin_op_importers.cpp

Thanks!
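
For anyone landing here later: the general idea, as far as I understand it, is to take the loop's trip count from the input at run time instead of from a constant fixed when the network is defined. The sketch below shows that idea in plain C++ with a toy scalar recurrence. It does not use the TensorRT API, and `runRecurrence` is a hypothetical name. In a TensorRT network the trip limit would presumably be derived from the input's shape tensor, but that part is an assumption and is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy recurrence h_t = tanh(w * x_t + u * h_{t-1}), a scalar stand-in for an RNN cell.
// The trip count is read from the runtime size of the input sequence,
// so the same function handles any sequence length.
double runRecurrence(const std::vector<double>& sequence, double w, double u, double h0)
{
    const std::size_t tripCount = sequence.size(); // runtime sequence length
    double h = h0;
    for (std::size_t t = 0; t < tripCount; ++t)
    {
        h = std::tanh(w * sequence[t] + u * h);
    }
    return h;
}
```

An empty sequence returns the initial state unchanged, and longer sequences simply run more iterations. No maximum length has to be known in advance.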