How could I set the maxSeqLen of addRNNv2 when using dynamic shape?

I am trying to convert a CRNN model to TensorRT with dynamic shapes. The function addRNNv2 needs an explicit maxSeqLen. When I set maxSeqLen to the largest sequence length that can occur, the following error appears:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::572, condition: input.getDimensions().d[di.seqLen()] == maxSeqLen

When I set maxSeqLen to -1, another error appears:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::570, condition: maxSeqLen > 0

How can I fix this? Is it acceptable to use dynamic shapes when there are several LSTM layers in the model?
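
In other words, the two checks seem to require that maxSeqLen is positive and equals the (static) sequence dimension of the RNN input, which a -1 (dynamic) dimension can never satisfy. A minimal sketch of the failing call, with hypothetical identifiers:

    // seqData has dims {-1, -1, C}: both batch and sequence length are dynamic.
    // addRNNv2 checks maxSeqLen > 0 and maxSeqLen == the input's sequence dimension (d[1]),
    // so neither -1 nor a fixed upper bound can pass when that dimension is -1.
    IRNNv2Layer* lstm = network->addRNNv2(*seqData,
                                          /*layerCount=*/1,
                                          /*hiddenSize=*/512,
                                          /*maxSeqLen=*/seqData->getDimensions().d[1],  // -1 here
                                          RNNOperation::kLSTM);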

Hi,

RNNv2 should support dynamic shape in TRT7.

Can you provide the following information so we can better help?
Provide details on the platforms you are using:
o Linux distro and version
o GPU type
o Nvidia driver version
o CUDA version
o CUDNN version
o Python version [if using python]
o Tensorflow version
o TensorRT version
o If Jetson, OS, hw versions

Thanks

  • OS: CentOS 7
  • GPU: RTX 2080 Ti
  • Driver: 430.40
  • CUDA: 10.0
  • cuDNN: 7.6.5
  • TensorRT: 7.0.0.11

Thanks for your reply. I extracted the parameters from the PyTorch model; the code that constructs the graph is below. I forgot to mention yesterday that the LSTM I use is bidirectional.

ICudaEngine* createMNISTEngine(int maxBatchSize, IBuilder* builder, DataType dt, size_t input_h, size_t input_w, size_t label_cnt, std::string model_weights)
{
	INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<int>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
	
	std::map<std::string, Weights> weightMap = loadWeights(model_weights);  // custom function
    std::cout << "load weights finished" << std::endl;
	
	ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims4{-1, 1, 16, -1});
    assert(data);
	
	// Create scale layer with default power/shift and specified scale parameter.
    const float scaleParam = 1; // 0.00390625;
    const Weights power{DataType::kFLOAT, nullptr, 0};
    const Weights shift{DataType::kFLOAT, nullptr, 0};
    const Weights scale{DataType::kFLOAT, &scaleParam, 1};
    IScaleLayer* scale_1 = network->addScale(*data, ScaleMode::kUNIFORM, shift, scale, power);
    assert(scale_1);
	
	// conv0 bn0 relu maxpool 
	scale_1->getOutput(0)->setName("conv0_input");
    IConvolutionLayer* conv0 = network->addConvolution(*scale_1->getOutput(0), 32, DimsHW{3, 3}, weightMap["ConvNet.conv0.weight"], weightMap["ConvNet.conv0.bias"]);
    assert(conv0);
    conv0->setStride(DimsHW{1, 1});
    conv0->setPadding(DimsHW{1, 1});
    IScaleLayer* bn0 = network->addScale(*conv0->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn0.shift"], weightMap["ConvNet.bn0.scale"], power);
    assert(bn0);
    IActivationLayer* relu0 = network->addActivation(*bn0->getOutput(0), ActivationType::kRELU);
    assert(relu0);
    IPoolingLayer* pool0 = network->addPooling(*relu0->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool0);
    pool0->setStride(DimsHW{2, 2});
    pool0->setPadding(DimsHW{0, 0});
	
    // conv1 bn1 relu maxpool 
    IConvolutionLayer* conv1 = network->addConvolution(*pool0->getOutput(0), 128, DimsHW{3, 3}, weightMap["ConvNet.conv1.weight"], weightMap["ConvNet.conv1.bias"]);
    assert(conv1);
    conv1->setStride(DimsHW{1, 1});
    conv1->setPadding(DimsHW{1, 1});
    IScaleLayer* bn1 = network->addScale(*conv1->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn1.shift"], weightMap["ConvNet.bn1.scale"], power);
    assert(bn1);
    IActivationLayer* relu1 = network->addActivation(*bn1->getOutput(0), ActivationType::kRELU);
    assert(relu1);
	IPoolingLayer* pool1 = network->addPooling(*relu1->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool1);
    pool1->setStride(DimsHW{2, 2});
    pool1->setPadding(DimsHW{0, 0});
	
	// conv2 bn2 relu maxpool
	IConvolutionLayer* conv2 = network->addConvolution(*pool1->getOutput(0), 256, DimsHW{3, 3}, weightMap["ConvNet.conv2.weight"], weightMap["ConvNet.conv2.bias"]);
    assert(conv2);
    conv2->setStride(DimsHW{1, 1});
    conv2->setPadding(DimsHW{1, 1});
    IScaleLayer* bn2 = network->addScale(*conv2->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn2.shift"], weightMap["ConvNet.bn2.scale"], power);
    assert(bn2);
    IActivationLayer* relu2 = network->addActivation(*bn2->getOutput(0), ActivationType::kRELU);
    assert(relu2);
	IPoolingLayer* pool2 = network->addPooling(*relu2->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool2);
    pool2->setStride(DimsHW{2, 2});
    pool2->setPadding(DimsHW{0, 0});
	
	// conv3 bn3 relu maxpool
	IConvolutionLayer* conv3 = network->addConvolution(*pool2->getOutput(0), 512, DimsHW{3, 3}, weightMap["ConvNet.conv3.weight"], weightMap["ConvNet.conv3.bias"]);
    assert(conv3);
    conv3->setStride(DimsHW{1, 1});
    conv3->setPadding(DimsHW{1, 1});
    IScaleLayer* bn3 = network->addScale(*conv3->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn3.shift"], weightMap["ConvNet.bn3.scale"], power);
    assert(bn3);
    IActivationLayer* relu3 = network->addActivation(*bn3->getOutput(0), ActivationType::kRELU);
    assert(relu3);
	IPoolingLayer* pool3 = network->addPooling(*relu3->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool3);
    pool3->setStride(DimsHW{2, 2});
    pool3->setPadding(DimsHW{0, 0});
	
    const int hidden_size = 512;
	// permute
    auto permuted_data = network->addShuffle(*pool3->getOutput(0));
    assert(permuted_data);
    permuted_data->setFirstTranspose(nvinfer1::Permutation{0, 3, 1, 2});
    permuted_data->setReshapeDimensions(Dims3{0, 0, -1});
    permuted_data->getOutput(0)->setName("visual_features");
	
    // The fourth argument (maxSeqLen) is permuted_data->getOutput(0)->getDimensions().d[1],
    // which is -1 at build time because the input width is dynamic, so the parameter check fails.
    IRNNv2Layer* bilstm1 = network->addRNNv2(*permuted_data->getOutput(0), 1, hidden_size, permuted_data->getOutput(0)->getDimensions().d[1], RNNOperation::kLSTM);
    assert(bilstm1);  // error appears here
    addBiLSTM(bilstm1, hidden_size, hidden_size, 1, weight_ih, weight_hh, bias_ih, bias_hh, weight_ih_reverse, weight_hh_reverse, bias_ih_reverse, bias_hh_reverse);  // my custom function, there is no bug in it
	
	auto bilstm1_reshape = network->addShuffle(*bilstm1->getOutput(0));
    assert(bilstm1_reshape);
    bilstm1_reshape->setReshapeDimensions(Dims4{-1, 1024, 1, 1});
    auto bilstm1_linear = network->addFullyConnected(*bilstm1_reshape->getOutput(0), 512, weightMap["SequenceModeling.1.linear.weight"], weightMap["SequenceModeling.1.linear.bias"]);
    assert(bilstm1_linear);
    auto bilstm1_linear_reshape = network->addShuffle(*bilstm1_linear->getOutput(0));
    assert(bilstm1_linear_reshape);
    bilstm1_linear_reshape->setReshapeDimensions(Dims4{-1, 512, 1, 1});
	
	auto prd_linear = network->addFullyConnected(*bilstm1_linear_reshape->getOutput(0), label_cnt, weightMap["Prediction.weight"], weightMap["Prediction.bias"]);
    ISoftMaxLayer* prob = network->addSoftMax(*prd_linear->getOutput(0));
    assert(prob);
	
	auto permuted_output = network->addShuffle(*prob->getOutput(0));
    permuted_output->setReshapeDimensions(Dims3{-1, (int)times, label_cnt});
    auto prd = network->addTopK(*permuted_output->getOutput(0), nvinfer1::TopKOperation::kMAX, 1, 1<<2);
    auto output_layer = prd;
    
    // output prob
    output_layer->getOutput(0)->setName(OUTPUT_BLOB_PROB);
    network->markOutput(*output_layer->getOutput(0));
    // output index
    // output_layer->getOutput(1)->setName(OUTPUT_BLOB_INDEX);
    // network->markOutput(*output_layer->getOutput(1));
    // output_layer->getOutput(1)->setType(DataType::kINT32);

// Build engine
    // builder->setMaxBatchSize(maxBatchSize);  // implicit-batch only, not needed with explicit batch
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    // maximum scratch memory the builder may use; with buildEngineWithConfig this
    // belongs on the config rather than the builder
    config->setMaxWorkspaceSize(8000000000ULL);

    IOptimizationProfile* profile = builder->createOptimizationProfile();
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kMIN, Dims4{1, 1, 16, 320});
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kOPT, Dims4{8, 1, 16, 640});
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kMAX, Dims4{maxBatchSize, 1, 16, 960});
    config->addOptimizationProfile(profile);
    
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    
    // Don't need the network any more
    network->destroy();
    
    // Release host memory
    for (auto& mem: weightMap) 
    {
        free((void*) (mem.second.values));
    }
    return engine;
}

How could I add an LSTM layer into the model when using dynamic shape?

Hi,

Please refer to below link for sample example:
https://devblogs.nvidia.com/how-to-deploy-real-time-text-to-speech-applications-on-gpus-using-tensorrt/
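
For reference, here is a minimal sketch of the loop-based recurrence that post describes, using the TRT 7 ILoop API with an explicit-batch network and a dynamic sequence length. All identifiers (seqData, hiddenInit, wIh, wHh, bias) are hypothetical placeholders; the initial hidden state comes from an extra network input whose dynamic batch dimension must also be covered by the optimization profile, and a single tanh cell stands in for the full LSTM/BiLSTM gate math:

    // Assumes: an explicit-batch INetworkDefinition* network, and
    //   seqData   : ITensor* with dims {-1, -1, C}  (batch, seq len, features), e.g. the conv output
    //   hiddenInit: ITensor* = network->addInput("hidden_init", DataType::kFLOAT, Dims2{-1, H})
    //   wIh, wHh, bias : IConstantLayer* holding {C, H}, {H, H} and {1, H} weights

    // Trip count = the runtime sequence length (dimension 1 of seqData), as a 0-D INT32 tensor.
    IShapeLayer* inShape = network->addShape(*seqData);                      // -> {3}: [B, T, C]
    Dims start;  start.nbDims = 1;  start.d[0] = 1;
    Dims size;   size.nbDims  = 1;  size.d[0]  = 1;
    Dims stride; stride.nbDims = 1; stride.d[0] = 1;
    ISliceLayer* tSlice = network->addSlice(*inShape->getOutput(0), start, size, stride);  // -> {1}: [T]
    IShuffleLayer* tripCount = network->addShuffle(*tSlice->getOutput(0));
    Dims scalar; scalar.nbDims = 0;
    tripCount->setReshapeDimensions(scalar);                                 // -> 0-D scalar T

    ILoop* loop = network->addLoop();
    loop->addTripLimit(*tripCount->getOutput(0), TripLimit::kCOUNT);

    // One time step of input per iteration: iterate over axis 1 of {B, T, C} -> {B, C}.
    IIteratorLayer* xt = loop->addIterator(*seqData, /*axis=*/1, /*reverse=*/false);
    // Recurrent hidden state, seeded from the hidden_init input ({B, H}).
    IRecurrenceLayer* hRec = loop->addRecurrence(*hiddenInit);

    // Simplified cell: h_t = tanh(x_t * W_ih + h_{t-1} * W_hh + b).
    // A real LSTM computes four gates plus a second recurrence for the cell state.
    auto xW   = network->addMatrixMultiply(*xt->getOutput(0),   MatrixOperation::kNONE,
                                           *wIh->getOutput(0),  MatrixOperation::kNONE);
    auto hW   = network->addMatrixMultiply(*hRec->getOutput(0), MatrixOperation::kNONE,
                                           *wHh->getOutput(0),  MatrixOperation::kNONE);
    auto acc  = network->addElementWise(*xW->getOutput(0), *hW->getOutput(0), ElementWiseOperation::kSUM);
    auto accB = network->addElementWise(*acc->getOutput(0), *bias->getOutput(0), ElementWiseOperation::kSUM);
    auto hNext = network->addActivation(*accB->getOutput(0), ActivationType::kTANH);
    hRec->setInput(1, *hNext->getOutput(0));                                 // feed back into the recurrence

    // Collect the per-step hidden states into a {B, T, H} tensor.
    ILoopOutputLayer* seqOut = loop->addLoopOutput(*hNext->getOutput(0), LoopOutput::kCONCATENATE, /*axis=*/1);
    seqOut->setInput(1, *tripCount->getOutput(0));                           // length of the concatenation

With this approach no maxSeqLen is baked into the engine; the optimization profile only has to cover the dynamic dimensions of the actual network inputs. A bidirectional layer would run a second loop over the same input with reverse = true and concatenate the two outputs along the feature axis.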

Thanks

Thanks for your reply. I have successfully converted my CRNN model from ONNX to TensorRT 7.0 with dynamic shapes, but I found it is slower than the TRT 5.0 engine built from the original PyTorch parameters. So I want to build the CRNN model directly from the original PyTorch parameters. Is there a way to do this with dynamic shapes when there are several BiLSTM layers in the model?

On the other hand, can you help me analyze the speed problem? In my opinion, the TRT 7.0 engine should be faster than the TRT 5.0 one. I noticed that the graph of the ONNX model differs from the original PyTorch model: a BiLSTM layer is converted into Gather, Unsqueeze, Concat, Slice and other operations. Are these operations more expensive than the original BiLSTM forward pass?

By the way, I found that TensorRT 7.0 needs more GPU memory than TRT 5.0 when building engines.

Hi,

It seems that RNNv2 still does not support wild card dimensions or explicit batch networks in TRT 7 as per the documentation.

https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_network_definition.html#a6cd3869f7406f73261857987be1b18a9

This is becoming a challenge, primarily when we construct networks that combine convolution and RNN layers with dynamic shapes (height and width for the convolutions, sequence length for the RNN). What is the timeframe for a uniform interface for dynamic shapes?