How could I set the maxSeqLen of addRNNv2 when using dynamic shape?

I am trying to convert a CRNN model to TensorRT with dynamic shapes. The function addRNNv2 needs an explicit maxSeqLen. When I set maxSeqLen to the largest sequence length that can occur, the following error appears:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::572, condition: input.getDimensions().d[di.seqLen()] == maxSeqLen

When I set maxSeqLen to -1, another error appears:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addRNNCommon::570, condition: maxSeqLen > 0

How can I fix this? Is it acceptable to use dynamic shapes when there are several LSTM layers in the model?
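
In other words, the two checks seem to require that maxSeqLen is positive and equals the (static) sequence dimension of the RNN input, which a -1 (dynamic) dimension can never satisfy. A minimal sketch of the failing call, with hypothetical identifiers:

    // seqData has dims {-1, -1, C}: both batch and sequence length are dynamic.
    // addRNNv2 checks maxSeqLen > 0 and maxSeqLen == the input's sequence dimension (d[1]),
    // so neither -1 nor a fixed upper bound can pass when that dimension is -1.
    IRNNv2Layer* lstm = network->addRNNv2(*seqData,
                                          /*layerCount=*/1,
                                          /*hiddenSize=*/512,
                                          /*maxSeqLen=*/seqData->getDimensions().d[1],  // -1 here
                                          RNNOperation::kLSTM);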

Hi,

RNNv2 should support dynamic shape in TRT7.

Can you provide the following information so we can better help?
Provide details on the platforms you are using:
o Linux distro and version
o GPU type
o Nvidia driver version
o CUDA version
o CUDNN version
o Python version [if using python]
o Tensorflow version
o TensorRT version
o If Jetson, OS, hw versions

Thanks

  • OS: CentOS 7
  • GPU: RTX 2080 Ti
  • Driver: 430.40
  • CUDA: 10.0
  • cuDNN: 7.6.5
  • TensorRT: 7.0.0.11

Thanks for your reply. I extracted the parameters from the PyTorch model; the code that constructs the graph is below. I forgot to mention yesterday that the LSTM I use is bidirectional.

ICudaEngine* createMNISTEngine(int maxBatchSize, IBuilder* builder, DataType dt, size_t input_h, size_t input_w, size_t label_cnt, std::string model_weights)
{
	INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<int>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
	
	std::map<std::string, Weights> weightMap = loadWeights(model_weights);  // custom function
    std::cout << "load weights finished" << std::endl;
	
	ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims4{-1, 1, 16, -1});
    assert(data);
	
	// Create scale layer with default power/shift and specified scale parameter.
    const float scaleParam = 1; // 0.00390625;
    const Weights power{DataType::kFLOAT, nullptr, 0};
    const Weights shift{DataType::kFLOAT, nullptr, 0};
    const Weights scale{DataType::kFLOAT, &scaleParam, 1};
    IScaleLayer* scale_1 = network->addScale(*data, ScaleMode::kUNIFORM, shift, scale, power);
    assert(scale_1);
	
	// conv0 bn0 relu maxpool 
	scale_1->getOutput(0)->setName("conv0_input");
    IConvolutionLayer* conv0 = network->addConvolution(*scale_1->getOutput(0), 32, DimsHW{3, 3}, weightMap["ConvNet.conv0.weight"], weightMap["ConvNet.conv0.bias"]);
    assert(conv0);
    conv0->setStride(DimsHW{1, 1});
    conv0->setPadding(DimsHW{1, 1});
    IScaleLayer* bn0 = network->addScale(*conv0->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn0.shift"], weightMap["ConvNet.bn0.scale"], power);
    assert(bn0);
    IActivationLayer* relu0 = network->addActivation(*bn0->getOutput(0), ActivationType::kRELU);
    assert(relu0);
    IPoolingLayer* pool0 = network->addPooling(*relu0->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool0);
    pool0->setStride(DimsHW{2, 2});
    pool0->setPadding(DimsHW{0, 0});
	
    // conv1 bn1 relu maxpool 
    IConvolutionLayer* conv1 = network->addConvolution(*pool0->getOutput(0), 128, DimsHW{3, 3}, weightMap["ConvNet.conv1.weight"], weightMap["ConvNet.conv1.bias"]);
    assert(conv1);
    conv1->setStride(DimsHW{1, 1});
    conv1->setPadding(DimsHW{1, 1});
    IScaleLayer* bn1 = network->addScale(*conv1->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn1.shift"], weightMap["ConvNet.bn1.scale"], power);
    assert(bn1);
    IActivationLayer* relu1 = network->addActivation(*bn1->getOutput(0), ActivationType::kRELU);
    assert(relu1);
	IPoolingLayer* pool1 = network->addPooling(*relu1->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool1);
    pool1->setStride(DimsHW{2, 2});
    pool1->setPadding(DimsHW{0, 0});
	
	// conv2 bn2 relu maxpool
	IConvolutionLayer* conv2 = network->addConvolution(*pool1->getOutput(0), 256, DimsHW{3, 3}, weightMap["ConvNet.conv2.weight"], weightMap["ConvNet.conv2.bias"]);
    assert(conv2);
    conv2->setStride(DimsHW{1, 1});
    conv2->setPadding(DimsHW{1, 1});
    IScaleLayer* bn2 = network->addScale(*conv2->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn2.shift"], weightMap["ConvNet.bn2.scale"], power);
    assert(bn2);
    IActivationLayer* relu2 = network->addActivation(*bn2->getOutput(0), ActivationType::kRELU);
    assert(relu2);
	IPoolingLayer* pool2 = network->addPooling(*relu2->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool2);
    pool2->setStride(DimsHW{2, 2});
    pool2->setPadding(DimsHW{0, 0});
	
	// conv3 bn3 relu maxpool
	IConvolutionLayer* conv3 = network->addConvolution(*pool2->getOutput(0), 512, DimsHW{3, 3}, weightMap["ConvNet.conv3.weight"], weightMap["ConvNet.conv3.bias"]);
    assert(conv3);
    conv3->setStride(DimsHW{1, 1});
    conv3->setPadding(DimsHW{1, 1});
    IScaleLayer* bn3 = network->addScale(*conv3->getOutput(0), ScaleMode::kCHANNEL, weightMap["ConvNet.bn3.shift"], weightMap["ConvNet.bn3.scale"], power);
    assert(bn3);
    IActivationLayer* relu3 = network->addActivation(*bn3->getOutput(0), ActivationType::kRELU);
    assert(relu3);
	IPoolingLayer* pool3 = network->addPooling(*relu3->getOutput(0), PoolingType::kMAX, DimsHW{2, 2});
    assert(pool3);
    pool3->setStride(DimsHW{2, 2});
    pool3->setPadding(DimsHW{0, 0});
	
    const int hidden_size = 512;
	// permute
    auto permuted_data = network->addShuffle(*pool3->getOutput(0));
    assert(permuted_data);
    permuted_data->setFirstTranspose(nvinfer1::Permutation{0, 3, 1, 2});
    permuted_data->setReshapeDimensions(Dims3{0, 0, -1});
    permuted_data->getOutput(0)->setName("visual_features");
	
    // The fourth argument (maxSeqLen) is permuted_data->getOutput(0)->getDimensions().d[1],
    // which is -1 at build time because the input width is dynamic, so the parameter check fails.
    IRNNv2Layer* bilstm1 = network->addRNNv2(*permuted_data->getOutput(0), 1, hidden_size, permuted_data->getOutput(0)->getDimensions().d[1], RNNOperation::kLSTM);
    assert(bilstm1);  // error appears here
    addBiLSTM(bilstm1, hidden_size, hidden_size, 1, weight_ih, weight_hh, bias_ih, bias_hh, weight_ih_reverse, weight_hh_reverse, bias_ih_reverse, bias_hh_reverse);  // my custom function, there is no bug in it
	
	auto bilstm1_reshape = network->addShuffle(*bilstm1->getOutput(0));
    assert(bilstm1_reshape);
    bilstm1_reshape->setReshapeDimensions(Dims4{-1, 1024, 1, 1});
    auto bilstm1_linear = network->addFullyConnected(*bilstm1_reshape->getOutput(0), 512, weightMap["SequenceModeling.1.linear.weight"], weightMap["SequenceModeling.1.linear.bias"]);
    assert(bilstm1_linear);
    auto bilstm1_linear_reshape = network->addShuffle(*bilstm1_linear->getOutput(0));
    assert(bilstm1_linear_reshape);
    bilstm1_linear_reshape->setReshapeDimensions(Dims4{-1, 512, 1, 1});
	
	auto prd_linear = network->addFullyConnected(*bilstm1_linear_reshape->getOutput(0), label_cnt, weightMap["Prediction.weight"], weightMap["Prediction.bias"]);
    ISoftMaxLayer* prob = network->addSoftMax(*prd_linear->getOutput(0));
    assert(prob);
	
	auto permuted_output = network->addShuffle(*prob->getOutput(0));
    permuted_output->setReshapeDimensions(Dims3{-1, (int)times, label_cnt});
    auto prd = network->addTopK(*permuted_output->getOutput(0), nvinfer1::TopKOperation::kMAX, 1, 1<<2);
    auto output_layer = prd;
    
    // output prob
    output_layer->getOutput(0)->setName(OUTPUT_BLOB_PROB);
    network->markOutput(*output_layer->getOutput(0));
    // output index
    // output_layer->getOutput(1)->setName(OUTPUT_BLOB_INDEX);
    // network->markOutput(*output_layer->getOutput(1));
    // output_layer->getOutput(1)->setType(DataType::kINT32);

// Build engine
    // builder->setMaxBatchSize(maxBatchSize);  // implicit-batch only, not needed with explicit batch
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    // maximum scratch memory the builder may use; with buildEngineWithConfig this
    // belongs on the config rather than the builder
    config->setMaxWorkspaceSize(8000000000ULL);

    IOptimizationProfile* profile = builder->createOptimizationProfile();
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kMIN, Dims4{1, 1, 16, 320});
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kOPT, Dims4{8, 1, 16, 640});
    profile->setDimensions(INPUT_BLOB_NAME, OptProfileSelector::kMAX, Dims4{maxBatchSize, 1, 16, 960});
    config->addOptimizationProfile(profile);
    
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    
    // Don't need the network any more
    network->destroy();
    
    // Release host memory
    for (auto& mem: weightMap) 
    {
        free((void*) (mem.second.values));
    }
    return engine;
}

How could I add an LSTM layer into the model when using dynamic shape?

Hi,

Please refer to below link for sample example:
https://devblogs.nvidia.com/how-to-deploy-real-time-text-to-speech-applications-on-gpus-using-tensorrt/
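
For reference, here is a minimal sketch of the loop-based recurrence that post describes, using the TRT 7 ILoop API with an explicit-batch network and a dynamic sequence length. All identifiers (seqData, hiddenInit, wIh, wHh, bias) are hypothetical placeholders; the initial hidden state comes from an extra network input whose dynamic batch dimension must also be covered by the optimization profile, and a single tanh cell stands in for the full LSTM/BiLSTM gate math:

    // Assumes: an explicit-batch INetworkDefinition* network, and
    //   seqData   : ITensor* with dims {-1, -1, C}  (batch, seq len, features), e.g. the conv output
    //   hiddenInit: ITensor* = network->addInput("hidden_init", DataType::kFLOAT, Dims2{-1, H})
    //   wIh, wHh, bias : IConstantLayer* holding {C, H}, {H, H} and {1, H} weights

    // Trip count = the runtime sequence length (dimension 1 of seqData), as a 0-D INT32 tensor.
    IShapeLayer* inShape = network->addShape(*seqData);                      // -> {3}: [B, T, C]
    Dims start;  start.nbDims = 1;  start.d[0] = 1;
    Dims size;   size.nbDims  = 1;  size.d[0]  = 1;
    Dims stride; stride.nbDims = 1; stride.d[0] = 1;
    ISliceLayer* tSlice = network->addSlice(*inShape->getOutput(0), start, size, stride);  // -> {1}: [T]
    IShuffleLayer* tripCount = network->addShuffle(*tSlice->getOutput(0));
    Dims scalar; scalar.nbDims = 0;
    tripCount->setReshapeDimensions(scalar);                                 // -> 0-D scalar T

    ILoop* loop = network->addLoop();
    loop->addTripLimit(*tripCount->getOutput(0), TripLimit::kCOUNT);

    // One time step of input per iteration: iterate over axis 1 of {B, T, C} -> {B, C}.
    IIteratorLayer* xt = loop->addIterator(*seqData, /*axis=*/1, /*reverse=*/false);
    // Recurrent hidden state, seeded from the hidden_init input ({B, H}).
    IRecurrenceLayer* hRec = loop->addRecurrence(*hiddenInit);

    // Simplified cell: h_t = tanh(x_t * W_ih + h_{t-1} * W_hh + b).
    // A real LSTM computes four gates plus a second recurrence for the cell state.
    auto xW   = network->addMatrixMultiply(*xt->getOutput(0),   MatrixOperation::kNONE,
                                           *wIh->getOutput(0),  MatrixOperation::kNONE);
    auto hW   = network->addMatrixMultiply(*hRec->getOutput(0), MatrixOperation::kNONE,
                                           *wHh->getOutput(0),  MatrixOperation::kNONE);
    auto acc  = network->addElementWise(*xW->getOutput(0), *hW->getOutput(0), ElementWiseOperation::kSUM);
    auto accB = network->addElementWise(*acc->getOutput(0), *bias->getOutput(0), ElementWiseOperation::kSUM);
    auto hNext = network->addActivation(*accB->getOutput(0), ActivationType::kTANH);
    hRec->setInput(1, *hNext->getOutput(0));                                 // feed back into the recurrence

    // Collect the per-step hidden states into a {B, T, H} tensor.
    ILoopOutputLayer* seqOut = loop->addLoopOutput(*hNext->getOutput(0), LoopOutput::kCONCATENATE, /*axis=*/1);
    seqOut->setInput(1, *tripCount->getOutput(0));                           // length of the concatenation

With this approach no maxSeqLen is baked into the engine; the optimization profile only has to cover the dynamic dimensions of the actual network inputs. A bidirectional layer would run a second loop over the same input with reverse = true and concatenate the two outputs along the feature axis.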

Thanks

Thanks for your reply. I have successfully converted my CRNN model from ONNX to TensorRT 7.0 with dynamic shapes, but I found it is slower than the TRT 5.0 engine built from the original PyTorch parameters. So I want to build the CRNN model directly from the original PyTorch parameters. Is there a way to do this with dynamic shapes when there are several BiLSTM layers in the model?

On the other hand, can you help me analyze the speed problem? In my opinion, the TRT 7.0 engine should be faster than the TRT 5.0 one. I noticed that the graph of the ONNX model differs from the original PyTorch model: a BiLSTM layer is converted into Gather, Unsqueeze, Concat, Slice and other operations. Are these operations more expensive than the original BiLSTM forward pass?

By the way, I found that TensorRT 7.0 needs more GPU memory than TRT 5.0 when building engines.

Hi,

It seems that RNNv2 still does not support wild card dimensions or explicit batch networks in TRT 7 as per the documentation.

https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_network_definition.html#a6cd3869f7406f73261857987be1b18a9

This is becoming a challenge, primarily when we construct networks that combine convolution and RNN layers with dynamic shapes (height and width for the convolutions, sequence length for the RNN). What is the timeframe for a uniform interface for dynamic shapes?