TensorRT 7 inference time with dynamic shapes is longer in C++ than in Python

Description

With dynamic input shapes, inference through the C++ API is much slower than through the Python API, even though both use the same engine and produce correct results.

Environment

TensorRT Version: 7
GPU Type: 1080ti
Nvidia Driver Version: 410.78
CUDA Version: 10.0
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 16.04


  • ONNX IR version: 0.0.6
    Opset version: 11
    Producer name: NVIDIA TensorRT sample
    Producer version:
    Domain:
    Model version: 0
    Doc string:

Steps To Reproduce

When running inference with dynamic shapes, the inference time through the C++ API is much longer than through the Python API, although the results from both are correct.
The same engine is used in the C++ code and the Python code.
An optimization profile is added, and the binding dimensions are set before inference:

    auto profile = builder->createOptimizationProfile();
    auto input = network->getInput(0);
    Dims dims = input->getDimensions();
    // Only the batch dimension is dynamic; set the min/opt/max batch sizes
    dims.d[0] = 1;
    profile->setDimensions(input->getName(), OptProfileSelector::kMIN, dims);
    dims.d[0] = std::max(m_config.maxBatchSize, 1);
    profile->setDimensions(input->getName(), OptProfileSelector::kOPT, dims);
    dims.d[0] = std::max(m_config.maxBatchSize + 1, 1);
    profile->setDimensions(input->getName(), OptProfileSelector::kMAX, dims);
    // Add the optimization profile to the builder config
    config->addOptimizationProfile(profile);


    Dims4 inputDims{m_config.maxBatchSize, 3, m_config.inputWidth, m_config.inputHeight};
    // Set the concrete input size for this inference
    m_pContext->setBindingDimensions(0, inputDims);
    // All dynamic input dimensions must be specified before inference
    if (!m_pContext->allInputDimensionsSpecified()) {
        return -1;
    }

Hi,

How large is the inference time difference between the C++ and Python interfaces?
Could you please share the model and script file to reproduce the issue so we can help better?

Thanks

The inference time using the C++ interface is 26 ms, and the Python inference time is 8 ms.
Can you give me an email address to send the model and script file to?

Hi,

Can you send the files as a message in the forum?
Meanwhile, you can use the NVIDIA profiler to debug the C++ interface performance.

Thanks

Thank you. I have fixed the bug.
Because setBindingDimensions was called before every inference, each inference ran in a freshly initialized state, which is why the inference time was so long.