Question about using optimization profiles: bindingIndex 0 is not in profile 1

My network has input shape (-1, 3, 112, 112) meaning it supports dynamic batch sizes.
I am registering several optimization profiles as follows. You can assume that m_options.optBatchSizes = std::vector<unsigned int>{2, 4, 8};

    // Specify the default optimization profile.
    IOptimizationProfile* defaultProfile = builder->createOptimizationProfile();
    defaultProfile->setDimensions(inputName, OptProfileSelector::kMIN, Dims4(1, m_inputC, m_inputH, m_inputW));
    defaultProfile->setDimensions(inputName, OptProfileSelector::kOPT, Dims4(1, m_inputC, m_inputH, m_inputW));
    defaultProfile->setDimensions(inputName, OptProfileSelector::kMAX, Dims4(m_options.maxBatchSize, m_inputC, m_inputH, m_inputW));
    config->addOptimizationProfile(defaultProfile);

    // Specify all the optimization profiles.
    for (const auto& optBatchSize: m_options.optBatchSizes) {
        if (optBatchSize == 1) {
            continue;
        }

        if (optBatchSize > m_options.maxBatchSize) {
            throw std::runtime_error("optBatchSize cannot be greater than maxBatchSize!");
        }

        IOptimizationProfile* profile = builder->createOptimizationProfile();
        profile->setDimensions(inputName, OptProfileSelector::kMIN, Dims4(1, m_inputC, m_inputH, m_inputW));
        profile->setDimensions(inputName, OptProfileSelector::kOPT, Dims4(optBatchSize, m_inputC, m_inputH, m_inputW));
        profile->setDimensions(inputName, OptProfileSelector::kMAX, Dims4(m_options.maxBatchSize, m_inputC, m_inputH, m_inputW));
        config->addOptimizationProfile(profile);
    }

Later in my code when I am running inference, I am trying to switch the optimization profile based on the batch size. If the batch size matches one of the registered profiles, then I want to switch the profile. You can assume that m_optProfIndx is a std::unordered_map<int, int> mapping the batch size to the registered profileIndex.

        // Determine if the batch size is in our optimization profile
        auto it = m_optProfIndx.find(batchSize);
        if (it != m_optProfIndx.end()) {
            // Switch the optimization profile
            m_context->setOptimizationProfileAsync(it->second, m_cudaStream);
        }
        m_context->setBindingDimensions(0, inputDims);

When I run my code, it crashes with the following error message:

IExecutionContext::setBindingDimensions: bindingIndex 0 is not in profile 1. Using bindingIndex = 2 instead.
1: [convolutionRunner.cpp::executeConv::458] Error Code 1: Cudnn (CUDNN_STATUS_EXECUTION_FAILED)
terminate called after throwing an instance of 'std::runtime_error'

It looks like it is crashing on the call to m_context->setBindingDimensions(0, inputDims).
For reference, inputDims is defined as follows: Dims4 inputDims = {static_cast<int32_t>(inputFaceChips.size()), m_inputC, m_inputH, m_inputW}, where inputFaceChips.size() is the current batch size.

Hi,
Please check the link below, as it might answer your concerns:
https://docs.nvidia.com/deeplearning/tensorrt/api/index.html
Thanks!

Hi there, I’ve already checked the docs many times (any time I open an issue I simply get referred to this same link, which honestly isn’t very helpful). I’d appreciate it if you’d spend 5 minutes to actually review the code and advise on any issues. Thank you in advance.

Any updates??

@NVES do you know how to solve the problem??
Once again, please don’t just refer me to irrelevant docs pages.
I have other colleagues who have experienced this same problem, so I’m sure it would be helpful to many people if you took the time to describe how to do this properly, as none of your sample code goes over switching optimization profiles.

Hi @cyruspk4w6 ,
Apologies for the delay,
Can you please share the model and script with us so that we can debug it further?

Thanks!

Here is the header file: https://github.com/cyrusbehr/tensorrt-cpp-api/blob/opt_profiles/src/engine.h
and the implementation file: https://github.com/cyrusbehr/tensorrt-cpp-api/blob/opt_profiles/src/engine.cpp

Here is the most relevant part:

Creation of optimization profiles when building the network (https://github.com/cyrusbehr/tensorrt-cpp-api/blob/cc52324490bdf994bcc068f2dc1d2a2df6b5da5c/src/engine.cpp#L93-L115)

// Specify the default optimization profile.
IOptimizationProfile* defaultProfile = builder->createOptimizationProfile();
defaultProfile->setDimensions(inputName, OptProfileSelector::kMIN, Dims4(1, inputC, inputH, inputW));
defaultProfile->setDimensions(inputName, OptProfileSelector::kOPT, Dims4(1, inputC, inputH, inputW));
defaultProfile->setDimensions(inputName, OptProfileSelector::kMAX, Dims4(m_options.maxBatchSize, inputC, inputH, inputW));
config->addOptimizationProfile(defaultProfile);

// Specify all the optimization profiles.
for (const auto& optBatchSize: m_options.optBatchSizes) {
    if (optBatchSize == 1) {
        continue;
    }

    if (optBatchSize > m_options.maxBatchSize) {
        throw std::runtime_error("optBatchSize cannot be greater than maxBatchSize!");
    }

    IOptimizationProfile* profile = builder->createOptimizationProfile();
    profile->setDimensions(inputName, OptProfileSelector::kMIN, Dims4(1, inputC, inputH, inputW));
    profile->setDimensions(inputName, OptProfileSelector::kOPT, Dims4(optBatchSize, inputC, inputH, inputW));
    profile->setDimensions(inputName, OptProfileSelector::kMAX, Dims4(m_options.maxBatchSize, inputC, inputH, inputW));
    config->addOptimizationProfile(profile);
}

Trying to load the specific optimization profile (https://github.com/cyrusbehr/tensorrt-cpp-api/blob/cc52324490bdf994bcc068f2dc1d2a2df6b5da5c/src/engine.cpp#L209-L234):

if (m_prevBatchSize != inputFaceChips.size()) {

    m_inputBuff.hostBuffer.resize(inputDims);
    m_inputBuff.deviceBuffer.resize(inputDims);

    Dims2 outputDims {batchSize, outputL};
    m_outputBuff.hostBuffer.resize(outputDims);
    m_outputBuff.deviceBuffer.resize(outputDims);

    m_prevBatchSize = batchSize;

    // Determine if the batch size is in our optimization profile
    auto it = m_optProfIdx.find(batchSize);
    if (it != m_optProfIdx.end()) {
        // Switch the optimization profile
        m_context->setOptimizationProfileAsync(it->second, m_cudaStream);
        m_profileIdx = it->second;
    }
    std::string inputTensorName = "input [profile " + std::to_string(m_profileIdx) + "]";
    if (m_profileIdx == 0) {
        m_context->setBindingDimensions(0, inputDims);
    } else {
        auto bindingIdx = m_engine->getBindingIndex(inputTensorName.c_str());
        m_context->setBindingDimensions(bindingIdx, inputDims);
    }
}

Hi,

Could you please give more details on your environment setup?

Environment

TensorRT Version :
GPU Type :
Nvidia Driver Version :
CUDA Version :
CUDNN Version :
Operating System + Version :
Python Version (if applicable) :
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) :

Hi there, here is the requested information:

Environment

TensorRT Version : TensorRT-8.0.3.4
GPU Type : NVIDIA GeForce RTX 3080 Laptop GPU
Nvidia Driver Version : 495.29.05
CUDA Version : 11.5
Operating System + Version : Ubuntu 20.04.3 LTS

@NVES @AakankshaS @spolisetty
Any updates? This issue has now remained unresolved for over 1 month.
I apologize, but please understand my frustration; I just want to get this resolved, and I don’t imagine it’s a very difficult issue either.

Any updates??

Any updates? This issue has now been unresolved for nearly 2 months, and I have shared all the requested resources.

Hi @cyruspk4w6,

Sorry for the delay in addressing this issue. Our team is looking into it and will get back to you soon.

Thank you.

Hi,

The binding layout looks like:
binding0_profile0, binding1_profile0, binding0_profile1, binding1_profile1, …
and the corresponding binding indices are:
0, 1, 2, 3, …
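
For example, a minimal sketch of switching to a profile and computing the matching input binding index might look like this (it reuses m_engine, m_context, m_profileIdx, m_cudaStream, and inputDims from your code above, and assumes each profile has exactly one input binding followed by one output binding):

// Bindings are grouped per profile, so the group size is the total number of
// bindings divided by the number of optimization profiles.
const int numBindingsPerProfile = m_engine->getNbBindings() / m_engine->getNbOptimizationProfiles();

// Select the profile first, then offset into its binding group.
m_context->setOptimizationProfileAsync(m_profileIdx, m_cudaStream);
const int inputBindingIdx = m_profileIdx * numBindingsPerProfile; // input is the first binding in the group
const int outputBindingIdx = inputBindingIdx + 1;                 // assumed single output; used when binding device buffers
m_context->setBindingDimensions(inputBindingIdx, inputDims);

This is also why you see the original error: setBindingDimensions(0, inputDims) is only valid while profile 0 is selected; once you switch to profile 1, its input binding index is 2 (with two bindings per profile).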

Please refer to the layout above when computing the binding index.

Thank you.