API Usage Error Due to Dynamic Shapes in Output Layers


I have generated an ssd_model.trt engine file from ssd_model.onnx. However, when I load the TensorRT engine and run inference, I encounter the following API usage error at the enqueue stage.

[executionContext.cpp::enqueueInternal::795] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::enqueueInternal::795, condition: bindings || nullBindingOK )

Could someone kindly list the steps required to perform inference on a model with dynamic output shapes? Currently, I compute the binding size for the output buffers using the following function:

size_t getSizeByDim(const nvinfer1::Dims &dims)
{
    size_t size = 1;
    for (int i = 0; i < dims.nbDims; ++i)
        size *= dims.d[i];
    return size;
}

The issue arises because my output layers have dynamic shapes, so the number of boxes (nbox) is reported as -1. When that -1 is multiplied into the size_t accumulator it wraps around to a huge unsigned value, so the binding size used to allocate device memory is garbage.

m_engine->getBindingDimensions(i): 1 3 1200 1200 (input 1)
binding_size 17280000
m_engine->getBindingDimensions(i): 1 -1 4 (output 1)
binding_size 18446744073709551600
m_engine->getBindingDimensions(i): 1 -1 (output 2)
binding_size 18446744073709551612
m_engine->getBindingDimensions(i): 1 -1 (output 3)
binding_size 18446744073709551612

name: bboxes, tensor: float32[1,nbox,4]
name: labels, tensor: int64[1,nbox]
name: scores, tensor: float32[1,nbox]

One workaround I found is to make the binding size static based on the maximum number of detections, but that is not memory efficient.
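For reference, the usual flow for dynamic shapes is sketched below, assuming the TensorRT 8.x bindings API and an `engine`/`context` already created. The key points: fix the input shape on the execution context first, then query the context (not the engine) for resolved dimensions; dimensions that are data-dependent (like nbox from an NMS node) remain -1 even then, so they must be capped at a model-specific maximum (here a hypothetical `maxDetections` parameter):

    // Sketch only; maxDetections is an assumption (e.g. the NMS top-k limit).
    void allocateBindings(nvinfer1::ICudaEngine* engine,
                          nvinfer1::IExecutionContext* context,
                          std::vector<void*>& bindings,
                          int maxDetections) {
        // 1. Fix the input shape on the *context* first.
        context->setBindingDimensions(0, nvinfer1::Dims4{1, 3, 1200, 1200});

        bindings.resize(engine->getNbBindings());
        for (int i = 0; i < engine->getNbBindings(); ++i) {
            // 2. Query the context: shape-dependent dims are now concrete.
            //    Data-dependent dims (nbox) still come back as -1.
            nvinfer1::Dims dims = context->getBindingDimensions(i);
            size_t count = 1;
            for (int d = 0; d < dims.nbDims; ++d)
                count *= (dims.d[d] < 0) ? static_cast<size_t>(maxDetections)
                                         : static_cast<size_t>(dims.d[d]);
            // 3. Allocate at the capped size. (All bindings here are at most
            //    4 bytes per element once TRT maps ONNX int64 to int32.)
            cudaMalloc(&bindings[i], count * sizeof(float));
        }
    }

Allocating at the maximum is unavoidable for data-dependent outputs with this API, since the real nbox is only known after execution.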

Any insights or suggestions on how to address this issue would be greatly appreciated.


TensorRT Version:
GPU Type: NVIDIA RTX A3000 Laptop GPU
Nvidia Driver Version: 528.89
CUDA Version: 11.2
CUDNN Version: 8.3.3
Operating System + Version: Linux WSL2
TensorFlow Version (if applicable): 2.13.1

Hi @v.srinivassunny7,
Could you please share the model and a repro script so we can assist you better?