Working with dynamic shapes: example?

Hi, I am still confused after reading the TensorRT documentation on "Working with Dynamic Shapes", so I tried the following.

(1) Get mobilenetv2-1.0.onnx from the model zoo, which has a static input of [1, 3, 224, 224].

(2) Parse the model and create an engine file, but an error occurred:

[TensorRT] ERROR: data: kOPT dimensions in profile 0 are [1,3,256,256] but input has static dimensions [1,3,224,224].

My question is: how should I create the ONNX model if I want to use dynamic shapes? Should the ONNX model have an input of [1, 3, -1, -1]?

I also tried changing the input shape of the ONNX model directly, but that seemed to go wrong at the global average pooling op, which requires specific spatial dimensions.


import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit
import time
import tensorrt as trt

import sys, os
import common

# You can set the logger severity higher to suppress messages (or lower to display more messages).
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

class ModelData(object):
    MODEL_FILE = "mobilenetv2-1.0.onnx"

def build_engine(model_file):
    # For more information on TRT basics, refer to the introductory samples.
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(2)
        builder.max_batch_size = 1
        #network.add_input('image_batch', trt.float32, (-1, 112, 112, 3))
        # Parse the onnx network
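        with open(model_file, 'rb') as model:
            # Report parser errors instead of building from an empty network.
            if not parser.parse(model.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        config = builder.create_builder_config()
        # When building with a config, the workspace limit is read from the config.
        config.max_workspace_size = common.GiB(2)
        profile = builder.create_optimization_profile()
        profile.set_shape('data', (1, 3, 224, 224), (1, 3, 256, 256), (1, 3, 512, 512))
        # The profile has no effect unless it is attached to the config.
        config.add_optimization_profile(profile)
        return builder.build_engine(network, config)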

# Loads a test case into the provided pagelocked_buffer.
def load_normalized_test_case(data_paths, pagelocked_buffer):
    img = np.random.rand(1, 3, 256, 256).astype(np.float32)
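    # Copy into the existing host buffer (it must hold img.size elements);
    # plain assignment would only rebind the local name.
    np.copyto(pagelocked_buffer, img.ravel())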
    return img

def main():
    with build_engine(ModelData.MODEL_FILE) as engine:
        # Build an engine, allocate buffers and create a stream.
        # For more information on buffer allocation, refer to the introductory samples
        #with open('pfld.trt', 'wb') as f:
        #    f.write(engine.serialize())
        inputs, outputs, bindings, stream = common.allocate_buffers(engine)
        with engine.create_execution_context() as context:
            Ishape = load_normalized_test_case('test.jpeg', pagelocked_buffer=inputs[0].host)
            context.active_optimization_profile = 0
            context.set_binding_shape(0, (1, 3, 256, 256))
            # For more information on performing inference, refer to the introductory samples.
            # The common.do_inference function will return a list of outputs - we only have one in this case.
            for i in range(1):
                t = time.time()
                [keys] = common.do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream, batch_size=1)

if __name__ == '__main__':
    main()


Optimization profiles are meant to be used with dynamic shapes. I believe you can create an optimization profile for a fixed-shape model, but then kMIN, kOPT, and kMAX must all equal the input shape of the model, which is (1, 3, 224, 224) in this case.
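The error message in the question follows from this rule. As a rough illustration in plain Python (this is not the TensorRT API, and `profile_matches_input` is a hypothetical helper), a profile is only consistent with a network input if every static dimension has the same value in kMIN, kOPT, and kMAX, while dimensions marked -1 may range between kMIN and kMAX:

```python
def profile_matches_input(input_dims, min_dims, opt_dims, max_dims):
    """Return True if a (min, opt, max) profile is consistent with input_dims.

    Hypothetical helper for illustration only; -1 marks a dynamic dimension.
    """
    if not (len(input_dims) == len(min_dims) == len(opt_dims) == len(max_dims)):
        return False
    for dim, lo, opt, hi in zip(input_dims, min_dims, opt_dims, max_dims):
        if dim == -1:
            # Dynamic dimension: only the ordering min <= opt <= max matters.
            if not (lo <= opt <= hi):
                return False
        elif not (lo == opt == hi == dim):
            # Static dimension: all three profile values must equal it.
            return False
    return True


# The profile from the script in the question.
profile = ((1, 3, 224, 224), (1, 3, 256, 256), (1, 3, 512, 512))

print(profile_matches_input((1, 3, 224, 224), *profile))  # static model -> False
print(profile_matches_input((1, 3, -1, -1), *profile))    # dynamic H and W -> True
```

The static model fails the check because its height and width are fixed at 224 while kOPT asks for 256, which is what the TensorRT error reports.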

Generally, for correct results you have to export the model to ONNX with dynamic shape support from the original framework.
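For PyTorch, a minimal sketch of such an export could look like the following. The input name `data` is taken from the question; the output name, the file name, the dummy input size, and the opset version are assumptions, and the model here has untrained weights.

```python
def dynamic_axes_for(input_name):
    """Mark height (axis 2) and width (axis 3) of the input as dynamic."""
    return {input_name: {2: "height", 3: "width"}}


def export_mobilenet_v2(onnx_path="mobilenet_v2_dynamic.onnx", input_name="data"):
    # Imported lazily so the helper above can be used without PyTorch installed.
    import torch
    import torchvision

    model = torchvision.models.mobilenet_v2().eval()  # untrained weights
    dummy = torch.randn(1, 3, 224, 224)               # only used to trace the graph
    torch.onnx.export(
        model,
        dummy,
        onnx_path,
        input_names=[input_name],
        output_names=["output"],                      # assumed output name
        dynamic_axes=dynamic_axes_for(input_name),
        opset_version=11,                             # assumed opset
    )
    return onnx_path
```

Calling `export_mobilenet_v2()` should produce a file whose input declares the last two dimensions as symbolic, which is what an optimization profile with different kMIN, kOPT, and kMAX values expects.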

Trying to change a fixed-shape ONNX model to dynamic shape may not always work correctly. For example, if any of the ops or layers has a hard-coded shape or parameter, it will not translate correctly when you manually replace one of the dimensions with -1.
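A small arithmetic sketch of why a hard-coded parameter breaks, assuming a network with a total stride of 32 whose final average pooling was exported with a fixed 7x7 kernel (both values are assumptions for illustration, not read from the model file):

```python
def pooled_size(input_size, total_stride=32, pool_kernel=7):
    """Spatial size after a stride-1, unpadded average pool with a fixed kernel."""
    feature = input_size // total_stride  # spatial size entering the pooling layer
    return feature - pool_kernel + 1


for size in (224, 256, 512):
    print(size, pooled_size(size))
# 224 -> 1, 256 -> 2, 512 -> 10
```

Only the 224 input collapses to the 1x1 map that the following layers were exported to expect; larger inputs leave a 2x2 or 10x10 map, so any later reshape with a hard-coded target shape no longer fits.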

This PyTorch tutorial shows how to export an ONNX model with dynamic shapes. You could probably try replacing torchvision.models.alexnet with torchvision.models.mobilenet_v2 in the tutorial; most other things should be about the same.

I would also recommend using TensorRT 7 for this if possible, to get the most up-to-date ONNX op support (and possibly building the Open Source Components if you are still unable to parse the ONNX model).

Refer to:

  1. Creates a first network with dynamic input dimensions to act as a preprocessor for the model
  2. Parses your model that expects a fixed size input to create a second network
  3. Builds engines for both networks
  4. Runs the first engine to resize the input with dynamic dimensions of the first network to a size that the second network can consume
  5. Uses the output of the first engine as the input of the second engine to run your real model.
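The flow of these steps can be sketched in plain Python. This is not TensorRT code: `resize_nearest` and `fixed_size_model` are hypothetical stand-ins for the first and second engines, and the 4x4 fixed size is chosen only to keep the example small.

```python
FIXED_H, FIXED_W = 4, 4  # hypothetical fixed input size of the second stage


def resize_nearest(image, out_h, out_w):
    """Stand-in for the first engine: nearest-neighbour resize of a 2-D list."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fixed_size_model(image):
    """Stand-in for the second engine: accepts only FIXED_H x FIXED_W input."""
    if len(image) != FIXED_H or any(len(row) != FIXED_W for row in image):
        raise ValueError("second stage expects a fixed-size input")
    return sum(sum(row) for row in image) / (FIXED_H * FIXED_W)


def run_pipeline(image):
    # The output of the first stage is the input of the second stage.
    return fixed_size_model(resize_nearest(image, FIXED_H, FIXED_W))


small = [[1.0] * 2 for _ in range(2)]  # 2x2 input
large = [[1.0] * 8 for _ in range(6)]  # 6x8 input
print(run_pipeline(small), run_pipeline(large))
```

Inputs of any size reach the second stage only after being resampled to the fixed size, so the second stage never sees the original resolution.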

We have 2 questions:

  1. Must the second network have a fixed input?
  2. If the second network must have a fixed input, all data must be resized to that fixed input size by the first engine. Does this affect the precision?

I have the same question, have you solved it?

@lipanpan1226 Hi, the solution was given in post #3 of this thread.
If your ONNX model was created from PyTorch, you can also try this git repo, which converts a PyTorch model to TensorRT directly.