Description
I’m trying to build an engine from an ONNX network whose inputs have dynamic shapes. After creating an optimization profile and specifying the min, opt, and max shapes, the built engine doesn’t seem to incorporate this profile.
Environment
TensorRT Version: 8.4.1.5
GPU Type: V100
Nvidia Driver Version: 460
CUDA Version: 11.3
CUDNN Version:
Operating System + Version: Ubuntu 16.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.11
Baremetal or Container (if container which image + tag):
Relevant Files
Here is the ONNX model: m2m100_418M-init-decoder.onnx (Google Drive link)
Steps To Reproduce
Below is the code to reproduce.
from pathlib import Path

import tensorrt as trt

from translate.quantization.tensorrt import common

_folder = Path.cwd()
saved_models_path = _folder.joinpath("tensorrt_models")

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)


def build_and_save_engine(onnx_file_path):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(common.EXPLICIT_BATCH)
    parser = trt.OnnxParser(network, TRT_LOGGER)
    config = builder.create_builder_config()
    profile = builder.create_optimization_profile()

    print('Beginning ONNX file parsing')
    with open(onnx_file_path, 'rb') as model:
        if not parser.parse(model.read()):
            print("ERROR: Failed to parse the ONNX file.")
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            return None
    print('Completed parsing of ONNX file')

    print("Network inputs:")
    for i in range(network.num_inputs):
        tensor = network.get_input(i)
        print(tensor.name, trt.nptype(tensor.dtype), tensor.shape)

    # allow TensorRT to use up to 16 GB of GPU memory for tactic selection
    config.max_workspace_size = common.GiB(16)
    builder.max_batch_size = 1

    # set the optimization profile for the dynamic input shapes
    min_length, max_length = 1, 200
    opt_length = max_length // 2
    min_batch_size, max_batch_size = 1, 16
    num_beams = 1
    dim = 1024
    profile.set_shape('input_ids',
                      (min_batch_size * num_beams, 1),
                      (min_batch_size * num_beams, 1),
                      (min_batch_size * num_beams, 1))
    profile.set_shape('encoder_attention_mask',
                      (min_batch_size * num_beams, min_length),
                      (min_batch_size * num_beams, opt_length),
                      (min_batch_size * num_beams, max_length))
    profile.set_shape('encoder_hidden_states',
                      (min_batch_size * num_beams, min_length, dim),
                      (min_batch_size * num_beams, opt_length, dim),
                      (min_batch_size * num_beams, max_length, dim))
    config.add_optimization_profile(profile)

    engine = builder.build_engine(network, config)
    context = engine.create_execution_context()
    print(context.get_binding_shape(0))
    print(context.get_binding_shape(1))
    print(context.get_binding_shape(2))


if __name__ == "__main__":
    build_and_save_engine("m2m100_418M-init-decoder.onnx")
I expected it to print the shapes of the 3 inputs as
(1, -1)
(1, -1)
(1, -1, 1024)
since the second dimension is dynamic, but instead it prints
(1, 1)
(1, 1)
(1, 1, 1024)
Can someone please check whether there’s a problem with my implementation or whether it’s indeed a bug? Thank you.
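For reference, this is a small sketch of how I understand the engine-level vs. context-level shape APIs in TensorRT 8.4 are meant to behave (the function name, the sequence length, and the binding indices below are my own assumptions based on my input order, not something from the model):

```python
def inspect_and_set_shapes(engine, context, seq_len=50):
    """Sketch (TensorRT 8.4 Python API, untested here).

    The engine itself should report -1 on the dynamic axes; the execution
    context only reports concrete shapes once they have been pinned with
    set_binding_shape() before inference.
    """
    import tensorrt as trt  # deferred so the sketch can be read without TRT installed

    # Engine-level view: dynamic dimensions should appear as -1.
    for i in range(engine.num_bindings):
        print(engine.get_binding_name(i), engine.get_binding_shape(i))

    # Context-level view: pin each dynamic input to a concrete shape.
    # Binding indices assumed: 1 = encoder_attention_mask, 2 = encoder_hidden_states.
    context.set_binding_shape(1, (1, seq_len))
    context.set_binding_shape(2, (1, seq_len, 1024))

    # After pinning, the context should return the concrete shapes.
    for i in range(engine.num_bindings):
        print(context.get_binding_shape(i))
```

If that understanding is right, my repro above may be printing whatever the context defaults to rather than the engine’s dynamic (-1) shapes, which is part of what I’d like confirmed.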