ONNX to TensorRT with dynamic batch size in Python

Description

Trying to convert the yolov3-tiny-416 model to TensorRT with a dynamic batch size, with code modified from https://github.com/jkjung-avt/tensorrt_demos/tree/master/yolo

The resulting engine is always None. Code snippets below.

Environment

Using the docker container nvcr.io/nvidia/tensorrt:20.08-py3
TensorRT Version: 7.1.3.4
GPU Type: Titan X
Nvidia Driver Version: 450.51.06
CUDA Version: 11.0
CUDNN Version:
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
ONNX Version: 1.4.1
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:20.08-py3

Relevant Files

Modified the build_engine function in https://github.com/jkjung-avt/tensorrt_demos/blob/master/yolo/onnx_to_tensorrt.py as follows:

def build_engine(onnx_file_path, category_num=80, verbose=True):
    """Build a TensorRT engine from an ONNX file."""
    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) # if verbose else trt.Logger()
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(*EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30
        # builder.max_batch_size = 32
        builder.fp16_mode = True
        #builder.strict_type_constraints = True

        config = builder.create_builder_config()

        # Parse model file
        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            if not parser.parse(model.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None

        shape = list(network.get_input(0).shape)
        shape[0] = -1
        network.get_input(0).shape = shape
        print(network.get_input(0).shape)

        print('Adding yolo_layer plugins...')
        model_name = onnx_file_path[:-5]
        network = add_yolo_plugins(
            network, model_name, category_num, TRT_LOGGER)

        profile = builder.create_optimization_profile()
        profile.set_shape(network.get_input(0).name, (1, 3, 416, 416), (16, 3, 416, 416), (32, 3, 416, 416))
        config.add_optimization_profile(profile)

        print('Building an engine.  This would take a while...')
        print('(Use "--verbose" to enable verbose logging.)')
        engine = builder.build_engine(network, config)
        print('Completed creating engine.')
        return engine

ONNX model: https://drive.google.com/file/d/1-WJCijVL9_JdEVVLOanzGvl9RmIyJWF_/view?usp=sharing
Model IR version: 4
Opset Version: 9

Steps To Reproduce

  • Run the above build_engine function with the ONNX model in the link. The resulting engine is always None.

I am not sure whether I am missing something simple or whether there is a compatibility issue here. I thought adding the optimization profile would do the trick.

Any help is much appreciated.

Hi @aravind.anantha,
I tried generating the engine using the trtexec command, and it worked fine for me.
Can you try the command below?
trtexec --onnx=yolov3-tiny-416.onnx --verbose --explicitBatch --shapes=000_net:1x3x416x416

Thanks!

Hi @AakankshaS, I saved the engine this way and loaded it back with the Python API to check it.
engine.get_binding_shape(0) returns (-1, 1, 224, 224)

But when I check engine.max_batch_size, it is 1.

I’m not sure if I need to change anything else to make it work.

This is the command I used.

trtexec --onnx=yolov3-tiny-416.onnx --explicitBatch --optShapes=000_net:16x3x416x416 --maxShapes=000_net:32x3x416x416 --minShapes=000_net:1x3x416x416 --shapes=000_net:8x3x416x416 --saveEngine=yolov3-tiny-416.trt
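
For anyone checking the same thing: with an explicit-batch engine, engine.max_batch_size is reported as 1 and the usable batch range lives in the optimization profile instead. Below is a minimal sketch of how one might read that range back with the TensorRT 7 Python API. The function names describe_batch_range and report_engine are hypothetical, and the sketch assumes the engine file name from the command above, a single optimization profile (index 0), and the network input at binding 0.

```python
def describe_batch_range(profile_shapes):
    """Return (min, opt, max) batch sizes from a (min, opt, max) triple of shapes."""
    min_shape, opt_shape, max_shape = profile_shapes
    return int(min_shape[0]), int(opt_shape[0]), int(max_shape[0])


def report_engine(engine_path="yolov3-tiny-416.trt"):
    """Print the binding shape, max_batch_size and profile batch range of an engine."""
    # Imported here so the helper above can be used without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
        # Assumption: profile 0, binding 0 is the network input.
        shapes = engine.get_profile_shape(0, 0)
        batch_range = describe_batch_range(shapes)
        print("binding shape:", engine.get_binding_shape(0))
        print("max_batch_size:", engine.max_batch_size)
        print("profile batch range (min, opt, max):", batch_range)
    return batch_range
```

If the profile was built with the min/opt/max shapes from the command above, the last line printed should show the 1 to 32 range even though max_batch_size stays at 1.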

I realized the difference between execute_async() and execute_async_v2(). The latter ignores engine.max_batch_size and is the one to use with explicit-batch engines and dynamic batch sizes.

I was able to run a Python script with the engine generated by the trtexec command in my comment above, using different batch sizes.
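
The script itself was not posted, so the following is only a rough sketch of what such a run could look like, not the code actually used. It assumes pycuda is installed, the engine has a single float32 input at binding 0 with shape (N, 3, 416, 416), and the profile allows batch sizes 1 to 32. The names batch_fits_profile and infer are hypothetical.

```python
def batch_fits_profile(batch_size, min_batch=1, max_batch=32):
    """Check a batch size against the assumed optimization-profile range."""
    return min_batch <= batch_size <= max_batch


def infer(engine, batch):
    """Run one (N, 3, 416, 416) batch through an explicit-batch engine."""
    # GPU libraries are imported here so the helper above works without them.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda
    import tensorrt as trt

    if not batch_fits_profile(batch.shape[0]):
        raise ValueError("batch size is outside the optimization profile")
    batch = np.ascontiguousarray(batch, dtype=np.float32)

    with engine.create_execution_context() as context:
        # Fix the dynamic dimension for this call; output shapes follow from it.
        context.set_binding_shape(0, batch.shape)
        stream = cuda.Stream()
        bindings, device_buffers, outputs = [], [], []
        for i in range(engine.num_bindings):
            if engine.binding_is_input(i):
                device_mem = cuda.mem_alloc(batch.nbytes)
                cuda.memcpy_htod(device_mem, batch)
            else:
                shape = tuple(context.get_binding_shape(i))
                dtype = trt.nptype(engine.get_binding_dtype(i))
                host_mem = cuda.pagelocked_empty(shape, dtype)
                device_mem = cuda.mem_alloc(host_mem.nbytes)
                outputs.append((host_mem, device_mem))
            device_buffers.append(device_mem)  # keep allocations alive
            bindings.append(int(device_mem))

        # No batch_size argument: shapes come from the execution context.
        context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
        for host_mem, device_mem in outputs:
            cuda.memcpy_dtoh_async(host_mem, device_mem, stream)
        stream.synchronize()
    return [host_mem for host_mem, _ in outputs]
```

Calling infer with arrays of shape (1, 3, 416, 416) and (8, 3, 416, 416) on the same engine should then work without rebuilding it, as long as both sizes fall inside the profile range.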
