Creating a TensorRT Engine with different batch sizes

Description

I am using Python to create a TensorRT engine for ResNet-50 from an ONNX model. The input shape is (-1, 224, 224, 3). Since all the batching samples are in C++ and there are some API differences, I tried to replicate the provided C++ code in Python.

The final code I have is:

EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_builder_config() as config, \
        builder.create_network(EXPLICIT_BATCH) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser:
    
    with open(modelfile, 'rb') as model:
        if not parser.parse(model.read()):
            for error in range(parser.num_errors):
                print(parser.get_error(error))
    dims = network.get_input(0).shape
    ilayer = network.add_input("foo", trt.float32, (3, 224, 224))
    resizeLayer = network.add_resize(ilayer)
    resizeLayer.shape = dims
    network.mark_output(resizeLayer.get_output(0))
    profile = builder.create_optimization_profile()
    profile.set_shape("foo", (1, 3, 224, 224),
                       (5, 3, 224, 224),
                       (50, 3, 224, 224))
    
    config.add_optimization_profile(profile)
    with builder.build_engine(network, config) as engine:
        return engine

It gave me errors:

[TensorRT] ERROR: (Unnamed Layer* 138) [Resize]: mismatch in number of dimensions for outputDims.
[TensorRT] ERROR: (Unnamed Layer* 138) [Resize]: mismatch in number of dimensions for outputDims.
[TensorRT] ERROR: Layer (Unnamed Layer* 138) [Resize] failed validation
[TensorRT] ERROR: Network validation failed.
Traceback (most recent call last):
  File "convert.py", line 62, in <module>
    main(sys.argv[1:])
  File "convert.py", line 57, in main
    engine = build_engine(inputfile)
  File "convert.py", line 32, in build_engine
    with builder.build_engine(network, config) as engine:
AttributeError: __enter__
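For reference: builder.build_engine returns None when network validation fails, which is why the with statement then raises AttributeError: __enter__. A minimal sketch of a build that sets the optimization profile directly on the parsed input, with no extra add_input/add_resize layers (the input name and NHWC shapes are assumptions based on the model above):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

def build_engine(modelfile):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(EXPLICIT_BATCH)
    config = builder.create_builder_config()
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(modelfile, "rb") as model:
        if not parser.parse(model.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None
    # Attach the dynamic-shape profile directly to the parsed input;
    # no extra add_input/add_resize layers are needed.
    input_name = network.get_input(0).name  # e.g. "input_1:0"
    profile = builder.create_optimization_profile()
    profile.set_shape(input_name,
                      (1, 224, 224, 3),    # min
                      (8, 224, 224, 3),    # opt
                      (32, 224, 224, 3))   # max
    config.add_optimization_profile(profile)
    engine = builder.build_engine(network, config)  # returns None on failure
    if engine is None:
        raise RuntimeError("Engine build failed; see logger output")
    return engine
```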

Environment

TensorRT Version: 7.0.0.11
GPU Type: Nvidia 108
Nvidia Driver Version: 440.64.00
CUDA Version: 10.2
CUDNN Version: V10.2.89
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @whanafy,
Can you please share your ONNX model so that I can try reproducing the issue at my end?

Thanks!

This is the ONNX model used.

Hi @whanafy,
I tried running your model with input shape **unk__612,224,224,3**, as shown in Netron, and I was able to run it successfully!
Can you please try running the below command?
trtexec --onnx=ResNet50-d.onnx --verbose --explicitBatch --shapes=input_1:0:1612x224x224x3 --workspace=3000
Thanks!

My doubt was not about the model; it was about the conversion code. I think I will use trtexec for the conversion. However, I have a question: doesn’t the conversion you provided mean that I cannot change the batch size dynamically?

Also when running the command it shows me this message:

Dynamic dimensions required for input: input_1:0, but no shapes were provided. Automatically overriding shape to: 1x224x224x3

Doesn’t this mean that the model is no longer dynamic? I tried to run it, but it only works with a batch of 1.

Hi @whanafy,
You can change the batch size dynamically. Please refer to the link below.
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#work_dynamic_shapes

When you do not provide any input shapes at all, the default shape is picked, which is 1x224x224x3.
Thanks!
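With an engine built from such a profile, the batch size is chosen per inference call rather than fixed at build time. A hedged sketch of the runtime side (TensorRT 7 Python API; binding index 0 and the NHWC shape are assumptions based on this model):

```python
import tensorrt as trt

# `engine` is assumed to be an ICudaEngine built with a dynamic-batch
# optimization profile, e.g. deserialized from ResNet50.engine.
context = engine.create_execution_context()
context.active_optimization_profile = 0          # the profile added at build time
context.set_binding_shape(0, (8, 224, 224, 3))   # pick the batch size at run time
assert context.all_binding_shapes_specified
# Allocate device buffers sized for (8, 224, 224, 3) and call
# context.execute_v2(bindings) as in the standard samples.
```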

I have tried running with dynamic batches:

trtexec --explicitBatch --onnx=ResNet50.onnx --minShapes=input:1x224x224x3 --maxShapes=input:32x224x224x3 --optShapes=input:8x224x224x3 --shapes=input:32x3x224x224 --workspace=500 --saveEngine=ResNet50.engine

It gives the same error and only works with a batch of one (which, the message says, it will do for both the test shapes and the network itself).

Also, for completeness:
I am using the following to create the ONNX model. First, download it from Keras:

import tensorflow as tf
import os

model_name = "ResNet50"
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights="imagenet")

model.summary()
os.mkdir(model_name)
os.mkdir(model_name + "/saved_model")
model.save(model_name + "/saved_model")

Then use :

python -m tf2onnx.convert --saved-model ResNet50/saved_model --output ResNet50/ResNet50.onnx

to convert it to ONNX.

Hi @whanafy
Please replace the input name in your command with the actual input name from your model (input_1:0 rather than input).

Please try the below command
trtexec --onnx=ResNet50-d.onnx --verbose --explicitBatch --minShapes=input_1:0:1x224x224x3 --maxShapes=input_1:0:32x224x224x3 --optShapes=input_1:0:8x224x224x3 --shapes=input_1:0:32x224x224x3

Thanks!

It gave the same warning message:

[07/30/2020-04:49:29] [W] Dynamic dimensions required for input: input_1:0, but no shapes were provided. Automatically overriding shape to: 1x224x224x3

and it only works with batch of one.

Hi @whanafy,
Can you check whether you have shared the same model?
Also, could you share the --verbose logs? I am not able to reproduce the issue; it works for other batch sizes as well.
Thanks!

Yes, I have re-downloaded it to make sure, and it shows the same issue.

The logs are attached here.

The errors are shown on lines 1247 and 5516.