Creating a TensorRT Engine with different batch sizes

Description

I am using Python to create a TensorRT engine for ResNet-50 from an ONNX model. The input shape is (-1, 224, 224, 3). Because all the batching samples are in C++ and there are some API differences, I tried to replicate the provided C++ code in Python.

The final code I have is:

EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_builder_config() as config, \
        builder.create_network(EXPLICIT_BATCH) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser:
    
    with open(modelfile, 'rb') as model:
        if not parser.parse(model.read()):
            for error in range(parser.num_errors):
                print(parser.get_error(error))
    dims = network.get_input(0).shape
    ilayer = network.add_input("foo", trt.float32, (3, 224, 224))
    resizeLayer = network.add_resize(ilayer)
    resizeLayer.shape = dims
    network.mark_output(resizeLayer.get_output(0))
    profile = builder.create_optimization_profile()
    profile.set_shape("foo", (1, 3, 224, 224),
                       (5, 3, 224, 224),
                       (50, 3, 224, 224))
    
    config.add_optimization_profile(profile)
    with builder.build_engine(network, config) as engine:
        return engine

It gave me errors:

[TensorRT] ERROR: (Unnamed Layer* 138) [Resize]: mismatch in number of dimensions for outputDims.
[TensorRT] ERROR: (Unnamed Layer* 138) [Resize]: mismatch in number of dimensions for outputDims.
[TensorRT] ERROR: Layer (Unnamed Layer* 138) [Resize] failed validation
[TensorRT] ERROR: Network validation failed.
Traceback (most recent call last):
  File "convert.py", line 62, in <module>
    main(sys.argv[1:])
  File "convert.py", line 57, in main
    engine = build_engine(inputfile)
  File "convert.py", line 32, in build_engine
    with builder.build_engine(network, config) as engine:
AttributeError: __enter__
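
For readers hitting the same errors, two things appear to be going on, judging from the log alone. The Resize error says that the layer's output shape (the 4-D `dims` taken from the ONNX input) does not have the same number of dimensions as its input (the 3-D "foo" tensor). The `AttributeError: __enter__` looks like a follow-on: in TensorRT 7, `build_engine` returns `None` when the build fails, and `None` cannot be used in a `with` statement. Below is a minimal sketch of an alternative, assuming the ONNX input is already dynamic in its batch dimension so that no extra input or Resize layer is needed. `profile_shapes` is a hypothetical helper, the workspace size is an arbitrary assumed value, and the TensorRT calls are the 7.x Python API used in this thread.

```python
def profile_shapes(input_shape, min_batch=1, opt_batch=5, max_batch=50):
    """Return (min, opt, max) shapes with the dynamic batch dimension filled in.

    Hypothetical helper: assumes only dimension 0 is dynamic (-1).
    """
    dims = tuple(int(d) for d in input_shape)
    fixed = dims[1:]
    if any(d < 0 for d in fixed):
        raise ValueError("only the batch dimension may be dynamic")
    return tuple((b,) + fixed for b in (min_batch, opt_batch, max_batch))


def build_engine(modelfile):
    # Imported here so the helper above can be used without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(logger) as builder, \
            builder.create_builder_config() as config, \
            builder.create_network(explicit_batch) as network, \
            trt.OnnxParser(network, logger) as parser:
        with open(modelfile, "rb") as model:
            if not parser.parse(model.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("failed to parse " + modelfile)

        config.max_workspace_size = 1 << 30  # assumed value (1 GiB)

        # Attach the profile to the input that the ONNX parser created,
        # instead of adding a second input and a Resize layer.
        inp = network.get_input(0)
        profile = builder.create_optimization_profile()
        profile.set_shape(inp.name, *profile_shapes(inp.shape))
        config.add_optimization_profile(profile)

        engine = builder.build_engine(network, config)
        if engine is None:  # the build failed; see the logger output
            raise RuntimeError("engine build failed")
        return engine
```

Checking for `None` before using the engine turns the confusing `AttributeError` into an explicit build failure, so the TensorRT log lines above it are easier to spot as the real cause.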

Environment

TensorRT Version: 7.0.0.11
GPU Type: Nvidia 108
Nvidia Driver Version: 440.64.00
CUDA Version: 10.2
CUDNN Version: V10.2.89

Hi @whanafy,
Can you please share your ONNX model so that I can try reproducing the issue at my end?

Thanks!

This is the ONNX model I used.

Hi @whanafy,
I tried running your model with input shape unk__612,224,224,3, as shown in Netron, and I was able to run it successfully.
Can you please try running the command below?
trtexec --onnx=ResNet50-d.onnx --verbose --explicitBatch --shapes=input_1:0:1612x224x224x3 --workspace=3000
Thanks!

My doubt was not about the model; it was about the conversion code. I think I will use trtexec for the conversion. However, I have a question: doesn’t the command you provided mean that I cannot change the batch size dynamically?

Also when running the command it shows me this message:

Dynamic dimensions required for input: input_1:0, but no shapes were provided. Automatically overriding shape to: 1x224x224x3

Doesn’t this mean that the model is no longer dynamic? I tried to run it, but it only works with a batch of 1.

Hi @whanafy,
You can change the batch size dynamically. Please refer to the link below.
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#work_dynamic_shapes

When you do not provide any input shapes at all, the default shape is picked, which is 1x224x224x3.
Thanks!
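
To make the point about dynamic batch sizes concrete: with an explicit-batch engine, the batch size is chosen at run time by setting the input binding's shape on the execution context before each inference. A rough sketch follows, assuming the TensorRT 7.x Python API (`engine.get_profile_shape` and `context.set_binding_shape`), a single optimization profile at index 0, and the input at binding 0. `set_batch` is a hypothetical helper name.

```python
def set_batch(engine, context, batch, binding=0, profile=0):
    """Select a batch size on an execution context for a dynamic-shape engine."""
    # Shapes the engine was built for: (min, opt, max) for this binding.
    min_shape, _opt_shape, max_shape = engine.get_profile_shape(profile, binding)
    lo, hi = int(min_shape[0]), int(max_shape[0])
    if not lo <= batch <= hi:
        raise ValueError(
            "batch %d is outside the profile range [%d, %d]" % (batch, lo, hi))

    # Keep the non-batch dimensions and replace only dimension 0.
    shape = (batch,) + tuple(int(d) for d in tuple(max_shape)[1:])
    if not context.set_binding_shape(binding, shape):
        raise RuntimeError("TensorRT rejected shape %r" % (shape,))
    return shape
```

After this call, the buffers passed to the execution call need to be sized for the returned shape. An engine whose profile has min = opt = max = 1x224x224x3 would reject any other batch here, which would match an engine that only works with a batch of 1.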

I have tried running with dynamic batches:

trtexec --explicitBatch --onnx=ResNet50.onnx --minShapes=input:1x224x224x3 --maxShapes=input:32x224x224x3 --optShapes=input:8x224x224x3 --shapes=input:32x3x224x224 --worskpace=500 --saveEngine=ResNet50.engine

It gives the same message and only works with a batch of one (which, according to the message, applies both to the shapes used for the test run and to the network itself).

Also, for completeness:
I am using the following to create the ONNX model. First, download the model from Keras:

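import os

import tensorflow as tf

model_name = "ResNet50"
# Download ResNet50 with ImageNet weights, including the classification head.
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights="imagenet")

model.summary()

# Save relative to the working directory so the path matches the
# tf2onnx command (ResNet50/saved_model).
save_dir = os.path.join(model_name, "saved_model")
os.makedirs(save_dir, exist_ok=True)
model.save(save_dir)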

Then use:

python -m tf2onnx.convert --saved-model ResNet50/saved_model --output ResNet50/ResNet50.onnx

to convert it to ONNX.

Hi @whanafy
Please replace the input name in your command so that it matches the name given in your code.

Please try the command below:
trtexec --onnx=ResNet50-d.onnx --verbose --explicitBatch --minShapes=input_1:0:1x224x224x3 --maxShapes=input_1:0:32x224x224x3 --optShapes=input_1:0:8x224x224x3 --shapes=input_1:0:32x224x224x3

Thanks!
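
Since the earlier command mixed layouts (`32x3x224x224` next to `1x224x224x3`) and used a different input name, it may help to generate all the shape flags from one place. This is only a small sketch: `shape_flags` is a hypothetical helper, it assumes a single NHWC input with a dynamic batch dimension, and it emits only flags already used in this thread.

```python
def shape_flags(input_name, hwc=(224, 224, 3),
                min_batch=1, opt_batch=8, max_batch=32):
    """Build trtexec shape flags for one NHWC input with a dynamic batch."""
    def spec(batch):
        dims = "x".join(str(d) for d in (batch,) + tuple(hwc))
        return "%s:%s" % (input_name, dims)

    return [
        "--minShapes=" + spec(min_batch),
        "--optShapes=" + spec(opt_batch),
        "--maxShapes=" + spec(max_batch),
        # Shape used for the timing run; it has to lie inside [min, max].
        "--shapes=" + spec(max_batch),
    ]


flags = shape_flags("input_1:0")
print(" ".join(["trtexec", "--onnx=ResNet50-d.onnx", "--explicitBatch"] + flags))
```

Deriving every flag from the same name and the same height/width/channel tuple keeps the four arguments consistent with each other.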

It gave the same warning message:

[07/30/2020-04:49:29] [W] Dynamic dimensions required for input: input_1:0, but no shapes were provided. Automatically overriding shape to: 1x224x224x3

and it only works with a batch of one.
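
One thing worth checking, although nothing in this thread confirms it: the tensor name `input_1:0` itself contains a colon, which is also the separator in a `name:shape` argument. A parser that splits at the first colon would see a tensor called `input_1` and never match the real input, which would be consistent with the "no shapes were provided" warning. The sketch below only illustrates the ambiguity in plain Python; it makes no claim about how any particular trtexec version parses its arguments.

```python
def split_first(spec):
    """Split 'name:shape' at the first colon."""
    name, _, shape = spec.partition(":")
    return name, shape


def split_last(spec):
    """Split 'name:shape' at the last colon."""
    name, _, shape = spec.rpartition(":")
    return name, shape


spec = "input_1:0:32x224x224x3"
print(split_first(spec))  # ('input_1', '0:32x224x224x3')
print(split_last(spec))   # ('input_1:0', '32x224x224x3')
```

If this turns out to be the cause, re-exporting the model with an input name that has no colon would be one way to test the idea.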

Hi @whanafy,
Can you check whether you have shared the same model?
Could you also share the --verbose logs? I am not able to reproduce the issue, and it works for other batch sizes as well on my side.
Thanks!

Yes, I re-downloaded it to make sure it shows the same issue.

The logs are attached here.

The error is shown on lines 1247 and 5516.

Hi @whanafy,
The logs say that you were able to convert your model successfully.
The warnings you are getting are related to the input shapes.
You can cross-check the input shapes in Netron.
However, I tried running your command, and it worked fine without the warnings.
Please try the same on the latest TensorRT release.
Thanks!