Trtexec ignores inputIOFormat with onnx model

shshao · August 10, 2023, 12:38am

Description

I have a channels-last TF model that I convert to ONNX and then to TensorRT. When invoking trtexec, even if I set --inputIOFormats=fp32:hwc, the input is still handled as channels-first, and a pair of transposes (channels-last to channels-first, then channels-first back to channels-last) is added. How can I get rid of these transposes to get better performance?

Environment

TensorRT Version: 8.5.1
GPU Type: RTX4000
Nvidia Driver Version: 525
CUDA Version: 11.8
CUDNN Version: 8.6
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable): 2.12
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Steps To Reproduce

Run this Python file

import os

import tensorflow as tf
import numpy as np

SAVED_MODEL_DIR = "/tmp/resnet50"
ONNX_MODEL_PATH = SAVED_MODEL_DIR + ".onnx"

class ResNet50(tf.Module):
    def __init__(self):
        super().__init__()
        self.model = tf.keras.applications.resnet50.ResNet50(
            weights="imagenet",
            include_top=True
        )

    @tf.function
    def forward(self, inputs):
        return self.model(inputs, training=False)

resnet = ResNet50()
input_batch = np.float32(np.random.rand(1, 224, 224, 3))
print("tf result", resnet.forward(input_batch)[0, 0])

# Save saved model.
tensor_specs = [tf.TensorSpec((1, 224, 224, 3), tf.float32)]
call_signature = resnet.forward.get_concrete_function(*tensor_specs)

os.makedirs(SAVED_MODEL_DIR, exist_ok=True)
print(f"Saving {SAVED_MODEL_DIR} with call signature: {call_signature}")
tf.saved_model.save(resnet, SAVED_MODEL_DIR,
                    signatures={"serving_default": call_signature})

# convert to onnx
assert os.system(
    f"python -m tf2onnx.convert --saved-model {SAVED_MODEL_DIR} --output {ONNX_MODEL_PATH}") == 0

# convert to trt
assert os.system(f"trtexec --onnx={ONNX_MODEL_PATH} --verbose --inputIOFormats=fp32:hwc") == 0

From the log, we can see that a Transpose is added:

StatefulPartitionedCall/resnet50/conv1_conv/Conv2D__6 [Transpose] inputs: [inputs -> (1, 224, 224, 3)[FLOAT]],

Remove --inputIOFormats=fp32:hwc and rerun, and you get exactly the same engine, which means the flag doesn’t take effect.
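
To check for this pattern without reading the whole verbose log by hand, a small script can scan it for Transpose layers that consume a network input directly. The following is a minimal sketch using only the Python standard library. It assumes the log lines have the same shape as the one quoted above (`<layer name> [<OpType>] inputs: [<tensor> -> (<dims>)[<DTYPE>]],`); the helper name `find_input_transposes` is hypothetical, and the pattern may need adjusting for other TensorRT versions.

```python
import re

# Assumed line shape, taken from the verbose log line quoted in this post:
#   <layer name> [<OpType>] inputs: [<tensor> -> (<dims>)[<DTYPE>]],
LAYER_LINE = re.compile(
    r"(?P<name>\S+) \[(?P<op>\w+)\] inputs: "
    r"\[(?P<tensor>[^\s\]]+) -> \((?P<dims>[^)]*)\)"
)


def find_input_transposes(log_text, input_names):
    """Return (layer_name, tensor, dims) for each Transpose fed by a network input.

    `find_input_transposes` is a hypothetical helper, not part of TensorRT.
    """
    hits = []
    for line in log_text.splitlines():
        m = LAYER_LINE.search(line)
        if m and m.group("op") == "Transpose" and m.group("tensor") in input_names:
            dims = tuple(int(d) for d in m.group("dims").split(",") if d.strip())
            hits.append((m.group("name"), m.group("tensor"), dims))
    return hits


# Sample text copied from the log lines in this thread.
sample = (
    "[08/14/2023-13:22:27] [V] [TRT] Searching for input: inputs\n"
    "[08/14/2023-13:22:27] [V] [TRT] "
    "StatefulPartitionedCall/resnet50/conv1_conv/Conv2D__6 "
    "[Transpose] inputs: [inputs -> (1, 224, 224, 3)[FLOAT]],\n"
)

for name, tensor, dims in find_input_transposes(sample, {"inputs"}):
    print(f"{name}: Transpose on input '{tensor}' with shape {dims}")
```

Running the same check on the logs produced with and without --inputIOFormats=fp32:hwc gives a quick way to confirm that the Transpose on `inputs` appears in both.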

AakankshaS · August 10, 2023, 10:37am

Hi,
Request you to share the ONNX model and the script if not shared already so that we can assist you better.
Alongside, you can try a few things:

1) Validate your model with the snippet below.

check_model.py

import onnx

filename = "yourONNXmodel"  # replace with the path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.

In case you are still facing the issue, request you to share the trtexec "--verbose" log for further debugging.
Thanks!

shshao · August 10, 2023, 3:38pm

Thanks! The model can be run with no problem. It just has redundant transpose/reformat.

I have shared both the onnx model and the verbose logs in https://drive.google.com/drive/folders/1lS0N2QuGY2UmC4sXDhZgPnG7jIlYwHAq?usp=drive_link, please take a look.

spolisetty · August 11, 2023, 10:34am

Hi,

Could you please try on the latest TensorRT version 8.6.1 and let us know if you still face the same issue.

Thank you.

shshao · August 14, 2023, 8:25pm

I upgraded to 8.6.1 and retried; I still see the same issue:

[08/14/2023-13:22:27] [V] [TRT] Searching for input: inputs
[08/14/2023-13:22:27] [V] [TRT] StatefulPartitionedCall/resnet50/conv1_conv/Conv2D__6 [Transpose] inputs: [inputs -> (1, 224, 224, 3)[FLOAT]],

shshao · September 18, 2023, 7:43pm

Friendly ping

spolisetty · September 29, 2023, 9:22am

TRT is using the best global schedule it can find, and that may involve introducing transposes. As a result, the assumption that eliminating transposes will improve performance is incorrect.

shshao · September 29, 2023, 5:06pm

But when I visualize the graph, I get this:

We can see that TRT first shuffles NHWC to NCHW (not sure if this is a bug in the visualization), then reformats it back to NHWC4. We should at least fuse the Shuffle and the Reformat.
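
The reason fusing looks possible can be illustrated by composing the two axis permutations. This is a minimal sketch in plain Python, not TensorRT code: the helper names `permute` and `compose` are hypothetical, and it covers only the axis ordering. It ignores the channel padding or vectorization that the `4` in NHWC4 presumably implies, so the real Reformat is not necessarily a pure no-op.

```python
def permute(shape, perm):
    """Apply a transpose permutation to a shape, the way numpy.transpose reorders axes."""
    return tuple(shape[p] for p in perm)


def compose(first, second):
    """Return the single permutation equivalent to applying `first`, then `second`."""
    return tuple(first[p] for p in second)


NHWC_TO_NCHW = (0, 3, 1, 2)  # the Shuffle seen in the graph
NCHW_TO_NHWC = (0, 2, 3, 1)  # axis-order part of the Reformat back to channels-last

shape_nhwc = (1, 224, 224, 3)
shape_nchw = permute(shape_nhwc, NHWC_TO_NCHW)
shape_back = permute(shape_nchw, NCHW_TO_NHWC)
fused = compose(NHWC_TO_NCHW, NCHW_TO_NHWC)

print("after shuffle: ", shape_nchw)
print("after reformat:", shape_back)
print("fused permutation:", fused)
```

The fused permutation is the identity on the axis order, which is why the Shuffle plus Reformat pair looks redundant apart from any channel padding.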

spolisetty · October 17, 2023, 5:20pm

Thank you for reporting it. It is a missing optimization, and we’ll continue to work on it.

shshao · October 17, 2023, 5:41pm

This is nice! Thank you!

solarflarefx · August 26, 2024, 7:49pm

@shshao Were you able to solve your problem?

trild-vietnam · November 19, 2024, 11:21am

Hello, I am facing the same issue.

Is there any update so far?
Thank you so much.
