Description
Hi, I am trying to run the MaxVit model from torchvision with ONNX Runtime and the TensorRT execution provider (I replaced the incompatible einsum operations). It works well with batch size == 1, but with batch size == 32 the outputs are vastly different (max abs error > 3 compared to CPU inference). Is there anything I can do to lower these errors?
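For context, the einsum replacement was of this general form. This is a minimal sketch only; the exact einsum patterns in torchvision's MaxVit implementation may differ, and the shapes below are made up:

import torch

# Hypothetical illustration of the kind of rewrite applied before export;
# the actual einsum patterns and tensor shapes in MaxVit may differ.
q = torch.randn(2, 4, 49, 32)  # (batch, heads, tokens, dim) - assumed shapes
k = torch.randn(2, 4, 49, 32)

attn_einsum = torch.einsum('bhid,bhjd->bhij', q, k)  # original einsum form
attn_matmul = q @ k.transpose(-2, -1)                # einsum-free equivalent

assert torch.allclose(attn_einsum, attn_matmul, atol=1e-5)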
Test code (export code in attachment):
import numpy as np
import onnxruntime

# TensorRT EP with a dynamic-batch optimization profile (min=1, opt=max=32)
trt_provider = [
    ('TensorrtExecutionProvider', {
        'trt_profile_min_shapes': 'input:1x3x224x224',
        'trt_profile_opt_shapes': 'input:32x3x224x224',
        'trt_profile_max_shapes': 'input:32x3x224x224',
        'trt_engine_cache_enable': True,
        'trt_engine_cache_path': '.',
    })
]
cpu_inference = onnxruntime.InferenceSession('maxvit.onnx', providers=['CPUExecutionProvider'])
trt_inference = onnxruntime.InferenceSession('maxvit.onnx', providers=trt_provider)

batch = np.random.randn(32, 3, 224, 224).astype(np.float32)
first_item = batch[0][None]  # first batch item, kept as a batch of 1

batch_result_trt = trt_inference.run(None, {'input': batch})[0]
batch_result_cpu = cpu_inference.run(None, {'input': batch})[0]
first_item_result_trt = trt_inference.run(None, {'input': first_item})[0]
first_item_result_cpu = cpu_inference.run(None, {'input': first_item})[0]

# TRT vs CPU, both run with batch size 1
print(f'Max abs error (Batch size = 1): {np.max(np.abs(first_item_result_trt - first_item_result_cpu))}')
# first item of the batch-32 TRT run vs the batch-1 CPU reference
print(f'Max abs error (TRT Batch size = 32): {np.max(np.abs(batch_result_trt[0][None] - first_item_result_cpu))}')
# whole batch-32 TRT output broadcast against the batch-1 CPU reference
print(f'Max abs error (Batch size = 32): {np.max(np.abs(batch_result_trt - first_item_result_cpu))}')
Output:

Max abs error (Batch size = 1): 2.86102294921875e-06
Max abs error (TRT Batch size = 32): 3.64798903465271
Max abs error (Batch size = 32): 3.64798903465271
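If it helps with the diagnosis, a per-item breakdown can show whether the divergence is uniform across the batch or concentrated in a few items. A quick sketch using the arrays from the script above, assuming the model output is 2-D (N, num_classes):

# Per-item max abs error between the batch-32 TRT run and the batch-32 CPU run
per_item_err = np.abs(batch_result_trt - batch_result_cpu).max(axis=1)
for i, err in enumerate(per_item_err):
    print(f'item {i:2d}: max abs error = {err}')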
Environment
Ubuntu 20.04.6 LTS
Tesla T4 (Azure, NC4as_T4_v3)
Driver: 535.216.03
CUDA: 12.6
cuDNN: 9.6.0.74
Python: 3.12
TensorRT: 10.7.0.23
Relevant Files
export_and_test_script.zip (1.8 KB)
Steps To Reproduce
- run 1_export.py from the uploaded scripts - exports the 'maxvit.onnx' file (with torch==2.5.1+cu124)
- run 2_test.py from the uploaded scripts - runs the exported model and compares outputs (with onnxruntime-gpu==1.20.1)
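As a sanity check I can also try pinning the optimization profile to a single static shape, so that TensorRT builds the engine for batch size 32 only and dynamic-shape tactic selection is ruled out. A minimal sketch, using the same provider options as above with only the min shape changed:

import onnxruntime

# Hypothetical sanity check (not part of the uploaded scripts):
# static batch-32 profile, min == opt == max
static_provider = [
    ('TensorrtExecutionProvider', {
        'trt_profile_min_shapes': 'input:32x3x224x224',
        'trt_profile_opt_shapes': 'input:32x3x224x224',
        'trt_profile_max_shapes': 'input:32x3x224x224',
        'trt_engine_cache_enable': True,
        'trt_engine_cache_path': '.',
    })
]
trt_static = onnxruntime.InferenceSession('maxvit.onnx', providers=static_provider)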