Not able to perform batch inference on onnx model converted using tensorRT_optimization tool

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
1.9.10816
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Hi,

I want to perform batch inference on a segmentation model. I converted the ONNX model using the tensorRT_optimization tool and passed the --batchSize=5 flag, but the tool still uses batchSize=1 (the default).
I also get a warning, as shown below:

(segENV) dummy:~/SegModel$ /usr/local/driveworks-5.6/tools/dnn/tensorRT_optimization --modelType=onnx --onnxFile=/home/dummy/SegModel/segmodel.onnx --batchSize=5 --out=/home/dummy/dnn/src/segmodel_trt.bin --explicitBatch=1
[10-11-2022 12:13:47] DNNGenerator: Initializing TensorRT generation on model /home/dummy/SegModel/segmodel.onnx.
[10-11-2022 12:13:47] onnx2trt_utils.cpp:367: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[10-11-2022 12:13:47] DNNGenerator: Input "input": 3x480x640
[10-11-2022 12:13:47] DNNGenerator: Output "output": 26x480x640
[10-11-2022 12:14:36] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[10-11-2022 12:14:36] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[10-11-2022 12:14:36] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[10-11-2022 12:14:36] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[10-11-2022 12:14:36] DNNValidator: Iteration 0: 6.573056 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 1: 6.563840 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 2: 7.925760 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 3: 6.976512 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 4: 6.606848 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 5: 6.608896 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 6: 6.609920 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 7: 6.607872 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 8: 6.903808 ms.
[10-11-2022 12:14:36] DNNValidator: Iteration 9: 8.797184 ms.
[10-11-2022 12:14:36] DNNValidator: Average over 10 runs is 7.017370 ms.
[10-11-2022 12:14:37] Releasing Driveworks SDK Context

I have also created an ONNX model specifically for batched inference. Its input and output dimensions are shared below. The tensorRT_optimization tool only picks up CHW from the NCHW dimensions present in the ONNX model.

Batched Input:
onnx_input_batched

Batched Output:
onnx_output_batched

Is this a current limitation of the tensorRT_optimization tool? Looking forward to hearing back on this soon.

Dear @priyam1,
Could you share your ONNX model?

Hi @SivaRamaKrishnaNV,

This is my segmentation model:
segmentation.onnx (15.8 MB)

@SivaRamaKrishnaNV Any updates on this issue?

I am not able to set a batch dimension for any model. I even tried YOLOv5.

However, I was able to add a batch dimension when using standalone TensorRT with trtexec.

Dear @priyam1,
For ONNX models, explicitBatch is set by default. When explicitBatch is set, the batchSize parameter has no effect. How about making a dynamic-size model and creating the engine from that?

Please share the trtexec parameters you used for your model for engine generation and inference.
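
For reference, here is a minimal sketch of what a trtexec invocation for a dynamic-size model might look like. It only assembles and prints the command line; it does not run TensorRT. It assumes the ONNX model has been re-exported with a dynamic batch axis on the tensor named "input" (3x480x640, as in the log above). The file names segmodel_dynamic.onnx and segmodel_dynamic.engine are hypothetical, and whether an engine built this way loads in DriveWorks is not established in this thread.

```python
import shlex


def shape_arg(flag, tensor, batch, chw):
    """Format one trtexec shape option, e.g. --optShapes=input:5x3x480x640."""
    dims = "x".join(str(d) for d in (batch, *chw))
    return f"--{flag}={tensor}:{dims}"


def build_trtexec_command(onnx_path, engine_path, tensor="input",
                          chw=(3, 480, 640),
                          min_batch=1, opt_batch=5, max_batch=5):
    """Assemble a trtexec command with a min/opt/max batch optimization profile.

    The tensor name and CHW dimensions default to the values printed by
    DNNGenerator in the log above; the batch range is an assumption.
    """
    if not (1 <= min_batch <= opt_batch <= max_batch):
        raise ValueError("expected 1 <= min_batch <= opt_batch <= max_batch")
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        shape_arg("minShapes", tensor, min_batch, chw),
        shape_arg("optShapes", tensor, opt_batch, chw),
        shape_arg("maxShapes", tensor, max_batch, chw),
    ]


# Hypothetical file names, used for illustration only.
command = build_trtexec_command("segmodel_dynamic.onnx",
                                "segmodel_dynamic.engine")
print(shlex.join(command))
```

The min/opt/max shape options only apply when the model's batch axis is dynamic. For a model exported with a fixed batch dimension, the shapes are taken from the ONNX file itself.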