ONNX batchsize setting and buffer.h assert error

Description

Hi,
I am trying to run ONNX inference with batch size 10. Inference with batch size 1 already works and returns the output result.
I changed the code as follows:

builder->setMaxBatchSize(mParams.batchSize); // mParams.batchSize was set to 10 in initializeSampleParams

And in infer() function:

samplesCommon::BufferManager buffers(mEngine,10);

To build this ONNX model:

const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
auto network = SampleUniquePtr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(explicitBatch));

These two lines were added to the build() function.
After compiling and running, I get this error:

[03/17/2021-16:49:41] [I] [TRT] Detected 1 inputs and 1 output network tensors.
10 is current batch size
(1, 3, 192, 256) (1, 8, 192, 256)
sample_segnet: …/common/buffers.h:250: samplesCommon::BufferManager::BufferManager(std::shared_ptr<nvinfer1::ICudaEngine>, int, const nvinfer1::IExecutionContext*): Assertion `engine->hasImplicitBatchDimension() || mBatchSize == 0' failed.
Aborted (core dumped)
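
Note on the assertion above: it requires that either the engine has an implicit batch dimension or the batch size passed to BufferManager is 0. A network created with the kEXPLICIT_BATCH flag has no implicit batch dimension, so BufferManager(mEngine, 10) trips it. The sketch below is a simplified Python model of that rule, not the actual buffers.h code; binding_volume and its parameters are hypothetical names used only for illustration.

```python
def binding_volume(dims, has_implicit_batch, batch_size=0):
    """Element count of one binding under the rule stated by the assertion.

    Hypothetical helper: a simplified model, not the code in buffers.h.
    """
    # Same condition as in the log: a non-zero batch size is only
    # accepted when the engine has an implicit batch dimension.
    if not (has_implicit_batch or batch_size == 0):
        raise ValueError("explicit-batch engine: pass batch size 0, "
                         "the batch is already part of the dims")

    # Implicit batch: multiply by the batch size. Explicit batch: dims say it all.
    volume = batch_size if batch_size else 1
    for d in dims:
        volume *= d
    return volume


# Explicit batch: the batch (here 1) is the first entry of the dims.
explicit_elems = binding_volume((1, 3, 192, 256), has_implicit_batch=False)

# Implicit batch: the dims exclude the batch, which is passed separately.
implicit_elems = binding_volume((3, 192, 256), has_implicit_batch=True, batch_size=10)
```

Under this reading, an explicit-batch engine wants a batch size of 0 in the BufferManager constructor, and the real batch comes from the binding dimensions, which appear to be the shapes printed in the log just before the assertion.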

So I tried setting explicitBatch to 10 and ran it again, which gives:

&&&& RUNNING TensorRT.sample_onnx_mnist # ./sample_segnet
[03/17/2021-16:47:15] [I] Building and running a GPU inference engine for Onnx MNIST
[03/17/2021-16:47:16] [E] [TRT] Parameter check failed at: …/builder/builder.cpp::createNetworkV2::70, condition: flagSet.toU32() < (1U << EnumMax())
&&&& FAILED TensorRT.sample_onnx_mnist # ./sample_segnet
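
Note on this parameter check: the argument to createNetworkV2 is a bitmask of NetworkDefinitionCreationFlag values, not a batch size, so 10 is rejected as an invalid flag set. A minimal sketch of the check from the log follows; the enum position and the flag count are assumptions about TensorRT 7.x, and the Python names are hypothetical stand-ins.

```python
# Stand-ins for nvinfer1::NetworkDefinitionCreationFlag.
K_EXPLICIT_BATCH = 0   # assumed bit position of kEXPLICIT_BATCH
ENUM_MAX = 2           # assumed number of flags in TensorRT 7.x


def is_valid_flag_set(flags):
    """Mirror of the logged condition: flagSet.toU32() < (1U << EnumMax())."""
    return 0 <= flags < (1 << ENUM_MAX)


# What the sample does: set the single bit that selects explicit-batch mode.
explicit_batch = 1 << K_EXPLICIT_BATCH

print(is_valid_flag_set(explicit_batch))  # the intended flag value
print(is_valid_flag_set(10))              # a batch size passed by mistake
```

The flag only switches the network into explicit-batch mode. The batch size itself then comes from the input tensor's first dimension.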

How can I run inference with a batch size of 2 or more in this case?

Thank you.

Environment

TensorRT Version : 7.1.3
GPU Type : Xavier
Nvidia Driver Version : Package:nvidia-jetpack, Version: 4.5.0
CUDA Version : 10.2.89
CUDNN Version : 8.0.0
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) :
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) :

Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx

filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid

  2. Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
Thanks!

Thanks for reply,

I have run

./trtexec --onnx=…/samples/sampleSEGNET/APA_FS_Model.onnx --verbose --batch=32

It passed successfully.
Here is the verbose log, with some of the layer info deleted.
verlog.txt (22.3 KB)

Could you give an example of running with batch > 1 in sampleOnnxMNIST?

Thank you.

Hi @disculus2012,

You can use optimization profiles. Please refer to the links below for more details:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-700/tensorrt-developer-guide/index.html#opt_profiles
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-700/tensorrt-developer-guide/index.html#work_dynamic_shapes
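
As a rough illustration of the optimization-profile approach, here is a sketch using the TensorRT 7.x Python API. It assumes the ONNX model was exported with a dynamic batch dimension; a model whose input is fixed at (1, 3, 192, 256) would likely need to be re-exported first. The function names, the (3, 192, 256) input shape taken from the log above, and the use of the first network input are illustrative assumptions, not part of the sample.

```python
def batch_profile_shapes(max_batch, chw, opt_batch=None):
    """Return (min, opt, max) NCHW shapes covering batch sizes 1..max_batch."""
    opt_batch = max_batch if opt_batch is None else opt_batch
    if not 1 <= opt_batch <= max_batch:
        raise ValueError("opt_batch must lie in [1, max_batch]")
    c, h, w = chw
    return (1, c, h, w), (opt_batch, c, h, w), (max_batch, c, h, w)


def build_engine_with_profile(onnx_path, max_batch, chw):
    """Build an explicit-batch engine whose first input accepts batch 1..max_batch.

    Sketch against the TensorRT 7.x Python API; needs TensorRT and a GPU.
    """
    import tensorrt as trt  # imported lazily so the helper above runs without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # The flag is a bitmask that selects explicit-batch mode, not a batch size.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)

    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0).desc())

    config = builder.create_builder_config()
    profile = builder.create_optimization_profile()
    min_shape, opt_shape, max_shape = batch_profile_shapes(max_batch, chw)
    # Assumption: the model has one input and its batch dimension is dynamic.
    profile.set_shape(network.get_input(0).name, min_shape, opt_shape, max_shape)
    config.add_optimization_profile(profile)
    return builder.build_engine(network, config)


# Shape ranges for batch sizes 1..10 on a 3x192x256 input.
shapes = batch_profile_shapes(10, (3, 192, 256))
```

At inference time the actual input shape is set on the execution context before enqueueing, for example context.set_binding_shape(0, (10, 3, 192, 256)). The C++ API has corresponding calls: createOptimizationProfile, setDimensions, addOptimizationProfile, and setBindingDimensions.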

Using trtexec command,

Thank you.