Multi inputs for ONNX models with batch_size=1

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only) 525
• Issue Type( questions, new requirements, bugs) questions

I have a question about running DeepStream with Triton Inference Server for an ONNX model with batch_size=1. I read deepstream-test3 and found the line if(pgie_batch_size != number_sources). If I understand correctly, when multiple sources are given, the model will run with batch_size > 1. However, if my ONNX model only supports batch_size=1, how will DeepStream handle it? For example, will it run inference sequentially on each input source (batch_size=1) before running the downstream components?
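For context, here is a minimal Python sketch of the override logic that line implements (the actual deepstream-test3 code is C and sets the GObject property; the function name here is mine, not DeepStream's):

```python
# Hypothetical sketch of the batch-size override in deepstream-test3:
# if the configured pgie batch-size differs from the number of sources,
# the app overrides it with the number of sources.
def resolve_pgie_batch_size(configured_batch_size: int, number_sources: int) -> int:
    if configured_batch_size != number_sources:
        # deepstream-test3 prints a warning here before overriding
        return number_sources
    return configured_batch_size

print(resolve_pgie_batch_size(1, 3))  # a model configured for batch 1 is overridden to 3
```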

Yes, the DeepStream nvinferserver plugin will send batched data to the Triton server, and the number of frames in a batch will not exceed nvinferserver's batch-size. nvinferserver is open source in DeepStream 6.2; you can check GstNvInferServerImpl::processFullFrame.

nvinferserver leverages Triton to do the model inference. Triton will return an error if the received data's batch size is larger than the model's max_batch_size.

What if nvinferserver’s max_batch_size is 1 and the model config is:

platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 1, 3, 960, 960 ]
  }
]
output [
  {
    name: "outputs"
    data_type: TYPE_FP32
    dims: [ 1, 57375, 57 ]
  }
]

while three streams are used as inputs. In that case, is the batch_size that the model receives 1 or 3?

If nvinferserver’s max_batch_size is 1, the batch size that Triton receives is 1.

But according to if(pgie_batch_size != number_sources), the code will override batch_size to 3 if three streams are used, is that correct? If so, will we get an error because three streams (batch_size=3) don’t match the batch size required by the model? Or is there any way to run three streams with an ONNX model that only supports batch_size=1?

You can set pgie’s batch-size to 1 if the model only supports batch size 1.
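As a concrete illustration, here is a minimal nvinferserver config sketch that pins the inference batch size to 1. The model name and repo path are placeholders, and the exact fields should be checked against the DeepStream 6.2 Triton sample configs:

```
infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1            # never send more than 1 frame per inference call
  backend {
    triton {
      model_name: "my_onnx_model"    # placeholder
      version: -1
      model_repo {
        root: "./triton_model_repo"  # placeholder
      }
    }
  }
}
```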

Sorry if my question is not clear. My concern is that streammux’s batch-size equals number_sources and the code overrides pgie_batch_size to number_sources. If three streams are used, then number_sources=3. In this case, you said I can set pgie’s batch-size to 1 if the model only supports batch size 1, but then streammux_batch_size=3 and pgie_batch_size=1 won’t match?
In other words, streammux will create a batch of size 3 while pgie only expects batch_size=1; won’t the code crash?

Yes. If they do not match, nvinferserver will only send max-batch-size frames each time; please refer to GstNvInferServerImpl::processFullFrame, especially this line:
for (uint32_t batchIdx = 0; batchIdx < numFilled && batchIdx < maxBatchSize();
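In effect, when the incoming batch is larger than max-batch-size, that loop walks through the batched buffer in chunks of at most maxBatchSize frames per inference call. A small Python sketch of that chunking behavior (the function name is mine, not DeepStream's):

```python
def split_into_batches(num_filled: int, max_batch_size: int) -> list:
    """Return the per-inference batch sizes, in order, for a buffer
    holding num_filled frames when at most max_batch_size frames
    can be sent per call."""
    batches = []
    start = 0
    while start < num_filled:
        n = min(max_batch_size, num_filled - start)  # never exceed max_batch_size
        batches.append(n)
        start += n
    return batches

print(split_into_batches(3, 1))  # three streams, model batch 1 -> [1, 1, 1]
```

So with streammux batching 3 frames and pgie batch-size 1, the model is simply called three times per batched buffer rather than crashing.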

Thanks @fanzh, can you send me the link to the GstNvInferServerImpl::processFullFrame that you mentioned?

@fanzh Another question: if my ONNX model can support batch_size=3 and I run output parsing in Python, how can I know which output comes from which stream inside pgie_src_pad_buffer_probe, so that I can draw bounding boxes and pose visualizations for each stream and send the visualization back to the right stream?

It is in the DeepStream SDK. The code path is /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinferserver/gstnvinferserver_impl.cpp

You can use NvDsFrameMeta’s source_id; please refer to this topic: How to get source's index in deepstream? - #3 by fanzh
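To make the idea concrete: in the pad probe you iterate the batch's per-frame metadata and key everything by source_id. Here is a self-contained sketch using plain dicts in place of the real NvDsBatchMeta/NvDsFrameMeta objects (the actual pyds iteration over frame_meta_list is omitted, so treat the data shape as an assumption):

```python
from collections import defaultdict

def group_outputs_by_stream(frame_metas):
    """frame_metas: one entry per frame in the batched buffer, each carrying
    the source_id of its originating stream plus that frame's parsed outputs.
    Returns {source_id: [outputs...]} so drawing targets the right stream."""
    per_stream = defaultdict(list)
    for fm in frame_metas:
        per_stream[fm["source_id"]].extend(fm["detections"])
    return dict(per_stream)

# Example: a batch of 3 frames, one from each of streams 0, 1, 2
batch = [
    {"source_id": 0, "detections": ["person@0"]},
    {"source_id": 1, "detections": ["person@1"]},
    {"source_id": 2, "detections": []},
]
print(group_outputs_by_stream(batch))
```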
