Batch-size with 9 RTSP streams

Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) GPU (2080 Ti)
• DeepStream Version 5.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 7.2.1
• NVIDIA GPU Driver Version (valid for GPU only) 450.102.04
• Issue Type( questions, new requirements, bugs) Question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

batch-size=9 in the [streammux] group of config_file.txt, with 9 RTSP sources.

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hi,
I have a few observations to share about my custom RetinaFace model with DeepStream 5 and 9 RTSP video sources.

  1. I used trtexec for ONNX to .engine file conversion with the "--explicitBatch" parameter, as loading the ONNX model directly into DeepStream gave me the error "EXPLICIT_BATCH (!_importer_ctx.network()->hasImplicitBatchDimension())".

I tested with batch-size=5 in the [streammux] group and batch-size=1 in the [pgie] group, since the ONNX model was created with batch-size=1.
The model ran fine; all cameras were set to 20 fps and DeepStream's performance was close to 18 fps.
Is it fine to set batch-size this way?

  2. I found that the "batch-size" parameter in the [streammux] group needs to be set according to the number of sources. So I set it to 9, and then DeepStream hangs on startup and the performance readout shows 0 for all cameras.
    Also, if I set batch-size=1, it runs, but at only 6 fps.
    Not sure why.

  3. I converted the ONNX model with batch-size=9 and ran trtexec again to build the engine file, like:
    trtexec --batch=9 --onnx=onnx-model --saveEngine=output.engine

I then tried batch-size=9 in both the [pgie] and [streammux] groups, but this time there was an error:
WARNING: nvdsinfer_backend.cpp:162 Backend context bufferIdx(0) request dims:1x3x640x640 is out of range, [min: 10x3x640x640, max: 10x3x640x640]
**ERROR: nvdsinfer_backend.cpp:425 Failed to enqueue buffer in fulldims mode because binding idx: 0 with batchDims: 1x3x640x640 is not supported **
ERROR: nvdsinfer_context_impl.cpp:1532 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_INVALID_PARAMS
0:00:05.307619290 18160 0x559e34047f70 WARN nvinfer gstnvinfer.cpp:1216:gst_nvinfer_input_queue_loop:<primary_gie> error: Failed to queue input batch for inferencing
ERROR from primary_gie: Failed to queue input batch for inferencing

Any suggestions on the above observations? I could go with batch-size=5, but that feels like an escape rather than a solution.

I have seen many posts regarding batch-size, but none cleared things up for my case.
Thanks.

  1. You are using an ONNX model, right? Does your ONNX model support dynamic batch?
  2. Yes, the "batch-size" parameter in the [streammux] group needs to be set according to the number of sources. And since you are using RTSP sources, please set "live-source=1" in the [streammux] group too. What is the FPS of your RTSP sources? You also need to set "batched-push-timeout=40000" in the [streammux] group, and set "sync=0" in the [sink] group (see the config sketch after this list).
  3. streammux has nothing to do with your model's batch size; you just need to set the nvinfer batch-size according to your model. Does your ONNX model support dynamic batch? If so, you can try "trtexec --maxBatch=9 --onnx=onnx-model --saveEngine=output.engine"
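
For reference, a minimal sketch of how those settings could look in a deepstream-app style config (the [sink0] group name follows the sample configs; the exact layout here is an assumption, not the poster's actual file):

```
[streammux]
# match the number of sources (9 RTSP streams here)
batch-size=9
# required for live sources such as RTSP
live-source=1
# push a batch downstream after 40 ms even if it is not yet full
batched-push-timeout=40000

[sink0]
# do not sync rendering to the clock, avoiding throttling with live sources
sync=0
```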

Hi,
Thanks for the reply.

1. You are using an ONNX model, right? Does your ONNX model support dynamic batch?
Yes, but for that I need to disable --explicitBatch when using trtexec for engine file generation.

2. Yes, the "batch-size" parameter in the [streammux] group needs to be set according to the number of sources. And since you are using RTSP sources, please set "live-source=1" in the [streammux] group too. What is the FPS of your RTSP sources? You also need to set "batched-push-timeout=40000" in the [streammux] group, and set "sync=0" in the [sink] group.
I did set up all the options: live-source, batched-push-timeout, and sync.

3. streammux has nothing to do with your model's batch size; you just need to set the nvinfer batch-size according to your model. Does your ONNX model support dynamic batch? If so, you can try "trtexec --maxBatch=9 --onnx=onnx-model --saveEngine=output.engine"
I ran the same command, "trtexec --maxBatch=9 --onnx=onnx-model --saveEngine=output.engine", but have an issue:

0:00:00.824015848 25896 0x55b4e434fc40 WARN nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1642> [UID = 1]: Backend has maxBatchSize 1 whereas 10 has been requested
0:00:00.824039004 25896 0x55b4e434fc40 WARN nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1813> [UID = 1]: deserialized backend context :/home/rahul/DS/launchconfigs/facedetector/output.engine failed to match config params, trying rebuild
0:00:00.826536362 25896 0x55b4e434fc40 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files

Input filename: /home/rahul/DS/launchconfigs/facedetector/mnet_025_mean_fixed_640_op11_simplified.onnx
ONNX IR version: 0.0.6
Opset version: 11
Producer name: pytorch
Producer version: 1.7
Domain:
Model version: 0
Doc string:

WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:36 [TRT]: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:36 [TRT]: TensorRT was linked against cuDNN 8.0.4 but loaded cuDNN 8.0.2
INFO: …/nvdsinfer/nvdsinfer_func_utils.cpp:39 [TRT]: Detected 1 inputs and 9 output network tensors.
WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:36 [TRT]: TensorRT was linked against cuDNN 8.0.4 but loaded cuDNN 8.0.2
0:00:15.363621720 25896 0x55b4e434fc40 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1748> [UID = 1]: serialize cuda engine to file: /home/rahul/DS/launchconfigs/facedetector/mnet_025_mean_fixed_640_op11_simplified.onnx_b10_gpu0_fp32.engine successfully
WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:36 [TRT]: TensorRT was linked against cuDNN 8.0.4 but loaded cuDNN 8.0.2
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:685 [FullDims Engine Info]: layers num: 4
0 INPUT kFLOAT input0 3x640x640 min: 1x3x640x640 opt: 10x3x640x640 Max: 10x3x640x640
1 OUTPUT kFLOAT locs -1x4 min: 0 opt: 0 Max: 0
2 OUTPUT kFLOAT landmarks -1x10 min: 0 opt: 0 Max: 0
3 OUTPUT kFLOAT scores -1x2 min: 0 opt: 0 Max: 0

Can you use Netron to check your model's input layer information and share the information with us?
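
As a side note, the same shape information can be read without a GUI via the onnx Python package; a minimal sketch (model.onnx is a placeholder path, not from this thread):

```python
import onnx

def dims(value_info):
    # Each dimension is either symbolic (dim_param, i.e. dynamic)
    # or a fixed integer (dim_value).
    return [d.dim_param or d.dim_value
            for d in value_info.type.tensor_type.shape.dim]

model = onnx.load("model.onnx")  # placeholder path

for inp in model.graph.input:
    print("INPUT ", inp.name, dims(inp))
for out in model.graph.output:
    print("OUTPUT", out.name, dims(out))
```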

Hi,
INPUTS
name: input0
type: float32[1,3,640,640]

OUTPUTS
name: locs
type: float32[1,16800,4]
name: scores
type: float32[1,16800,2]
name: landmarks
type: float32[1,16800,10]

Your model does not support dynamic batch.

It only supports an explicit batch size, and that batch size is 1.
Your nvinfer batch-size can only be 1.

Ok.

To make my model support dynamic batch, I am following this:

import onnx

def change_input_output_dim(model):
    batch_dim = 0

    # Overwrite the first (batch) dimension of every graph input
    inputs = model.graph.input
    for input in inputs:
        dim1 = input.type.tensor_type.shape.dim[0]
        dim1.dim_value = batch_dim

    # ... and of every graph output
    outputs = model.graph.output
    for output in outputs:
        dim2 = output.type.tensor_type.shape.dim[0]
        dim2.dim_value = batch_dim

def apply(transform, infile, outfile):
    model = onnx.load(infile)
    transform(model)
    onnx.save(model, outfile)

apply(change_input_output_dim, r"old.onnx", r"updated.onnx")

Is this correct, or do I need to do something else?

If you are using PyTorch to export the model, please refer to the PyTorch documentation.
For TensorRT, please refer to TensorRT/ONNX - eLinux.org
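
For the PyTorch route, the usual mechanism is the dynamic_axes argument of torch.onnx.export. A minimal sketch, assuming model is the user's RetinaFace network and reusing the tensor names shown above (everything else is illustrative):

```python
import torch

def export_dynamic(model: torch.nn.Module, out_path: str = "model_dynamic.onnx"):
    """Export model to ONNX with a symbolic batch dimension named "N"."""
    model.eval()
    dummy = torch.randn(1, 3, 640, 640)  # static H/W; batch dim made dynamic below
    torch.onnx.export(
        model, dummy, out_path,
        opset_version=11,
        input_names=["input0"],
        output_names=["locs", "scores", "landmarks"],
        # dim 0 (batch) of each tensor becomes dynamic; Netron shows it as "N"
        dynamic_axes={
            "input0": {0: "N"},
            "locs": {0: "N"},
            "scores": {0: "N"},
            "landmarks": {0: "N"},
        },
    )
```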

Hi, I have a few questions:

If I use --explicitBatch instead of --maxBatch=9, does that also mean the batch size can be assigned externally?

I converted the model to support dynamic batch, and the Netron output is:

INPUTS
name: input0
type: float32[-1,3,640,640]

OUTPUTS
name: locs
type: float32[-1,16800,4]
name: scores
type: float32[-1,16800,2]
name: landmarks
type: float32[-1,16800,10]

Is this fine to use as a dynamic batch model? Maybe I am misunderstanding dynamic batching.

If this still does not support dynamic batch, then I have another model, whose Netron output is:

name: data
type: float32[1,3,640,640]
name: face_rpn_cls_prob_reshape_stride32
type: float32[1,4,20,20]
name: face_rpn_bbox_pred_stride32
type: float32[1,8,20,20]
name: face_rpn_landmark_pred_stride32
type: float32[1,20,20,20]
name: face_rpn_cls_prob_reshape_stride16
type: float32[1,4,40,40]
name: face_rpn_bbox_pred_stride16
type: float32[1,8,40,40]
name: face_rpn_landmark_pred_stride16
type: float32[1,20,40,40]
name: face_rpn_cls_prob_reshape_stride8
type: float32[1,4,80,80]
name: face_rpn_bbox_pred_stride8
type: float32[1,8,80,80]
name: face_rpn_landmark_pred_stride8
type: float32[1,20,80,80]

Can I use this as a dynamic-batch-supported model?

One more thing:

If the engine file is created with trtexec --explicitBatch, then getMaxBatchSize should not matter, but it still shows getMaxBatchSize = 1.

Hi @sharma.rahul98912,
What's the error with the command
$ trtexec --maxBatch=9 --onnx=onnx-model --saveEngine=output.engine

Thanks!

Hi,
There is no error during conversion with the mentioned command.
Whenever I run the DeepStream app with the generated model and the following params:
batch-size in the [streammux] group = 9 (9 cameras)
batch-size in the [pgie] group = 9 (as I think with maxBatch=9, I can assign batch=9)
batch-size in the [property] group of "config_infer_primary.txt" = 9

Then it shows a runtime error:

0:00:00.764959542 15224 0x55d15dabab60 WARN nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1642> [UID = 1]: Backend has maxBatchSize 1 whereas 9 has been requested
0:00:00.764967237 15224 0x55d15dabab60 WARN nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1813> [UID = 1]: deserialized backend context :/home/rahul/DS/launchconfigs/facedetector/new.engine failed to match config params, trying rebuild
0:00:00.766699710 15224 0x55d15dabab60 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1715> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:934 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:872 failed to build network.
0:00:00.766973424 15224 0x55d15dabab60 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:00:00.766998686 15224 0x55d15dabab60 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:00:00.767005948 15224 0x55d15dabab60 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:00:00.767031538 15224 0x55d15dabab60 WARN nvinfer gstnvinfer.cpp:809:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:00.767037169 15224 0x55d15dabab60 WARN nvinfer gstnvinfer.cpp:809:gst_nvinfer_start:<primary_gie> error: Config file path: /home/rahul/DS/launchconfigs/config_infer_primary_retinaface.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: main:660: Failed to set pipeline to PAUSED
Quitting

Could you share your nvinfer DS config file?

This is getting time-consuming. I will state my points again; please correct me if I am wrong.

  1. My model should support dynamic batch in order to be used with a larger batch size, and then the "trtexec --maxBatch=9" command will produce the expected engine. I am working on making the model support dynamic batch.
    If I see something like 1x256x1x6 next to the input node, then it's fixed. If I see something like ?x256x1x6, then it's dynamic. Is this right?

  2. Another method is that I can directly load the ONNX model in DeepStream's config_infer_primary.txt and let DeepStream internally convert the ONNX to an .engine file.

Then it showed the error:

ERROR: ModelImporter.cpp:474 In function importModel:
** Assertion failed: !_importer_ctx.network()->hasImplicitBatchDimension() && “This version of the ONNX parser only supports TensorRT INetworkDefinitions with an explicit batch dimension. Please ensure the network was created using the EXPLICIT_BATCH NetworkDefinitionCreationFlag.”**

So my doubt is: where can I find the required file in which I can put this?
(1) const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
(2) auto network = GhUniquePtr(builder->createNetworkV2(explicitBatch));

Hi,
One update: I have updated my model to support dynamic batch.

Netron shows:

INPUTS
name: input0
type: float32[N,3,640,640]

OUTPUTS
name: locs
type: float32[N,16800,4]
name: scores
type: float32[N,16800,2]
name: landmarks
type: float32[N,16800,10]

Please let me know if this is fine and whether I can then use "trtexec --maxBatch=9 --onnx=onnx-model --saveEngine=output.engine".

Hi, closing the issue as I got it working with the command:

trtexec --explicitBatch \
    --shapes=input0:6x3x640x640 \
    --optShapes=input0:2x3x640x640 \
    --minShapes=input0:1x3x640x640 \
    --maxShapes=input0:12x3x640x640 \
    --onnx=mnet_025_mean_fixed_640_op11_dyn_simplified.onnx \
    --saveEngine=mnet_025_mean_fixed_640_op11_dyn_simplified.engine

It seems that for an explicit-batch dynamic ONNX model, trtexec needs the min/opt/max shape flags rather than --maxBatch.

Thanks for the support and important information.