Batch-size vs streams in GST-NvInfer?

• Hardware Platform (Jetson / GPU) Jetson AGX
• DeepStream Version 5.0.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 7.0
• NVIDIA GPU Driver Version (valid for GPU only)

Hi,

I have a setup where a YOLOv3 model runs as the primary inference, and its detected objects are passed on to an SSD model for secondary inference.
I need to process the secondary inference in parallel (or in batches), depending on how many objects are detected by the primary inference.
At the moment, the SSD is in ONNX format with a fixed batch size of 1, and I’m wondering whether setting batch-size=n in its config file will replicate the execution context across multiple CUDA streams, or whether I need to re-export the ONNX with the desired batch size and set the batch-size property accordingly.
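
For reference, one way to confirm that the exported SSD really has a fixed batch dimension is to read the input shape from the ONNX graph. A small sketch, assuming the file is named ssd.onnx (the filename is just an example):

```python
import onnx

# Assumption: "ssd.onnx" is the exported SSD model mentioned above.
model = onnx.load("ssd.onnx")

# Print each dimension of the first graph input; a fixed batch shows up as
# the literal value 1, while a dynamic batch shows up as a named parameter.
for dim in model.graph.input[0].type.tensor_type.shape.dim:
    print(dim.dim_param if dim.dim_param else dim.dim_value)
```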

Hi @roulbac,

I’m wondering if setting batch-size=n in its config file will replicate the execution context across multiple cuda streams

No. It will fail at the TensorRT engine build stage, because you would be trying to build an engine with a batch size different from the one the ONNX model was exported with.

if I need to re-export the ONNX with the desired batch size and set the batch-size variable accordingly.
Yes, that is fine for implicit batch mode: the ONNX must be exported with the exact batch size you set in the nvinfer config.
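
A minimal sketch of such a fixed-batch export, assuming the SSD is available as a PyTorch module; the placeholder module, input resolution, and file names below are only examples, not your actual model:

```python
import torch
import torch.nn as nn

# Placeholder standing in for the real SSD (assumption: your model is a torch.nn.Module).
class DummySSD(nn.Module):
    def forward(self, x):
        return x.mean(dim=(2, 3))  # stand-in output, not a real detector

model = DummySSD().eval()

# Bake a fixed batch of 4 into the graph; this must match batch-size=4
# in the secondary nvinfer config file (implicit-batch path).
dummy = torch.randn(4, 3, 300, 300)  # example input resolution

torch.onnx.export(
    model,
    dummy,
    "ssd_b4.onnx",            # example output filename
    input_names=["input"],
    output_names=["output"],
    opset_version=11,
)
```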
Or, you can export the ONNX with a dynamic batch dimension, and then set

  1. force-implicit-batch-dim=0
  2. batch-size=n

The TensorRT engine built this way can handle input batches from 1 up to batch-size (a sketch of the dynamic export follows below).
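
A minimal sketch of the dynamic-batch export, again with a placeholder module standing in for the real SSD. The dynamic_axes argument is what marks the batch dimension as variable; the corresponding nvinfer config keys are shown in the trailing comments:

```python
import torch
import torch.nn as nn

# Placeholder standing in for the real SSD (assumption: your model is a torch.nn.Module).
class DummySSD(nn.Module):
    def forward(self, x):
        return x.mean(dim=(2, 3))  # stand-in output, not a real detector

model = DummySSD().eval()
dummy = torch.randn(1, 3, 300, 300)  # example input resolution

# Mark dimension 0 (the batch) of the input and output as dynamic.
torch.onnx.export(
    model,
    dummy,
    "ssd_dynamic.onnx",        # example output filename
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=11,
)

# Then, in the secondary nvinfer config file:
#   force-implicit-batch-dim=0
#   batch-size=4   # the built engine accepts batches from 1 up to this value
```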
