• Hardware Platform (Jetson / GPU) Jetson AGX
• DeepStream Version 5.0.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 7.0
• NVIDIA GPU Driver Version (valid for GPU only)
I have a setup where a YOLOv3 model runs as the primary inference (PGIE), and its detected objects are passed to an SSD model for secondary inference (SGIE).
I need the secondary inference to process objects in parallel (or in batches), depending on how many objects the primary inference detects.
At this time, the SSD is in ONNX format with a fixed batch size of 1. Will setting batch-size=n in its config file replicate the execution context across multiple CUDA streams, or do I need to re-export the ONNX model with the desired batch size and set batch-size to match?
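For reference, this is roughly what my SGIE config looks like (paths and ids are placeholders for my pipeline; the property names are the standard Gst-nvinfer ones, and batch-size=4 is the value I was hoping would batch the cropped objects):

```ini
[property]
# Placeholder path to my fixed-batch-1 SSD export
onnx-file=ssd_secondary.onnx
# The setting in question: does this alone batch the detected objects?
batch-size=4
# 2 = secondary mode, operate on objects rather than full frames
process-mode=2
gie-unique-id=2
# Consume detections from the YOLOv3 PGIE (gie-unique-id=1)
operate-on-gie-id=1
```

I can change any of these values; I just don't know whether batch-size alone is enough when the underlying ONNX is exported with a fixed batch dimension of 1.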