How to create a multi-batch secondary model when the number of primary instances is unknown?

I have a primary-gie and a secondary-gie for inference. When multiple primary instances are found, the same number of secondary inferences is triggered. As a result, the secondary-gie takes significantly more time than the primary-gie, even though the secondary-gie uses a smaller model.

I tried creating a secondary model with batch-size 2, but it ended up producing very poor results. So how can I improve the latency of the secondary-gie?

My setup is the following:

Jetson Xavier
DeepStream 5.0
JetPack 4.4
TensorRT 7.1.3
CUDA 10.2

Hi,
The links below might be useful for you:
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#thread-safety

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
For multi-threading/streaming, we suggest using DeepStream or Triton.
For more details, we recommend raising the query on the DeepStream or Triton forum.
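As a starting point, the secondary-gie batch size is normally set in its Gst-nvinfer configuration file rather than by changing the model itself; nvinfer then batches the objects cropped from primary detections up to that limit, and the TensorRT engine must be built for the same max batch size. A minimal sketch (property names follow the Gst-nvinfer config-file format; the file name, engine path, and values are illustrative assumptions, not your actual setup):

```ini
# Sketch of a batched secondary-gie config (Gst-nvinfer config-file format).
# All paths and values here are illustrative assumptions.
[property]
gpu-id=0
# Upper bound on how many detected objects are batched into one inference;
# the TensorRT engine must be built with a matching max batch size.
batch-size=16
# 2 = operate on objects cropped from primary detections, not full frames
process-mode=2
# Unique id of this secondary gie, and the id of the primary gie it consumes
gie-unique-id=2
operate-on-gie-id=1
# Pre-built engine; name typically encodes batch size and precision
model-engine-file=model_b16_gpu0_fp16.engine
# 0 = FP32, 1 = INT8, 2 = FP16
network-mode=2
```

If a matching engine file is not found, nvinfer will rebuild one at startup for the configured batch size, so accuracy should not degrade simply from raising `batch-size`; poor results usually point to a mismatch between the engine's batch dimension and the configured value.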

Thanks!

Thanks! I'll raise it on the DeepStream forum.