Dynamic batching for TensorRT engine model

• Hardware Platform (Jetson / GPU) Jetson TX2
• DeepStream Version 5.1
• JetPack Version (valid for Jetson only) 4.5

Reproduce:

Thanks for the TLT 3.0 toolkit; I was able to train our classification model (mobilenet_v2) and convert it to a TensorRT engine.
However, after reading a lot of the documentation provided by NVIDIA and many issues on this forum, I am still confused about how to get dynamic batching.

./tao-converter -k $KEY -c final_model_int8_cache.bin -d 3,224,224 -i nchw final_model_mobilenetv2.etlt -e cvt_mobilenetspoofing_int8.engine -m 8 -b 4 -t int8 -o predictions/Softmax

With the above command, our model only accepts a context set to a single image (batch = 1):

context.set_binding_shape(0, (3, 224, 224))

In a different trial, I set -d x,3,224,224 (where x is an integer, e.g. 1, 2, 3, 4, 5, …). That works, but it only gives our model a fixed batch size.
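For reference, my understanding from the TensorRT documentation is that dynamic batch sizes in plain TensorRT come from building an explicit-batch network with an optimization profile, e.g. when parsing an ONNX model. A minimal sketch of that route is below; the file name model.onnx and the input tensor name input_1 are placeholders rather than values from the command above, and this is not the .etlt / tao-converter path I used.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
# Explicit-batch network: the batch dimension is part of the tensor shape.
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("model.onnx", "rb") as f:   # placeholder ONNX export of the model
    parser.parse(f.read())

config = builder.create_builder_config()
config.max_workspace_size = 1 << 28   # 256 MiB workspace

# The optimization profile declares the batch range the engine must support.
profile = builder.create_optimization_profile()
profile.set_shape("input_1",          # placeholder input tensor name
                  (1, 3, 224, 224),   # min
                  (4, 3, 224, 224),   # opt
                  (8, 3, 224, 224))   # max
config.add_optimization_profile(profile)

engine = builder.build_engine(network, config)

# At runtime the batch size is chosen per call, anywhere in [1, 8]:
context = engine.create_execution_context()
context.set_binding_shape(0, (4, 3, 224, 224))

An engine built this way accepts a different batch size on every call to set_binding_shape, which is the behaviour I am trying to get from the .etlt conversion.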

Can you please guide me on how to get dynamic batching for a classification model?

Can you try deploying the engine in Triton Inference Server?
Refer to
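In Triton, server-side dynamic batching is enabled in the model's config.pbtxt. A rough sketch for a TensorRT plan is below; the model name, input/output names and the output dimension are placeholders that would have to match your engine.

name: "mobilenetv2_classifier"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
dynamic_batching {
  max_queue_delay_microseconds: 100
}

Triton then groups concurrent requests into batches up to max_batch_size before running the engine.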

Hello @Morganh, thanks for your response.
For various reasons, a plain TensorRT engine works better than Triton in my case,
so I am looking for a way to get dynamic batching for this model.

No, for your case, dynamic batching is not supported.


Thank you very much!
