What is the difference between batch size, preferred batch size, and max batch size in Triton Server Model Analyzer?

I am using Model Analyzer to profile object detection models under various configurations.


max_batch_size: 32

dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
}

In this example the model supports a max batch size of 32, and the server attempts to form batches of size 4 or 8 when performing inference. However, there is also a static batch size parameter that I do not fully understand. What happens if I apply the config above together with --batch-sizes 2,4,8,16,32, as follows?

model-analyzer profile \
--model-repository …/models \
--profile-models feature-extractor \
--output-model-repository-path …/logs/profile \
--batch-sizes 4,8,16,32 \
--concurrency 2,4,8,16,32 \
--run-config-search-max-instance-count 2 \
--client-protocol 'http' \
--perf-output true \
--config config.yaml

In this case dynamic batching is enabled with the preferred batch sizes and the max batch size, but I cannot relate these to the batch sizes that I pass here.
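
To make the question concrete, here is a rough Python sketch of the sweep that the command above describes. It rests on assumptions that are not stated in this post: that --batch-sizes sets the batch size of each individual client request (the value handed to perf_analyzer), that each batch size is paired with each concurrency value, and that max_batch_size and preferred_batch_size only constrain how the server groups those requests. The function name sweep and the dictionary keys are made up for illustration and are not part of Model Analyzer.

```python
from itertools import product

# Values copied from the config and command in this post.
MAX_BATCH_SIZE = 32                  # server side: max_batch_size
PREFERRED_BATCH_SIZES = [4, 8]       # server side: dynamic_batching preferred sizes
CLIENT_BATCH_SIZES = [4, 8, 16, 32]  # client side: --batch-sizes
CONCURRENCIES = [2, 4, 8, 16, 32]    # client side: --concurrency


def sweep(batch_sizes, concurrencies, max_batch_size, preferred_sizes):
    """Enumerate the (request batch size, concurrency) points of the sweep.

    Hypothetical helper: it only does arithmetic on the settings and does
    not call Model Analyzer or Triton.
    """
    points = []
    for b, c in product(batch_sizes, concurrencies):
        points.append({
            "request_batch_size": b,
            "concurrency": c,
            # Upper bound on samples outstanding at once from the client.
            "samples_in_flight": b * c,
            # A single request larger than max_batch_size cannot be served.
            "fits_max_batch_size": b <= max_batch_size,
            # Whole requests of size b that fit in one batch of max_batch_size.
            "requests_per_full_batch": max_batch_size // b,
            # Whether one request alone already matches a preferred size.
            "is_preferred_size": b in preferred_sizes,
        })
    return points


points = sweep(CLIENT_BATCH_SIZES, CONCURRENCIES, MAX_BATCH_SIZE, PREFERRED_BATCH_SIZES)
for p in points:
    print(p)
```

Under these assumptions the client-side batch size and the server-side limits are separate knobs: the first decides how many samples travel in one request, the others decide how requests may be merged once they reach the server.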

Hi @shahidalipo09

Thanks for your interest in Riva.

Apologies, we only handle Riva-related queries in this forum.

We request that you kindly raise your query at the GitHub link below.