I am using Model Analyzer to profile object detection models under various configurations.
Example:
max_batch_size: 32
…
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
}
In this example the model supports a max batch size of 32, and the server attempts to form batches of size 4 or 8 while performing inference. However, there is also a static batch size parameter that I do not fully understand. What happens if I apply the config above together with the --batch-sizes flag, as follows?
model-analyzer profile \
  --model-repository …/models \
  --profile-models feature-extractor \
  --triton-launch-mode=docker \
  --output-model-repository-path …/logs/profile \
  --batch-sizes 4,8,16,32 \
  --concurrency 2,4,8,16,32 \
  --run-config-search-max-instance-count 2 \
  --client-protocol 'http' \
  --perf-output true \
  --config config.yaml
In this case dynamic batching is enabled with the preferred batch sizes and the max batch size, but I cannot relate these to the --batch-sizes values that I pass here.
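To make the question concrete, here is a minimal Python sketch of how I currently read the sweep. It rests on two assumptions that I have not confirmed: that each --batch-sizes value is the batch size of a single request sent by the profiling client, and that every batch size is measured against every concurrency value. The names below (sweep, MAX_BATCH_SIZE, and so on) are hypothetical and only illustrate the idea; they are not part of Model Analyzer.

```python
from itertools import product

# Values copied from the config and command above.
MAX_BATCH_SIZE = 32                   # max_batch_size in the model config
CLIENT_BATCH_SIZES = [4, 8, 16, 32]   # --batch-sizes
CONCURRENCIES = [2, 4, 8, 16, 32]     # --concurrency


def sweep(batch_sizes, concurrencies, max_batch_size):
    """Yield the (batch_size, concurrency) pairs that would be measured.

    Assumption: a request whose batch size exceeds max_batch_size would be
    rejected by the server, so such pairs are skipped here.
    """
    for batch_size, concurrency in product(batch_sizes, concurrencies):
        if batch_size > max_batch_size:
            continue
        yield batch_size, concurrency


runs = list(sweep(CLIENT_BATCH_SIZES, CONCURRENCIES, MAX_BATCH_SIZE))

for batch_size, concurrency in runs:
    # Upper bound on inputs in flight if every outstanding request is pending.
    in_flight = batch_size * concurrency
    print(f"batch_size={batch_size:>2} concurrency={concurrency:>2} "
          f"max_inputs_in_flight={in_flight}")
```

If that reading is right, the static batch size is fixed on the client side per request, while the preferred batch sizes only affect how the server groups requests that are already queued. This is exactly the part I would like confirmed.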