What is the difference between batch-size, preferred batch size and max batch size in Triton Server Model Analyzer?

shahidalipo09 · July 30, 2022, 12:21pm

I am using Model Analyzer to profile object detection models under various configurations.

Example:

max_batch_size: 32
…
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
}

In this example, the model supports a maximum batch size of 32, and the server tries to form batches of size 4 or 8 when running inference. However, Model Analyzer also has a static batch size parameter that I do not fully understand. What happens if I apply the config above together with --batch-sizes 2,4,8,16,32, as follows?

model-analyzer profile \
  --model-repository …/models \
  --profile-models feature-extractor \
  --triton-launch-mode=docker \
  --output-model-repository-path …/logs/profile \
  --batch-sizes 4,8,16,32 \
  --concurrency 2,4,8,16,32 \
  --run-config-search-max-instance-count 2 \
  --client-protocol 'http' \
  --perf-output true \
  --config config.yaml

In this case, dynamic batching is enabled with the preferred batch sizes and the max batch size, but I cannot work out how these relate to the --batch-sizes values that I pass here.
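One way to picture how the three settings might interact is the toy simulation below. It is only a sketch under stated assumptions: that --batch-sizes sets the number of samples in each request the profiling client sends, that max_batch_size caps what the model accepts in a single execution, and that the dynamic batcher merges queued requests in arrival order, preferring totals listed in preferred_batch_size. The function form_batches is hypothetical, ignores queue delay and timing entirely, and is not Triton's actual scheduler.

```python
from typing import List, Sequence


def form_batches(
    request_sizes: Sequence[int],
    max_batch_size: int,
    preferred_batch_sizes: Sequence[int],
) -> List[List[int]]:
    """Toy model: group queued requests (in arrival order) into server-side batches.

    request_sizes: per-request batch size sent by the client
                   (assumed to correspond to one value of --batch-sizes).
    """
    # Assumption: a single request larger than max_batch_size cannot be served.
    for size in request_sizes:
        if size > max_batch_size:
            raise ValueError(
                f"request of size {size} exceeds max_batch_size={max_batch_size}"
            )

    preferred = set(preferred_batch_sizes)
    batches: List[List[int]] = []
    start = 0
    while start < len(request_sizes):
        total = 0
        end = start  # exclusive end of the largest group that still fits
        best_preferred_end = None
        # Keep adding requests while the combined size fits under the cap.
        while end < len(request_sizes) and total + request_sizes[end] <= max_batch_size:
            total += request_sizes[end]
            end += 1
            if total in preferred:
                # Remember the largest preferred total reached so far.
                best_preferred_end = end
        # Cut at the largest preferred size if one was hit, else take all that fit.
        cut = best_preferred_end if best_preferred_end is not None else end
        batches.append(list(request_sizes[start:cut]))
        start = cut
    return batches


if __name__ == "__main__":
    # Six queued requests of client batch size 2, with the config from the question.
    print(form_batches([2] * 6, max_batch_size=32, preferred_batch_sizes=[4, 8]))
    # Three queued requests of client batch size 16: no preferred total is reachable.
    print(form_batches([16] * 3, max_batch_size=32, preferred_batch_sizes=[4, 8]))
```

In this toy model, small client batches get merged up to a preferred size, while client batches of 16 already exceed both preferred sizes and are only limited by the cap of 32.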

rvinobha · August 2, 2022, 3:37pm

Hi @shahidalipo09

Thanks for your interest in Riva.

Apologies, we only handle Riva-related queries in this forum.

We request that you kindly raise your query at the GitHub link below.

Thanks
