Max_Batch_Size Triton Server


tan.shihao October 11, 2022, 8:39am 1

Hi,

I'm facing an issue with max_batch_size when trying to host a pretrained ResNet-18 model on Triton Server. Setting max_batch_size to any value larger than 0 causes Triton Server to fail to launch.

For clarity, these are the steps I took:

1. Converted the ONNX model to a TensorRT engine using nvcr.io/nvidia/tensorrt:22.09-py3.
2. Launched Triton from the NGC Triton container (nvcr.io/nvidia/tritonserver:22.09-py3).
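
To make the setting in question concrete, here is a minimal sketch of how max_batch_size relates to the tensor dims in a Triton config.pbtxt. It is an illustration only, not the configuration from this setup: the helper render_config, the model name resnet18, the tensor names input and output, and the shapes (1, 3, 224, 224) and (1, 1000) are all assumptions.

```python
def render_config(name, max_batch_size, input_shape, output_shape):
    """Render the text of a Triton config.pbtxt for a TensorRT plan.

    input_shape and output_shape are full shapes that include the batch
    dimension, e.g. (1, 3, 224, 224). Tensor names are hypothetical.
    """
    def dims(shape):
        # When max_batch_size > 0, Triton treats the batch dimension as
        # implicit, so the config lists only the per-sample dims.
        # When max_batch_size == 0, the full shape is listed.
        listed = shape[1:] if max_batch_size > 0 else shape
        return ", ".join(str(d) for d in listed)

    return "\n".join([
        f'name: "{name}"',
        'platform: "tensorrt_plan"',
        f"max_batch_size: {max_batch_size}",
        "input [",
        "  {",
        '    name: "input"',
        "    data_type: TYPE_FP32",
        f"    dims: [ {dims(input_shape)} ]",
        "  }",
        "]",
        "output [",
        "  {",
        '    name: "output"',
        "    data_type: TYPE_FP32",
        f"    dims: [ {dims(output_shape)} ]",
        "  }",
        "]",
    ]) + "\n"


if __name__ == "__main__":
    # Batching enabled: the batch dimension is left out of dims.
    print(render_config("resnet18", 8, (1, 3, 224, 224), (1, 1000)))
    # Batching disabled: the full shape is listed.
    print(render_config("resnet18", 0, (1, 3, 224, 224), (1, 1000)))
```

With max_batch_size: 0 the config carries the full tensor shape, while any value above 0 requires the engine itself to accept a variable batch dimension, which depends on how the ONNX model was exported and converted.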

The snippets below show the error message.

Thanks!
