Description
So I am trying to enable dynamic batching for my ONNX YOLO model. I used a third-party library (Ultralytics) to export a YOLOv11 model to ONNX format, then edited the model's inputs and outputs to make the batch dimension dynamic. The input and output shapes are now as follows:
Input: images Shape: [-1, 3, 640, 640]
Output: output0 Shape: [-1, 7, 8400]
However, whenever I try to start the Triton server, it gives me this error:
UNAVAILABLE: Invalid argument: model 'yolo_v1', tensor 'output0': the model expects 3 dimensions (shape [1,7,8400]) but the model configuration specifies 3 dimensions (shape [-1,7,8400]).
Is there anything else I need to do in order to allow for dynamic batching?
Thanks.
Environment
Using Triton version 24.09
Relevant Files
Here is the code I used to check the dimensions of the inputs and outputs:
import onnx

model_path = "/home/model_repository/yolo_v1/1/model.onnx"  # Update with your actual model path
model = onnx.load(model_path)

for inp in model.graph.input:
    print(f"Input: {inp.name} Shape: {[dim.dim_value if dim.dim_value > 0 else -1 for dim in inp.type.tensor_type.shape.dim]}")

for out in model.graph.output:
    print(f"Output: {out.name} Shape: {[dim.dim_value if dim.dim_value > 0 else -1 for dim in out.type.tensor_type.shape.dim]}")
Here is what I used to convert my model to allow for dynamic batching:
import onnx

# Load the ONNX model
model_path = "/home/model_repository/yolo_v1/1/model.onnx"
model = onnx.load(model_path)

# Update input dimensions (make the batch dimension dynamic for all inputs).
# Note: dim_value and dim_param share a protobuf oneof, so setting dim_param
# is sufficient; assigning dim_value = -1 first would just be overwritten.
for input_tensor in model.graph.input:
    input_tensor.type.tensor_type.shape.dim[0].dim_param = "batch_size"

# Update output dimensions (make the batch dimension dynamic for all outputs);
# the remaining dimensions are left untouched
for output_tensor in model.graph.output:
    output_tensor.type.tensor_type.shape.dim[0].dim_param = "batch_size"

# Save the updated model
updated_model_path = "yolo_v1_dynamic.onnx"
onnx.save(model, updated_model_path)
print(f"Updated model saved to {updated_model_path}")
Here is my config file:
name: "yolo_v1"
platform: "onnxruntime_onnx"
input [
{
name: "images"
data_type: TYPE_FP32
dims: [ -1, 3, 640, 640 ]  # Dynamic batch, 3-channel 640x640 input
}
]
output [
{
name: "output0"
data_type: TYPE_FP32
dims: [ -1, 7, 8400 ] # Adjust based on YOLO's output format
}
]
instance_group [
{
count: 15
kind: KIND_GPU
}
]
dynamic_batching { }
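One thing I am unsure about: my reading of the Triton model-configuration docs is that when `max_batch_size` is greater than 0, Triton treats the leading batch dimension as implicit, so `dims` should then exclude it (and `dynamic_batching` requires a non-zero `max_batch_size`). I have not confirmed this fixes my case, but the variant would look like (the value 16 is just an example):

```
name: "yolo_v1"
platform: "onnxruntime_onnx"
max_batch_size: 16   # leading batch dimension becomes implicit
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ 7, 8400 ]
  }
]
```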
Here is the command I use to start the Triton server:
docker run --gpus all -it --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.09-py3 tritonserver --model-repository=/models
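For completeness, this is how I check whether the model actually loaded once the container is up, using the standard KServe HTTP endpoints on the port 8000 mapped above (these commands obviously only work against a running server):

```
# Server and model readiness
curl -v localhost:8000/v2/health/ready
curl -v localhost:8000/v2/models/yolo_v1/ready

# Model metadata, including the input/output shapes Triton actually sees
curl localhost:8000/v2/models/yolo_v1
```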