The input of my PyTorch model (DenseNet-121) is [10, 3, 224, 224] for a single batch [patch, color channel, height, width]. The output is [10, 3], which is a three-class prediction for each patch. What should my config.pbtxt look like? The issue I’m facing (whether I use ONNX or torch.jit) is that I have more than three dimensions, so I cannot use image_client.py. Thank you!
So I tried:
As per my previous post, I get a “dim != 3” error in image_client.py, which I’m using as a base.
Moving to the Inference Server forum so that the Triton Inference Server team can take a look.
My understanding is that when the input of your PyTorch model is [10, 3, 224, 224], the first dimension (‘patch’) is the batch dimension (in this case, the batch size is 10)?
Assuming that the patch dimension (the batch size) is arbitrary, you can use the following config.pbtxt:
platform: "onnxruntime_onnx"
max_batch_size: 32
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 3 ]
  }
]
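Because max_batch_size is greater than zero here, Triton prepends the batch dimension automatically: the client sends tensors of shape [N, 3, 224, 224] even though dims in the config lists only [3, 224, 224]. A minimal NumPy sketch of that shape contract (the values are placeholders, not real preprocessing):

```python
import numpy as np

# Stack 10 patches (each 3x224x224, CHW, float32) into one batched request.
patches = [np.random.rand(3, 224, 224).astype(np.float32) for _ in range(10)]
batch = np.stack(patches)        # shape: (10, 3, 224, 224)

# "dims: [3, 224, 224]" describes a single batch element; Triton adds the
# leading batch dimension itself whenever max_batch_size > 0.
assert batch.shape[1:] == (3, 224, 224)
assert batch.shape[0] <= 32      # must not exceed max_batch_size

# The corresponding output for this request has shape (10, 3):
# one three-class score vector per patch.
```

This is why the config omits the first dimension: the patch count per request can vary from 1 up to max_batch_size.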
Please refer to GitHub - gigony/triton-test-onnx: Sample Application for TRITON with ONNX model (created to help with https://forums.developer.nvidia.com/t/tritons-config-pbtxt-only-accepts-3dim-input-layers/123811), which shows how to use this configuration.
If your model doesn’t support batched inputs and has a fixed patch size (10), your config.pbtxt would look like this:
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 10, 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 10, 3 ]
  }
]
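In this fixed-shape case (max_batch_size: 0) there is no implicit batch dimension, so the request shape must match the configured dims exactly, patch count included. A small NumPy sketch of the difference (shapes taken from the config above):

```python
import numpy as np

# With max_batch_size: 0, the request shape must equal the config dims exactly.
expected_dims = (10, 3, 224, 224)   # from dims: [10, 3, 224, 224]

request = np.zeros(expected_dims, dtype=np.float32)
assert request.shape == expected_dims   # matches the declared model input

# A request with a different patch count no longer matches the model input
# and would be rejected, since the first dimension is not a batch dimension.
bad_request = np.zeros((7, 3, 224, 224), dtype=np.float32)
assert bad_request.shape != expected_dims
```

If your ONNX or TorchScript export baked in the fixed batch size of 10, this second config is the one to use; otherwise prefer the batched config above.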
Please see Documentation - Latest Release :: NVIDIA Deep Learning Triton Inference Server Documentation.

