Does Triton's config.pbtxt only accept 3-dim input layers?

The input of my PyTorch model (densenet121) is [10, 3, 224, 224] for a single batch [patch, color channel, height, width]. The output is [10, 3], a three-class prediction for each patch. What should my config.pbtxt look like? The issue I'm facing (whether I use ONNX or torch.jit) is that I have more than three dimensions, so I cannot use image_client.py. Thank you!
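
For reference, a rough sketch of the setup (the exact classifier head is simplified here):

import torch
import torchvision

# densenet121 with its default 1000-class classifier swapped for a
# 3-class head, to match the [10, 3] output described above (sketch).
model = torchvision.models.densenet121()
model.classifier = torch.nn.Linear(model.classifier.in_features, 3)

x = torch.randn(10, 3, 224, 224)  # [patch, color channel, height, width]
print(model(x).shape)  # torch.Size([10, 3])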

So I tried:

As per my previous post, I get a "dim!=3" error in image_client.py, which I'm using as a base.

Moving to the Inference Server forum so that the Triton Inference Server team can take a look.

Hi @bart.michiels

My understanding is that when the input of your PyTorch model is [10, 3, 224, 224], the first dimension ('patch') is the batch dimension (in this case, the batch size is 10)?
Assuming that the patch dimension (the batch size) is arbitrary, you can use the following config.pbtxt:

platform: "onnxruntime_onnx"
max_batch_size: 32
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [3, 224, 224]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [3]
  }
]
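
For this variable-batch config, the exported ONNX model also needs a dynamic first dimension. A sketch of how that export might look with torch.onnx.export (tensor names must match the config; `model` is assumed to be the PyTorch densenet described above):

import torch

# dynamic_axes marks the first dimension of both tensors as variable,
# so Triton can send batches of any size up to max_batch_size.
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)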

Please refer to https://github.com/gigony/triton-test-onnx, which shows how to use it.
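
Since image_client.py assumes 3-dim image inputs, you can call the model with a generic client instead. A minimal sketch using the tritonclient Python package (the model name "densenet_onnx" is just an example):

import numpy as np
import tritonclient.http as httpclient

# With the config above, Triton treats the first dimension as the batch,
# so a request carrying 10 patches has shape [10, 3, 224, 224].
client = httpclient.InferenceServerClient(url="localhost:8000")

patches = np.random.rand(10, 3, 224, 224).astype(np.float32)

infer_input = httpclient.InferInput("input", list(patches.shape), "FP32")
infer_input.set_data_from_numpy(patches)

result = client.infer("densenet_onnx", inputs=[infer_input])
print(result.as_numpy("output").shape)  # (10, 3): one 3-class score per patch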

If your model doesn't support batched inputs and has a fixed patch size (10), your config.pbtxt would look like the one below:

platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [10, 3, 224, 224]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [10, 3]
  }
]
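
With max_batch_size: 0, Triton performs no implicit batching, so the request shape must match the config exactly; only the input declaration changes from the client sketch above:

# Fixed-shape request: exactly 10 patches, no more, no fewer.
infer_input = httpclient.InferInput("input", [10, 3, 224, 224], "FP32")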

Please see https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/model_configuration.html?highlight=max_batch_size#model-configuration.
