FormatConverterOp: out_channel_order


I want to replace the shown Python code with the Holoscan FormatConverterOp. Resizing and scaling to the desired values work as expected, but transposing does not: out_channel_order does not seem to change anything in the output. The converted tensor still has the dimensions (640, 640, 3). Do you have an idea how to fix that?

Input: [1920, 1080, 3]
Expected Output: [1, 3, 640, 640]


resize_width = 640
resize_height = 640
format_converter = FormatConverterOp(
    resize_width=resize_width,
    resize_height=resize_height,
)


# Transpose from (640, 640, 3) to (3, 640, 640) [HWC -> CHW] as expected from trt engine model
tensor = cp.transpose(tensor, axes=(2, 0, 1))

# Convert array to contiguous array
tensor = cp.ascontiguousarray(tensor, dtype=cp.dtype(self.out_dtype))
# Add batch dimension: expand (3, 640, 640) to (1, 3, 640, 640)
if tensor.ndim == 3:
    tensor = cp.expand_dims(a=tensor, axis=0)

tensor /= 255
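As a CPU-side check, the same HWC -> NCHW conversion can be exercised with plain NumPy (a sketch only; the 640x640x3 shape and float32 dtype are assumed from the snippet above, and `hwc_to_nchw` is a hypothetical helper name):

```python
import numpy as np

def hwc_to_nchw(tensor, out_dtype=np.float32):
    # Transpose from (H, W, C) to (C, H, W)
    tensor = np.transpose(tensor, axes=(2, 0, 1))
    # Make the array contiguous and cast before normalizing
    tensor = np.ascontiguousarray(tensor, dtype=out_dtype)
    # Add batch dimension: (C, H, W) -> (1, C, H, W)
    if tensor.ndim == 3:
        tensor = np.expand_dims(tensor, axis=0)
    tensor /= 255
    return tensor

frame = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)
out = hwc_to_nchw(frame)
print(out.shape)  # (1, 3, 640, 640)
```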

Thanks in advance.

Hello, the out_channel_order parameter doesn’t permute the tensor axes themselves; it permutes within the {R, G, B} / {R, G, B, A} channels, so it doesn’t apply to your use case.
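To illustrate the distinction with a small NumPy sketch (the equivalent channel reorder on CPU, not actual FormatConverterOp code): reordering channels changes which value sits in each channel slot, but the shape stays HWC.

```python
import numpy as np

# Sketch of what out_channel_order=[2, 1, 0] effectively does:
# it reorders the channel values (RGB -> BGR), not the tensor axes.
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 0] = 255  # a pure-red image in RGB

bgr = rgb[..., [2, 1, 0]]  # reindex the channel axis

print(rgb.shape)   # (4, 4, 3)
print(bgr.shape)   # (4, 4, 3) -- shape is unchanged
print(bgr[0, 0])   # red value is now in the last slot
```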

With your feedback on the need for permutation functionality, we’ll discuss internally how to improve this. In the meantime, there are two ways you could go about it:

(1) This looks like a format converter used as a preprocessor before inference. If so, as a workaround you could modify your ONNX model to include a reshape layer before the input to the model; please see an example of that in HoloHub applications/monai_endoscopic_tool_seg/scripts/ , especially lines 54-60 for adding a layer before the input.

(2) Write your own native Python Holoscan op for permuting and place it after the format converter in the application.

Let us know if the two options would unblock you for now!

Besides scaling and resizing, permuting and reshaping tensors are the operations we use most frequently in preprocessing. Native support for these in the FormatConverterOp would be very convenient.

The operator is indeed the preprocessing step before inference on a YOLO model. Since our models are used by multiple scripts, adhering to the [B x C x H x W] format is easier for us than changing the models. The shown Python code is already part of an operator that, as described in your second point, runs after the resizing and scaling done by the FormatConverterOp, and that works perfectly fine.

Thank you very much for the response.

Hi there, please note that approach (1) does not require you to change the model during your training. After training, you could use approach (1) to append additional layer(s) to the ONNX model before converting to TRT. Glad to know you’re already doing approach (2)!


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.