Clipping Annotations at Tile Edges for Mask2Former Instance Segmentation

Hello,

I’m setting up a pipeline to train Mask2Former in TAO, and since tiling is recommended by NVIDIA and I’ve seen performance boosts from it, I want to include a tiling preprocessor. However, since Mask2Former requires COCO JSONs (segmentation + bbox) rather than segmentation mask PNGs, I need to write logic that tiles the images along with the segmentations and bounding boxes associated with each tile. Do you happen to have a preprocessor that achieves this while keeping the annotations intact? I’m finding there are many edge cases to account for, which makes the logic complex.
A follow-up question: how much does bounding box accuracy matter for instance segmentation training in Mask2Former? If I’m tiling annotations and the resulting bounding boxes sometimes trace my tile/slice edges, will that affect model performance? My alternative is to compute the bounding boxes from the rasterized segmentation mask binaries, which is very expensive.
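(For what it’s worth, if the annotations are polygon-style, a tight box can be taken from the polygon coordinates directly, with no rasterization. A minimal sketch, assuming COCO polygon segmentations; the function name is my own, and RLE-encoded masks would need pycocotools instead:)

```python
# Derive a COCO bbox [x, y, w, h] straight from polygon coordinates,
# avoiding the cost of rasterizing a mask first.
# segmentation is a list of rings: [[x1, y1, x2, y2, ...], ...]

def bbox_from_polygons(segmentation):
    xs = [x for ring in segmentation for x in ring[0::2]]  # even indices = x coords
    ys = [y for ring in segmentation for y in ring[1::2]]  # odd indices = y coords
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]

# Tiny synthetic annotation for illustration (a 50x60 rectangle):
ann = {"segmentation": [[10.0, 20.0, 60.0, 20.0, 60.0, 80.0, 10.0, 80.0]]}
print(bbox_from_polygons(ann["segmentation"]))  # [10.0, 20.0, 50.0, 60.0]
```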

Hi @kianmehr.ehtiatkar2
Sorry for the late response. You can refer to Crop image but still mapping annotation · ashnair1/COCO-Assistant · Discussion #35 · GitHub. Another option is SAHI’s slice_coco, which slices the images and their COCO annotations (segmentation + bbox) together:

from sahi.slicing import slice_coco

coco_dict, coco_path = slice_coco(
    coco_annotation_file_path="/path/to/annotations.json",
    image_dir="/path/to/images",
    output_dir="/path/to/sliced_out",
    output_coco_annotation_file_name="instances_sliced",  # no need to add the .json extension
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    ignore_negative_samples=True,
    min_area_ratio=0.2,
    verbose=False,
)
print("saved:", coco_path)
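After slicing, you can sanity-check the returned dict to confirm every clipped box stays inside its tile. A minimal sketch over the standard COCO dict layout; the helper name is mine, and the tiny coco_dict below is synthetic just to make it runnable:

```python
# Check that every annotation's bbox [x, y, w, h] lies within the
# width/height of the tile image it belongs to; return offending ids.

def check_bboxes_in_bounds(coco_dict):
    images = {img["id"]: img for img in coco_dict["images"]}
    bad = []
    for ann in coco_dict["annotations"]:
        img = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]
        if x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            bad.append(ann["id"])
    return bad

# Synthetic example: one 512x512 tile, one box spilling past its right edge.
coco_dict = {
    "images": [{"id": 1, "width": 512, "height": 512}],
    "annotations": [{"id": 7, "image_id": 1, "bbox": [500, 10, 30, 30]}],
}
print(check_bboxes_in_bounds(coco_dict))  # [7]
```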