TAO-Data-Services augmented bboxes are incorrect

Using TAO Data Services to generate augmentations does not seem to produce correctly interpolated/transformed bounding boxes.
Here is a section of the original image with its mask and bbox overlaid:


Here is the same section in the augmented image:

You can see that the bounding box does not quite extend to the edge of the mask. Is this a known issue, or am I missing a key parameter while producing the augmentations? I’ve attached the augmentation spec file for reference.
aug_spec.txt (935 Bytes)

Here is my code to visualize the RLE masks in case you need to reproduce the issue:

import os
import json
import random
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from pycocotools import mask as maskUtils

def visualize_coco_rle(json_path, images_dir, num_samples=5, seed=42):
    # Load COCO data
    with open(json_path, "r") as f:
        coco = json.load(f)
    images = coco["images"]
    annotations = coco["annotations"]

    # Build mapping
    img_id2info = {img["id"]: img for img in images}
    img_id2anns = {}
    for ann in annotations:
        img_id2anns.setdefault(ann["image_id"], []).append(ann)

    # Pick a sample of images to visualize
    random.seed(seed)
    samples = [img for img in images if os.path.exists(os.path.join(images_dir, img["file_name"]))]
    if len(samples) == 0:
        print("No images found in directory!")
        return
    samples = random.sample(samples, min(num_samples, len(samples)))

    for img_info in samples:
        img_path = os.path.join(images_dir, img_info["file_name"])
        anns = img_id2anns.get(img_info["id"], [])
        image = np.array(Image.open(img_path).convert("RGB"))

        # Overlay each mask in a different color
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis("off")
        for ann in anns:
            seg = ann["segmentation"]
            # Polygon lists and uncompressed RLEs must be converted to
            # compressed RLE before decoding
            if isinstance(seg, list) or isinstance(seg.get("counts"), list):
                seg = maskUtils.frPyObjects(seg, img_info["height"], img_info["width"])
                if isinstance(seg, list):
                    seg = maskUtils.merge(seg)
            m = maskUtils.decode(seg)
            # Random color per instance
            color = np.random.rand(3,)
            mask_bool = m.astype(bool)
            # Overlay mask
            plt.imshow(np.dstack([mask_bool*color[0], mask_bool*color[1], mask_bool*color[2]]), alpha=0.4)
            # Optionally, show the bbox
            x, y, w, h = ann["bbox"]
            plt.gca().add_patch(plt.Rectangle((x, y), w, h, fill=False, edgecolor=color, linewidth=2))

        plt.title(img_info["file_name"])
        plt.show()

if __name__ == "__main__":
    augmented_json_path = "output.json"
    augmented_image_dir = "images"
    visualize_coco_rle(
        json_path=augmented_json_path,
        images_dir=augmented_image_dir,
        num_samples=5,
    )
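To quantify the gap rather than eyeball it, here is a small NumPy-only sketch (the helper names `tight_bbox` and `bbox_mask_gap` are my own, not a TAO or pycocotools API) that compares a stored COCO bbox against the tight box of a decoded binary mask:

```python
import numpy as np

def tight_bbox(mask):
    """Return the tight COCO-style [x, y, w, h] box around a binary mask,
    or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return [int(xs.min()), int(ys.min()),
            int(xs.max() - xs.min() + 1),
            int(ys.max() - ys.min() + 1)]

def bbox_mask_gap(bbox, mask):
    """Per-edge gap (in pixels) between a stored bbox and the mask's
    tight box. Positive values mean the stored box stops short of the
    mask on that edge."""
    tb = tight_bbox(mask)
    if tb is None:
        return None
    x, y, w, h = bbox
    tx, ty, tw, th = tb
    return {
        "left": x - tx,
        "top": y - ty,
        "right": (tx + tw) - (x + w),
        "bottom": (ty + th) - (y + h),
    }
```

Positive `right`/`bottom` values mean the stored box falls short of the mask on that edge, which matches what the overlay shows.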

Could you try enabling refine_box under the rotation section, e.g.:

  rotation:
    angle: [-10, 10]
    units: degrees
    refine_box:
      enabled: True
      gt_cache: /path/to/masks

You can also refer to tao_dataset_suite/nvidia_tao_ds/augment in the NVIDIA/tao_dataset_suite repository on GitHub to debug inside the Docker container:

$ tao dataset run /bin/bash

or

$ docker run -it nvcr.io/nvidia/tao/tao-toolkit:5.5.0-data-services /bin/bash

This did not solve the issue. Is this a known bug with a tracked issue? I'd rather not spend time debugging it if a fix will ship with the next update.

Additionally,

  • Is there a way to increase the number of augmented images produced per original image?
  • And is it possible to append a string to the augmented image names, to differentiate them from the originals and allow combining with the original dataset?

I will try to reproduce your result. Could you share the original image with its mask? Thanks.

There is online augmentation for mask2former. Refer to Fine-Tune the TAO v5.5.0 Mask2former Instance segmentation model on a custom dataset - #2 by Morganh.

Unfortunately I cannot share the original images for proprietary reasons. My intention isn't to modify the source code or run commands from within the containers; I ultimately want to set these up as services, so ad-hoc changes are not sustainable. Please let me know if you can replicate the issue with a dataset of your own. I've attached the annotations here for your reference; you should be able to substitute images of your own in the same format. I've tried using both polygons and RLEs for this operation, with the same result either way.
val_coco_annotations.txt (262.0 KB)
val_coco_annotations_rle.txt (141.4 KB)

I am using some COCO2017 data to try to reproduce. Will update you once I have the results.

Even with effectively no rotation/shear/translation/flip, as in the spec below,

# aug_spec.yaml
random_seed: 42
num_gpus: 1
gpu_ids: [0]
results_dir: aug_results

spatial_aug:
  rotation:
    angle: [-0.000002, 0.000002]
    units: degrees
    refine_box:
      enabled: True
      gt_cache: /localhome/local-morganh/forum_aug/annotations/label/single_263403.json
  shear:
    shear_ratio_x: [0.0]
    shear_ratio_y: [0.0]
  translation:
    translate_x: [0]           # pixels
    translate_y: [0]           # pixels
  flip:
    flip_horizontal: false #true
    flip_vertical: false #true


blur_aug:
  size: [3]
  std: [0.5]

data:
  dataset_type: coco
  image_dir: /localhome/local-morganh/forum_aug/annotations/image
  ann_path: /localhome/local-morganh/forum_aug/annotations/label/single_263403.json
  # output_dataset: /data/augmented-data/train
  output_image_width: 638
  output_image_height: 394
  batch_size: 1
  include_masks: true

$ augmentation generate -e /localhome/local-morganh/forum_aug/annotations/aug_spec.yaml

I can reproduce the bbox misalignment. Will check further.
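In the meantime, a workaround is to re-derive each bbox from its saved mask after augmentation. Below is a minimal sketch for uncompressed RLEs only (compressed RLEs should go through pycocotools instead, e.g. `maskUtils.toBbox`); `rle_to_mask` and `refit_bbox` are illustrative helper names, not part of any TAO API:

```python
import numpy as np

def rle_to_mask(rle):
    """Decode an *uncompressed* COCO RLE ({"size": [h, w], "counts": [...]})
    into a binary mask. Counts alternate runs of 0s and 1s in column-major
    (Fortran) order, starting with 0s."""
    h, w = rle["size"]
    flat = np.zeros(h * w, dtype=np.uint8)
    pos, val = 0, 0
    for run in rle["counts"]:
        flat[pos:pos + run] = val
        pos += run
        val ^= 1
    return flat.reshape((h, w), order="F")

def refit_bbox(ann):
    """Overwrite ann['bbox'] with the tight box derived from its mask."""
    ys, xs = np.nonzero(rle_to_mask(ann["segmentation"]))
    if xs.size:
        ann["bbox"] = [int(xs.min()), int(ys.min()),
                       int(xs.max() - xs.min() + 1),
                       int(ys.max() - ys.min() + 1)]
    return ann
```

Running `refit_bbox` over every annotation in the augmented JSON and re-saving it should give boxes that hug the masks again, regardless of what the augmentation pipeline wrote.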


There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.