Offline data augmentation for maskrcnn

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Mask_rcnn)
• Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.21.11
published_date: 11/08/2021

• Augmentation spec file attached as
specs_augment.txt (667 Bytes)

• How to reproduce the issue?
What is the correct way to create augmentations of a custom dataset?
I have a dataset in COCO format, which I convert to TFRecords for training. Since the dataset is very small, I want to add more augmentations for these images, so I was following the steps in offline_data_augmentation.
The command I use is:
tao augment -a /workspace/dataset/specs_augment.txt -o /workspace/dataset/cherry/augmented/ -v -d /workspace/dataset/cherry/train
I run into the error:
google.protobuf.text_format.ParseError: 2:3 : Message type "DatasetConfig" has no field named "data_sources".
What is the correct replacement for DatasetConfig, i.e. the correct way to specify the path to the TFRecords? What is the difference between specifying the dataset with -d on the command line versus specifying it in the spec file as image_directory_path?
Is there a sample file for offline augmentation for maskrcnn which I could refer to?

Could you refer to the Jupyter notebook along with its spec files via the guide in
TAO Toolkit Quick Start Guide — TAO Toolkit 3.22.05 documentation ?

Thank you for your reply @Morganh. I did check the Jupyter notebooks, but I couldn’t find any examples of offline augmentation in them.

For offline augmentation I tried the spec file given here: running the augmenter tool, but it does not generate any augmented images and just gives the output:
2022-05-06 13:59:45,110 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

The spec file has an option, augment_input_data: True, to augment while training, but is there a way to modify these augmentations, as with the augmentation_config for FasterRCNN?

The spec files are included when you download the Jupyter notebooks.
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/cv_samples/versions/v1.3.0/zip -O cv_samples_v1.3.0.zip
unzip -u cv_samples_v1.3.0.zip -d ./cv_samples_v1.3.0 && rm -rf cv_samples_v1.3.0.zip && cd ./cv_samples_v1.3.0

Thank you for your reply @Morganh. I found the example files for offline augmentation in cv_samples_v1.3.0. From what I understand, it needs data in KITTI format, whereas the data I have is in COCO or TFRecords format, so maybe offline augmentation fails because of that.
It would help me if I could do online augmentation instead, since it is also supposed to give better results. What is the correct format for specifying the augmentation_config specs?
I followed the example given in cv_samples_v1.3.0/unet/specs/unet_train_resnet_unet_isbi.txt which specifies it as:
dataset_config { dataset: "custom" augment: False augmentation_config {
and I run into the error:
google.protobuf.text_format.ParseError: 42:1 : Message type "Experiment" has no field named "dataset_config".
2022-05-10 10:27:19,681 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

If I follow the syntax example here cv_samples_v1.3.0/, :
augmentation_config { output_width: 96 output_height: 48 output_channel:
I run into the error:
google.protobuf.text_format.ParseError: 42:1 : Message type "Experiment" has no field named "augmentation_config".
2022-05-10 10:38:45,078 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
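
A note on reading these errors: the position 42:1 is line 42, column 1 of the spec file, and the message type named in the error (Experiment) appears to be the root of the maskrcnn spec, so the parser is rejecting a block at the top level of the file. A minimal Python sketch, not part of TAO, that lists the top-level field names in a text-format spec so they can be compared against the sample maskrcnn spec. The function name top_level_fields and the sample spec contents are made up for illustration, and the brace handling is deliberately naive:

```python
import re


def top_level_fields(spec_text):
    """Return (line_number, field_name) pairs found at brace depth 0."""
    fields = []
    depth = 0
    for lineno, raw in enumerate(spec_text.splitlines(), start=1):
        # Blank out quoted strings and drop '#' comments so that any
        # braces or colons inside them are ignored.
        line = re.sub(r'"[^"]*"', '""', raw)
        line = line.split("#", 1)[0]
        # Tokens of interest: a field name (followed by ':' or '{'),
        # an opening brace, or a closing brace.
        for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*(?=\s*[:{])|[{}]", line):
            if token == "{":
                depth += 1
            elif token == "}":
                depth -= 1
            elif depth == 0:
                fields.append((lineno, token))
    return fields


# Illustrative spec only; it is not a complete or validated maskrcnn spec.
SAMPLE_SPEC = """\
seed: 123
data_config {
  image_size: "(832, 1344)"
  augment_input_data: True
}
augmentation_config {
  spatial_augmentation {
    hflip_probability : 0.5
  }
}
"""

for lineno, name in top_level_fields(SAMPLE_SPEC):
    print(f"line {lineno}: {name}")
```

Any name this prints that does not appear at the top level of the sample maskrcnn spec is a likely cause of a "has no field named" error.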

I couldn’t find a specific syntax for online augmentation for maskrcnn. I am using the command from the maskrcnn example Jupyter notebook in the samples to train:
tao mask_rcnn train -e $SPECS_DIR/specs_cherry_augment.txt \
                    -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned_augmented \
                    -k $KEY \
                    --gpus 1

For maskrcnn, there is a parameter as below.
augment_input_data: True

This enables online augmentation for maskrcnn.

Thank you @Morganh. I have set this, but how can I customize it? For example, I would like to add something like what is done in the file unet/specs/unet_train_resnet_isbi.txt, but for online augmentation in maskrcnn:

  augmentation_config {
    spatial_augmentation {
      hflip_probability : 0.5
      vflip_probability : 0.5
      crop_and_resize_prob : 0.5
    }
    brightness_augmentation {
      delta: 0.2
    }
  }

Currently there is no specific parameter for customizing this. I will sync with the internal team about your request.
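
Until a parameter like this is available, one possible workaround is to augment the COCO dataset yourself before converting it to TFRecords. A minimal Python sketch of a horizontal flip for a single COCO annotation, assuming polygon segmentations (flat [x1, y1, x2, y2, ...] lists) and bboxes in COCO [x, y, width, height] form. The function name hflip_coco_annotation is hypothetical and not part of TAO, and RLE masks are not handled:

```python
import copy


def hflip_coco_annotation(ann, image_width):
    """Return a horizontally flipped copy of one COCO annotation dict."""
    flipped = copy.deepcopy(ann)
    x, y, w, h = ann["bbox"]
    # The new left edge is the mirror image of the old right edge.
    flipped["bbox"] = [image_width - (x + w), y, w, h]
    # Polygons are flat lists: even indices are x values, odd are y values.
    flipped["segmentation"] = [
        [image_width - v if i % 2 == 0 else v for i, v in enumerate(poly)]
        for poly in ann["segmentation"]
    ]
    return flipped


# Illustrative annotation for a 100-pixel-wide image.
example = {
    "bbox": [10, 20, 30, 40],
    "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]],
}
print(hflip_coco_annotation(example, image_width=100))
```

The image itself has to be flipped the same way with an image library, and each flipped copy needs its own image id and annotation ids in the COCO JSON before the dataset is converted to TFRecords.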