Wrong Offline Data Augmentation Documentation

kyuan2023 · January 4, 2024, 7:45am

• Hardware: A4000
• Network Type: N/A
• TAO Version: toolkit_version: 5.2.0

As the latest TAO doesn’t support rotation/shear online augmentation, I have to do offline augmentation. There are two issues:

• The offline augmentation page linked from the 5.2 docs uses a deprecated example command, “tao augment”: Offline Data Augmentation - NVIDIA Docs. It took me a while to figure out that tao dataset augmentation generate is the correct one. Please update this.
• Even with tao dataset augmentation generate, the arguments shown by tao dataset augmentation generate --help are very different from what is documented here: Offline Data Augmentation - NVIDIA Docs. Again, this is very misleading.

After spending many hours, I found a working command but cannot get the YAML parsed correctly:

• There is no example provided in Offline Data Augmentation - NVIDIA Docs.
• The example provided in Offline Data Augmentation - NVIDIA Docs doesn’t work.

Below is the spec I tried, which did not work:

spatial_aug:
  rotation:
    angle: 5
    units: degrees
  shear:
    shear_ratio_x: 0.3
data:
  dataset_type: coco
  image_dir: /workspace/tao/data/images
  anno_path: /workspace/tao/data/output.json
  output_dataset: /workspace/tao/data/out
  batch_size: 8
  include_masks: false

It throws weird errors:

2024-01-04 02:35:24,610 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2024-01-04 02:35:24,691 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.2.0-data-services
2024-01-04 02:35:25,298 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
sys:1: UserWarning: 
'offline_data_augment.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
sys:1: UserWarning: 
'offline_data_augment.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
/usr/local/lib/python3.10/dist-packages/nvidia_tao_ds/core/hydra/hydra_runner.py:105: UserWarning: 
'offline_data_augment.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  _run_hydra(
/usr/local/lib/python3.10/dist-packages/nvidia_tao_ds/core/hydra/hydra_runner.py:105: UserWarning: 
'offline_data_augment.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  _run_hydra(
Error merging 'offline_data_augment.yaml' with schema
Invalid value assigned: AnyNode is not a ListConfig, list or tuple.
    full_key: spatial_aug.rotation.angle
    reference_type=RotationConfig
    object_type=RotationConfig

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Error merging 'offline_data_augment.yaml' with schema
Invalid value assigned: AnyNode is not a ListConfig, list or tuple.
    full_key: spatial_aug.rotation.angle
    reference_type=RotationConfig
    object_type=RotationConfig

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[740,1],0]
  Exit code:    1
--------------------------------------------------------------------------
Sending telemetry data.
Execution status: FAIL
2024-01-04 02:35:31,674 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

Morganh · January 5, 2024, 6:42am

Thanks for catching this. The documentation has not been updated for TAO 5.x; we will update it. Currently, augmentation has to be launched through tao dataset. You can also run docker run nvcr.io/nvidia/tao/tao-toolkit:5.2.0-data-services to log in to the container.
More information can be found in TAO Toolkit Launcher - NVIDIA Docs.

The current help output is shown below.

augmentation -h
usage: augmentation [-h] -e EXPERIMENT_SPEC_FILE [--gpu_ids GPU_IDS] [--num_gpus NUM_GPUS] [-o OUTPUT_SPECS_DIR] [--mpirun_arg MPIRUN_ARG] [--launch_cuda_blocking] {generate}
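
Going by the usage line above and the command reported earlier in the thread, the launcher invocation can be sketched roughly as follows. This only assembles and prints the command; it does not run it. The spec directory /workspace/tao/specs is a hypothetical path (only the file name offline_data_augment.yaml appears in the log), and placing the flags after generate is an assumption.

```python
import shlex


def build_augment_command(spec_path, num_gpus=None):
    """Assemble the launcher command as an argument list (nothing is executed)."""
    # "tao dataset augmentation generate" is the command named in this thread;
    # -e and --num_gpus are taken from the usage line in the help output.
    cmd = ["tao", "dataset", "augmentation", "generate", "-e", spec_path]
    if num_gpus is not None:
        cmd += ["--num_gpus", str(num_gpus)]
    return cmd


# Hypothetical spec location; adjust to wherever the YAML file actually lives.
cmd = build_augment_command("/workspace/tao/specs/offline_data_augment.yaml")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it, assuming the tao launcher is installed and on PATH.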

Its source code is at tao_dataset_suite/nvidia_tao_ds/augment/entrypoint/augment.py at main · NVIDIA/tao_dataset_suite · GitHub.

The angle needs to be a list, according to tao_dataset_suite/nvidia_tao_ds/augment/config/default_config.py at main · NVIDIA/tao_dataset_suite · GitHub. Please change angle to a list and retry.
For a working spec, you can refer to tao_tutorials/notebooks/tao_data_services/specs/augment.yaml at main · NVIDIA/tao_tutorials · GitHub
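
As a rough illustration of the change, the sketch below takes the spatial_aug section from the spec posted earlier (written as a plain Python dict) and wraps the scalar angle in a list. The helper wrap_as_list is hypothetical and not part of TAO; it only shows the shape the schema is said to expect.

```python
from copy import deepcopy

# spatial_aug section of the spec posted earlier in the thread, as a plain dict.
spec = {
    "spatial_aug": {
        "rotation": {"angle": 5, "units": "degrees"},
        "shear": {"shear_ratio_x": 0.3},
    }
}


def wrap_as_list(config, dotted_key):
    """Return a copy of config whose value at dotted_key is a list.

    Hypothetical helper for illustration only; it is not a TAO API.
    """
    fixed = deepcopy(config)
    *parents, leaf = dotted_key.split(".")
    node = fixed
    for name in parents:
        node = node[name]
    # Leave values that are already lists or tuples untouched.
    if not isinstance(node[leaf], (list, tuple)):
        node[leaf] = [node[leaf]]
    return fixed


fixed_spec = wrap_as_list(spec, "spatial_aug.rotation.angle")
print(fixed_spec["spatial_aug"]["rotation"])
```

In the YAML file itself this corresponds to writing angle: [5] instead of angle: 5. Whether shear_ratio_x needs the same treatment is not confirmed in this thread; if the same "is not a ListConfig, list or tuple" error appears for that key, the same change would be the first thing to try.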

For more information, refer to
tao_tutorials/notebooks/tao_data_services/kitti.ipynb at main · NVIDIA/tao_tutorials · GitHub.

kyuan2023 · January 5, 2024, 7:13am

Thanks, I will give it a try.

A few more questions:

• It is quite inconvenient to do rotation/shear augmentation offline. Do you have any plan to make it part of the online operations?
• There are other very useful data augmentations, such as copy-paste. Is there any plan to add support for them?
• Is that part of the code open source anywhere? Is there any chance the community can contribute?

Morganh · January 5, 2024, 7:36am

All the code is open source. You can find it at the bottom of NVIDIA Corporation · GitHub. You can find the corresponding Docker images in TAO Toolkit | NVIDIA NGC.
Online augmentations are already built into the networks.

kyuan2023 · January 5, 2024, 5:23pm

Thanks, this is very helpful.

If I would like to use my own fork of GitHub - NVIDIA/tao_tensorflow1_backend: TAO Toolkit deep learning networks with TensorFlow 1.x backend, with some customized enhancements, would I be able to build it myself and use the TAO Toolkit to wrap it? Is there any documentation about this?

Morganh · January 6, 2024, 9:54am

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks

Yes, you can.
You can log in to the container, modify the files, and then run docker commit to save your custom image.
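
The steps above can be sketched as the two docker commands below, assembled and printed from Python rather than executed. The image tag is the data-services one mentioned earlier in this thread and is used only as an example (the image for the TensorFlow 1.x backend is not named here); the container name and the new image tag are hypothetical, and GPU or volume flags are left out.

```python
import shlex


def custom_image_steps(base_image, container_name, new_image):
    """Return the docker commands for the run / edit / commit workflow."""
    return [
        # Start an interactive session in the base image under a known name.
        ["docker", "run", "-it", "--name", container_name, base_image],
        # After editing files inside the container and exiting, snapshot it.
        ["docker", "commit", container_name, new_image],
    ]


steps = custom_image_steps(
    "nvcr.io/nvidia/tao/tao-toolkit:5.2.0-data-services",  # example image from this thread
    "tao-custom",              # hypothetical container name
    "tao-toolkit-custom:dev",  # hypothetical tag for the saved image
)
for step in steps:
    print(shlex.join(step))
```

The committed image can then be started with docker run in the same way as the original one.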

system · January 23, 2024, 1:34am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
