Experiment Spec File: meaning of zoom_min and zoom_max

motyaedu · November 26, 2020, 11:30am

I am creating an experiment spec file for training an object detection model using TLT. However, I do not fully understand some of the parameters. What exactly do zoom_min and zoom_max mean?

I think that values above 1 do cropping (just like tf.image.crop_and_resize), and values below 1 do padding, but it’s not clear enough to me.

For instance, using the following configuration, do I enlarge or shrink the bounding boxes?

augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 2.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
}

Morganh · November 27, 2020, 6:35am

Below are the descriptions.

zoom_min/max (float): Minimum/maximum zoom ratios. Set min = max = 1 to keep original size.
translate_max_x/y (float): Maximum translation along the x/y axis in pixel values.

Zoom operations are random.
Zoom operations stretch from (0,0) toward output_image_width/output_image_height.
If zoom is randomly set above 1, you can consider it as ‘zooming out’ (the image gets rendered smaller than the canvas).
If zoom is randomly set below 1, you can consider it as ‘zooming in’ (the image gets rendered bigger than the canvas).
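
To make the direction of the zoom ratio concrete, here is a minimal Python sketch. It is not TLT code: the helper names `rendered_size` and `scale_bbox` are hypothetical, and the 1/zoom scaling is an assumption chosen only because it matches the description above (ratio above 1 renders the image smaller, ratio below 1 renders it bigger, anchored at (0,0)).

```python
def rendered_size(width, height, zoom):
    """Size at which the image is drawn on the canvas.

    ASSUMPTION: the drawn size scales as 1/zoom, so zoom > 1 shrinks the
    image ("zooming out") and zoom < 1 enlarges it ("zooming in").
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return width / zoom, height / zoom


def scale_bbox(bbox, zoom):
    """Scale an (x1, y1, x2, y2) box with the image, anchored at (0, 0)."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    x1, y1, x2, y2 = bbox
    return (x1 / zoom, y1 / zoom, x2 / zoom, y2 / zoom)


if __name__ == "__main__":
    # Extremes of the configuration in the question: zoom_min 1.0, zoom_max 2.0.
    for zoom in (1.0, 2.0):
        print(zoom, rendered_size(960, 544, zoom), scale_bbox((100, 50, 300, 250), zoom))
```

Under that assumption, zoom_min: 1.0 with zoom_max: 2.0 would leave the bounding boxes at their original size or shrink them to as little as half size, and would never enlarge them.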

motyaedu · November 27, 2020, 8:35am

Thank you @Morganh, now it’s clear to me. However, I have another question related to your image: is the zoomed-out image always extracted from the top-left corner? Do I need to set large values of translate_max_x/y in order to get crops of other regions of the image?

Morganh · November 30, 2020, 7:23am

Yes, zoom operations stretch from (0,0) toward output_image_width/output_image_height.

Translate operations are also random. Please keep in mind that translate and zoom are mutually independent.

The order of the spatial augmentation is:
crop → flip (if any) → rotate (if any) → zoom (if any) → translate (if any) → crop to output_image_width/output_image_height

The first crop is the one defined with “crop_right” and “crop_bottom”. The second crop is the one that brings the image to the output dimensions.
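
The ordering above can be sketched for a single point, for example one corner of a bounding box. This is only an illustration under stated assumptions, not the TLT implementation: `transform_point` is a hypothetical helper, rotation and both crops are left out, the flip is horizontal about the image width, and zoom is assumed to divide coordinates (ratio above 1 shrinks) with (0,0) as the anchor.

```python
def transform_point(x, y, image_width, hflip, zoom, tx, ty):
    """Apply flip -> zoom -> translate to one point, in that order.

    ASSUMPTIONS (not taken from TLT source code):
      - horizontal flip mirrors x about the image width,
      - zoom divides coordinates, anchored at (0, 0),
      - translation is a plain pixel offset applied after the zoom.
    Rotation and the two crop steps are omitted to keep the sketch short.
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    if hflip:
        x = image_width - x
    x, y = x / zoom, y / zoom
    return x + tx, y + ty


if __name__ == "__main__":
    # Same point, with and without the flip; zoom and translate stay independent.
    print(transform_point(100, 50, 960, False, 2.0, 8.0, -8.0))
    print(transform_point(100, 50, 960, True, 2.0, 8.0, -8.0))
```

Because the translation is applied after the zoom in this sketch, the pixel offset is not scaled by the zoom ratio, which is one way to read "mutually independent".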

motyaedu · November 30, 2020, 8:41am

Great! I just have one more question… Is the first crop fixed to 0:crop_bottom, 0:crop_right, or does it take random values between 0 and crop_bottom/crop_right? I mean, is it an augmentation or a global cropping of the dataset?

Morganh · November 30, 2020, 4:17pm

The first crop operation is a preprocessing operation to extract the region of interest (ROI) that you would like to train on from each image in the dataset. We allow you to set crop_left, crop_top, crop_right and crop_bottom to define the left, top, right and bottom edges of the cropped ROI in the preprocessing section of the augmentation_config.
More info in Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation

motyaedu · November 30, 2020, 4:19pm

Perfect, thank you very much, now it is resolved.

AdithyaP · May 7, 2021, 12:39pm

@Morganh @motyaedu

I see that the docs mention:

The net tensors generated from the pre-processing blocks are then passed through a pipeline of random augmentations in spatial and color domains.

I have a few questions regarding augmentation_config:

Does this augmentation just modify original input images or add new images after applying specified operations?
Are all enabled spatial augmentation and color augmentation operations applied at once sequentially for every single image?

Morganh · May 8, 2021, 3:17am

It just applies the specified operations to the original input images.
The spatial and color transformation matrices are computed per image.
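
A rough Python sketch of what "computed per image" could look like: every image gets its own randomly drawn parameters, and the number of images stays the same. The helper names `sample_params` and `augment_epoch` are hypothetical, and the uniform distributions and the symmetric translation range [-max, +max] are assumptions for illustration, not documented TLT behaviour.

```python
import random


def sample_params(cfg, rng):
    """Draw one independent set of spatial parameters for a single image.

    ASSUMPTION: uniform sampling, and translation drawn from [-max, +max].
    """
    return {
        "hflip": rng.random() < cfg["hflip_probability"],
        "zoom": rng.uniform(cfg["zoom_min"], cfg["zoom_max"]),
        "tx": rng.uniform(-cfg["translate_max_x"], cfg["translate_max_x"]),
        "ty": rng.uniform(-cfg["translate_max_y"], cfg["translate_max_y"]),
    }


def augment_epoch(image_ids, cfg, seed):
    """Pair every image with freshly sampled parameters; no images are added."""
    rng = random.Random(seed)
    return [(image_id, sample_params(cfg, rng)) for image_id in image_ids]


if __name__ == "__main__":
    cfg = {
        "hflip_probability": 0.5,
        "zoom_min": 1.0,
        "zoom_max": 2.0,
        "translate_max_x": 8.0,
        "translate_max_y": 8.0,
    }
    for image_id, params in augment_epoch(["img_0", "img_1", "img_2"], cfg, seed=0):
        print(image_id, params)
```

In this sketch the epoch still contains exactly the input images; only the transformation applied to each one changes from draw to draw.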
