I am creating an experiment spec file for training an object detection model using TLT. However, I do not fully understand some of the parameters. What exactly do zoom_min and zoom_max mean?
I think that values above 1 crop the image (just like tf.image.crop_and_resize) and values below 1 pad it, but this is not clear enough to me.
For instance, with the following configuration, are the bounding boxes enlarged or shrunk?
augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 2.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
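
To make my guess concrete, here is a minimal Python sketch of what I assume a zoom factor does to a bounding box. This is only my interpretation, not TLT's actual implementation: the function name zoom_bbox is hypothetical, and I am assuming that the zoom is applied about the image centre and that boxes are clipped to the 960x544 output frame.

```python
def zoom_bbox(bbox, zoom, width=960, height=544):
    """Scale a box (x1, y1, x2, y2) about the image centre by `zoom`.

    Assumption (not confirmed for TLT): zoom > 1 magnifies the image, so
    boxes grow and content near the border is cropped away; zoom < 1
    shrinks the image, so boxes shrink and the border is padded.
    """
    cx, cy = width / 2.0, height / 2.0
    x1, y1, x2, y2 = bbox

    # Move each corner away from (zoom > 1) or towards (zoom < 1) the centre.
    scaled = (
        cx + (x1 - cx) * zoom,
        cy + (y1 - cy) * zoom,
        cx + (x2 - cx) * zoom,
        cy + (y2 - cy) * zoom,
    )

    # Clip to the output frame, which is what cropping would do to a box
    # that partially leaves the image.
    return (
        min(max(scaled[0], 0.0), float(width)),
        min(max(scaled[1], 0.0), float(height)),
        min(max(scaled[2], 0.0), float(width)),
        min(max(scaled[3], 0.0), float(height)),
    )


box = (380.0, 172.0, 580.0, 372.0)  # a 200x200 box centred in the image
print(zoom_bbox(box, 1.0))  # unchanged under my assumption
print(zoom_bbox(box, 2.0))  # enlarged to 400x400 under my assumption
```

If this reading is right, then zoom_min: 1.0 and zoom_max: 2.0 would only ever enlarge the boxes (by a factor between 1 and 2) and never shrink them. Is that what actually happens?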