Hi,
I have questions about input image resizing in TAO.
1) What operation is done to resize input images?
2) Is this operation performed on the CPU or the GPU?
3) The spec files suggest certain input image shapes. What are the reasons and concerns behind these suggestions? Model accuracy? Speed? Network design limitations?
4) Does resizing keep the aspect ratio?
5) Are similar operations performed for image preprocessing during validation and inference?
For the above questions, may I know which networks you are focusing on?
Object detection: YOLOv3, YOLOv4, Tiny YOLOv4, SSD, Faster R-CNN, RetinaNet
Segmentation: Mask R-CNN, UNet
Hi. I have another question:
Does the online augmentation resize the images, or are the images cropped or padded with zeros?
You can check the Augmentation Config for the online data augmentation. Usually the augmentation module provides some basic preprocessing and augmentation during training.
For example, in SSD/DSSD/RetinaNet, the augmentation_config parameter defines the image size after preprocessing. The augmentation methods from the SSD paper are performed during training, including random flip, zoom-in, zoom-out, and color jittering, and the augmented images are then resized to the output shape defined in augmentation_config. During evaluation, only the resize is performed.
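To make the evaluation-time behavior concrete: because the output shape is fixed by augmentation_config, the resize scales height and width independently, so the aspect ratio is not preserved in general. The sketch below is illustrative only (a plain nearest-neighbour resize in NumPy, not the TAO implementation):

```python
import numpy as np

def resize_to_fixed_shape(img, out_h, out_w):
    """Nearest-neighbour resize to a fixed (out_h, out_w) shape.

    Illustrative stand-in for the evaluation-time resize step: the output
    shape comes from the spec file, so the aspect ratio is NOT preserved
    unless the input already matches the target ratio.
    """
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]

# A 375x1242 frame squeezed into 384x1248: both axes scale independently,
# so a square object becomes slightly non-square after the resize.
img = np.zeros((375, 1242, 3), dtype=np.uint8)
out = resize_to_fixed_shape(img, 384, 1248)
print(out.shape)  # (384, 1248, 3)
```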
Thank you so much.
Please help with questions 2 and 3 as well.
The augmentation is performed on the GPU.
The "input size requirement" comes from the design of the network. For example, the YOLOv4 network requires the input image resolution to be a multiple of 32.
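The multiple-of-32 constraint follows from YOLOv4's downsampling stride. A small hedged helper (not a TAO API, just a sketch of the arithmetic) shows how an arbitrary dimension would be rounded up to satisfy it:

```python
def round_to_stride(dim, stride=32):
    """Round a dimension up to the nearest multiple of the network stride.

    YOLOv4-style backbones downsample the input by a factor of 32, so both
    height and width must be multiples of 32. Illustrative helper only.
    """
    return ((dim + stride - 1) // stride) * stride

print(round_to_stride(1242))  # 1248
print(round_to_stride(375))   # 384
print(round_to_stride(384))   # 384 (already valid)
```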
Thank you. I understand the network requirements. I mean the default sizes written in the spec files. For example, this config is used in the YOLOv4 spec file:
output_width: 1248
output_height: 384
It depends on your dataset. The Jupyter notebook trains on the public KITTI dataset, which contains 1248x384 images, so the spec file sets output_width to 1248 and output_height to 384.
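For readers comparing against raw KITTI frames (commonly around 1242x375 before the notebook's preprocessing), a quick check shows that resizing to the spec defaults barely changes the aspect ratio, which also ties back to question 4:

```python
# Assumed raw KITTI frame size of roughly 1242x375 (widths vary slightly
# per sequence); 1248x384 is the nearest shape with both sides a multiple
# of 32. Each axis is scaled independently, so the distortion is small
# but nonzero.
scale_w = 1248 / 1242
scale_h = 384 / 375
print(round(scale_w, 4), round(scale_h, 4))  # 1.0048 1.024
```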