Optimal width and height of the images

What is the optimal width and height of the images for the following models?

  • Object detection: yolo_v3, yolo_v4, retinanet, ssd, and faster_rcnn
  • Segmentation: unet and mask_rcnn
  • Classification

Do you mean the resolution of the training images or the input_size of the model you want to train?

Please tell me about both of them.

For the input_size of the model (a small sketch for checking these size constraints follows the list):

Faster_rcnn: see FasterRCNN — TAO Toolkit 3.22.05 documentation

  • Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128)

yolo_v3 or yolo_v4: see YOLOv3 — TAO Toolkit 3.22.05 documentation

  • Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)

ssd: see SSD — TAO Toolkit 3.22.05 documentation

  • Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128)

retinanet: see RetinaNet — TAO Toolkit 3.22.05 documentation

  • Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)

Mask_rcnn: see https://docs.nvidia.com/tao/tao-toolkit/text/instance_segmentation/mask_rcnn.html#input-requirement

  • Input size: C * W * H (where C = 3, W >= 128, H >= 128, and W, H are multiples of 2^max_level; e.g., with max_level = 6, W and H must be multiples of 2^6 = 64)

Unet: see UNET — TAO Toolkit 3.22.05 documentation

  • Input size: C * W * H (where C = 1 or 3; W = 572, H = 572 for vanilla unet; W >= 128, H >= 128 and W, H multiples of 32 for other archs)
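
Not a TAO utility, just a minimal Python sketch of how you could snap a desired resolution to the constraints above (minimum 128, multiples of 32 for yolo_v3/yolo_v4, retinanet, and non-vanilla unet; pass multiple=64 for mask_rcnn with max_level = 6). The helper name is mine.

```python
# Minimal sketch (not part of TAO): snap a target dimension to the
# nearest valid model input dimension under the constraints above.
def valid_input_dim(target: int, multiple: int = 32, minimum: int = 128) -> int:
    """Round `target` to the nearest multiple of `multiple`, at least `minimum`."""
    return max(minimum, round(target / multiple) * multiple)

# Example: a 1920x1080 source. 1080 is not a multiple of 32, so the spec
# file would need e.g. 1920x1088 (or a smaller size such as 960x544).
print(valid_input_dim(1920), valid_input_dim(1080))  # -> 1920 1088
print(valid_input_dim(1080, multiple=64))            # mask_rcnn, max_level=6 -> 1088
```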

For the resolution of the training images

Faster_rcnn: see FasterRCNN — TAO Toolkit 3.22.05 documentation. With a static input shape, you can either resize the images offline to the target resolution or enable automatic resizing during training (a sketch of offline resizing follows).
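
For illustration only (not from the docs): a minimal sketch of the offline option, assuming KITTI-format labels and Pillow. The paths, target size, and function name are hypothetical.

```python
# Minimal sketch (assumes KITTI-format labels and Pillow): resize one image
# to a fixed input shape and scale its bounding boxes to match.
from pathlib import Path
from PIL import Image

TARGET_W, TARGET_H = 960, 544  # hypothetical static input shape

def resize_sample(img_path: Path, label_path: Path,
                  out_img: Path, out_label: Path) -> None:
    img = Image.open(img_path)
    sx, sy = TARGET_W / img.width, TARGET_H / img.height
    img.resize((TARGET_W, TARGET_H), Image.BILINEAR).save(out_img)

    out_lines = []
    for line in label_path.read_text().splitlines():
        f = line.split()
        # KITTI bbox fields (left, top, right, bottom) sit at indices 4..7
        f[4] = f"{float(f[4]) * sx:.2f}"
        f[6] = f"{float(f[6]) * sx:.2f}"
        f[5] = f"{float(f[5]) * sy:.2f}"
        f[7] = f"{float(f[7]) * sy:.2f}"
        out_lines.append(" ".join(f))
    out_label.write_text("\n".join(out_lines) + "\n")
```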

yolo_v3, yolo_v4, retinanet, ssd: you do not need to resize images/labels; they are resized automatically during training.

Mask_rcnn or Unet: the images and masks do not need to match the model input size; they will be resized to the model input size during training.
