When I pass the images as a command-line argument, do I need to resize them beforehand, or will they automatically be resized to match the input size of the model?
If you want to run tlt-train on a detection network, you must resize the images and labels offline.
See the TLT user guide for more details.
Object Detection
DetectNet_v2
Input size: C * W * H (where C = 1 or 3, W >= 480, H >= 272, and W, H are multiples of 16)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
FasterRCNN
Input size: C * W * H (where C = 1 or 3; W >= 160; H >= 160)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
SSD
Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
DSSD
Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
YOLOv3
Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
RetinaNet
Input size: C * W * H (where C = 1 or 3, W >= 128, H >= 128, and W, H are multiples of 32)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
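
Before resizing a whole dataset, it can help to check that the chosen training size satisfies the constraints listed above. The following is only a rough sketch: the dictionary keys and the function name are made up for illustration and are not part of TLT, and FasterRCNN is given a multiple of 1 because no multiple-of constraint is listed for it.

```python
# (min_width, min_height, required_multiple) per network, copied from the
# list above. The keys are illustrative names, not TLT identifiers.
SIZE_RULES = {
    "detectnet_v2": (480, 272, 16),
    "faster_rcnn": (160, 160, 1),  # no multiple-of constraint listed
    "ssd": (128, 128, 32),
    "dssd": (128, 128, 32),
    "yolov3": (128, 128, 32),
    "retinanet": (128, 128, 32),
}


def check_input_size(network, width, height, channels=3):
    """Return a list of problems; an empty list means the size looks valid."""
    min_w, min_h, multiple = SIZE_RULES[network]
    problems = []
    if channels not in (1, 3):
        problems.append(f"channels must be 1 or 3, got {channels}")
    if width < min_w:
        problems.append(f"width {width} is below the minimum {min_w}")
    if height < min_h:
        problems.append(f"height {height} is below the minimum {min_h}")
    if width % multiple or height % multiple:
        problems.append(f"width and height must be multiples of {multiple}")
    return problems


if __name__ == "__main__":
    print(check_input_size("detectnet_v2", 960, 544))
    print(check_input_size("ssd", 300, 300))
```

A size such as 300 x 300 meets the SSD minimum but is flagged here because 300 is not a multiple of 32.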
For tlt-infer, you do not need to resize the images.
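
For the training case, the offline resize and bounding-box scaling described in the notes above might look roughly like this. This is a sketch, not the TLT tooling: the function names and paths are hypothetical, Pillow is assumed to be installed for the image step, and the image is stretched to the target size without preserving the aspect ratio. It relies on the KITTI detection layout, where fields 5 to 8 of each label line hold the 2D box as left, top, right, bottom in pixels.

```python
from pathlib import Path


def scale_kitti_line(line, sx, sy):
    """Scale the 2D box (fields 5-8: left, top, right, bottom) of one KITTI line."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"not a KITTI detection line: {line!r}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    fields[4:8] = [
        f"{left * sx:.2f}",
        f"{top * sy:.2f}",
        f"{right * sx:.2f}",
        f"{bottom * sy:.2f}",
    ]
    return " ".join(fields)


def resize_sample(image_path, label_path, out_image_path, out_label_path,
                  width, height):
    """Resize one image to (width, height) and rescale its KITTI label file."""
    # Assumption: Pillow is available. Imported here so that the label
    # scaling above can be used without it.
    from PIL import Image

    with Image.open(image_path) as img:
        old_w, old_h = img.size
        img.resize((width, height)).save(out_image_path)

    # Boxes are scaled by the same per-axis factors as the image.
    sx, sy = width / old_w, height / old_h
    lines = Path(label_path).read_text().splitlines()
    scaled = [scale_kitti_line(l, sx, sy) for l in lines if l.strip()]
    Path(out_label_path).write_text("".join(s + "\n" for s in scaled))
```

For example, halving the width and doubling the height turns a box of `100.00 50.00 200.00 150.00` into `50.00 100.00 100.00 300.00`, while the other label fields are left unchanged.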