Questions regarding the preparation of images for training a YOLOv4 model with the TAO Toolkit

I plan to train a YOLOv4 model using the TAO Toolkit, but I have some questions about preprocessing my images.

  1. I will use my model on an HD video stream (1920x1080). Is it still OK to make the model's input dimensions smaller to shorten training time? (For example, if I train the model with 1364x768 images.)

  2. If the input dimensions of my model have a different aspect ratio than 16:9, will the model see distorted images when I infer on the HD video? Can this affect performance? (My understanding is that it will.)

  3. I know that with YOLOv4, TAO resizes all input images during the augmentation process to match the model's input dimensions (distorting the image if necessary). But if my images have various resolutions and aspect ratios, my understanding is that I should beforehand resize them all to the model's input dimensions, or at the very least to the same aspect ratio, so that my training images don't get distorted. Is that correct?

  4. If yes, and I have an image that is originally smaller (or has one dimension smaller) than the input dimensions, what would be the best thing to do? Upscale the image enough that I can crop a portion the size of the input dimensions (potentially cutting out part of the annotated object), or is it possible to add padding to the image instead?


    To be fair, both feel wrong. Is there a better solution, or should I just exclude those images from my dataset?
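For reference, the padding option from question 4 can be sketched as follows. This is a minimal NumPy example, not TAO's own pipeline; the target sizes are illustrative (both multiples of 32). Padding only on the right and bottom keeps existing bounding-box coordinates valid without adjustment.

```python
import numpy as np

def pad_to_input_dim(image, target_h, target_w, fill=0):
    """Pad an HxWx3 image on the right/bottom to reach the target size."""
    h, w = image.shape[:2]
    if h > target_h or w > target_w:
        raise ValueError("image is larger than the target size")
    return np.pad(
        image,
        ((0, target_h - h), (0, target_w - w), (0, 0)),
        mode="constant",
        constant_values=fill,
    )

# Example: pad a 600x900 image up to 768x1344.
small = np.zeros((600, 900, 3), dtype=np.uint8)
padded = pad_to_input_dim(small, 768, 1344)
print(padded.shape)  # (768, 1344, 3)
```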

  1. Yes, it is possible to set a smaller output_width and output_height in the spec file, but make sure each is a multiple of 32.
  2. Please refer to tao_tensorflow1_backend/nvidia_tao_tf1/cv/yolo_v4/scripts/inference.py at main · NVIDIA/tao_tensorflow1_backend · GitHub and tao_tensorflow1_backend/nvidia_tao_tf1/cv/common/inferencer/inferencer.py at main · NVIDIA/tao_tensorflow1_backend · GitHub; the inferencer can load an image either keeping the aspect ratio or not.
  3. Refer to tao_tensorflow1_backend/nvidia_tao_tf1/cv/yolo_v4/dataio/data_sequence.py at main · NVIDIA/tao_tensorflow1_backend · GitHub; the augmentation pipeline performs mosaic, jitter, resize, random crop, etc. The images will not get distorted.
  4. You just need to set up the training spec file and trigger training.
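As an illustration of point 1, the augmentation_config section of the training spec file could set the output dimensions like this. This is a hypothetical fragment; field names follow the TAO YOLOv4 spec format, but the values are only examples (both multiples of 32):

```
augmentation_config {
  output_width: 1344    # multiple of 32
  output_height: 768    # multiple of 32
  output_channel: 3
}
```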

Hello Morganh,

Thank you for that information.
2. Does it keep the aspect ratio by default? If not, how do I enable it?
3. So if I use the default spec file for training with my dataset, the images won't be distorted, even if output_width and output_height in the augmentation_config are set to a different aspect ratio?

Hi,
2) Yes, it does.
3) Yes, the images are not distorted.
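For illustration, "keeping the aspect ratio" at inference time usually means a letterbox-style resize: scale the image so it fits inside the model input, then pad the remainder. A minimal NumPy sketch of the general idea (assumed behavior, not the exact TAO implementation; the nearest-neighbour resize is a stand-in for a proper interpolating resize):

```python
import numpy as np

def letterbox(image, target_h, target_w, fill=128):
    """Resize an image to fit (target_h, target_w) without distortion,
    then pad the remainder with a constant value."""
    h, w = image.shape[:2]
    scale = min(target_h / h, target_w / w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via index arrays.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[rows][:, cols]
    canvas = np.full((target_h, target_w, 3), fill, dtype=image.dtype)
    canvas[:new_h, :new_w] = resized
    return canvas, scale

# A 1080x1920 frame fed to a 768x1344 network input keeps its 16:9 ratio;
# only a small strip of padding is added at the bottom.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
out, scale = letterbox(frame, 768, 1344)
```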


Perfect,
Thank you very much
