I want to do training on big images that are roughly 8000x6000 pixels.
Is there a way to train a model in a tiling fashion in TAO (something similar to this paper)?
If not, i.e. tiling is not supported by TAO, and I need to split the images myself so that I don't resize them and lose detail:
- Apart from the network's default input size, are there other constraints I should take into account? I heard that image sizes should be multiples of 16.
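For the offline splitting option described above, the tile boxes could be computed along these lines. This is only a sketch in plain Python, not a TAO feature: `tile_boxes`, the 1024x1024 tile size, and the 128-pixel overlap are hypothetical choices for illustration (1024 is a multiple of both 16 and 32).

```python
def tile_boxes(width, height, tile_w, tile_h, overlap=0):
    """Return (left, top, right, bottom) boxes of fixed-size tiles covering the image.

    Hypothetical helper: neighbouring tiles share `overlap` pixels, and the last
    tile in each row/column is shifted back so that it ends at the image edge.
    """
    if tile_w > width or tile_h > height:
        raise ValueError("tile must not be larger than the image")
    if not 0 <= overlap < min(tile_w, tile_h):
        raise ValueError("overlap must be non-negative and smaller than the tile")

    def starts(length, tile, step):
        # Regularly spaced start offsets that keep the tile inside the image.
        pos = list(range(0, length - tile + 1, step))
        if pos[-1] + tile < length:
            pos.append(length - tile)  # extra tile flush with the far edge
        return pos

    xs = starts(width, tile_w, tile_w - overlap)
    ys = starts(height, tile_h, tile_h - overlap)
    return [(x, y, x + tile_w, y + tile_h) for y in ys for x in xs]


boxes = tile_boxes(8000, 6000, 1024, 1024, overlap=128)
print(len(boxes), boxes[0], boxes[-1])
```

Each box uses the (left, top, right, bottom) order, so it can be passed to a cropping routine such as Pillow's `Image.crop`. The bounding-box labels would also have to be clipped and shifted into each tile's coordinates, which this sketch does not do.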
Different networks expect different input sizes. The expected size is usually mentioned in the TAO user guide. May I know which network you are going to train?
Eventually, all object detection models are possible candidates. For now, I will start with detectnet_v2.
What worries me is that I would need to produce images of different sizes for different networks (which means the storage I need grows with the number of networks I want to run). That's why I was wondering whether I can keep the big images (8000x6000) and have TAO tile them during image loading.
Currently, tiling and merging are not supported in the training of TAO networks. You can keep the original images without resizing and feed them into the network for training. Besides detectnet_v2, you can also use YOLOv4_tiny. Its input width and height are expected to be multiples of 32. In your case, you can set 8000x5984 in the training spec file (6000 rounded down to the nearest multiple of 32).
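The 8000x5984 figure comes from rounding each dimension down to the nearest multiple of 32. A minimal sketch of that arithmetic (the helper name `round_down_to_multiple` is made up for illustration and is not part of TAO):

```python
def round_down_to_multiple(value, multiple=32):
    """Largest multiple of `multiple` that does not exceed `value`."""
    return (value // multiple) * multiple


width, height = 8000, 6000
print(round_down_to_multiple(width), round_down_to_multiple(height))  # 8000 5984
```

The same helper can be called with `multiple=16` for a network that expects multiples of 16.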
So only yolov4_tiny and detectnet_v2 accept such big image sizes?
No, all the networks can accept them. I only gave yolov4_tiny as an example because it offers a good balance between fps and mAP.