In this post, it’s recommended that the image size be a multiple of 32.
I am not clear on three questions related to this recommendation:
Does violating this recommendation reduce the performance metrics of the model?
Does violating this recommendation change the computation time?
Suppose I have a picture of size 1024 x 688, where the height is 688 = 32 x 21.5. Would it be wiser to resize the height to 32 x 21 = 672 or to 32 x 22 = 704, assuming I believe both sizes work from a performance point of view?
For 1024x688, you can set the size to 1024x672 or 1024x704. Usually the higher resolution gives a higher mAP but a lower fps. The difference should be very small, though, since 1024x672 is not much different from 1024x704.
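As a quick illustration of where 672 and 704 come from, here is a minimal sketch that rounds a dimension down and up to the nearest multiple of 32. The helper name `nearest_multiples` is made up for this example and is not part of any framework.

```python
def nearest_multiples(size, stride=32):
    """Return the closest multiples of `stride` at or below and at or above `size`.

    `nearest_multiples` is a hypothetical helper written for this example.
    """
    lower = (size // stride) * stride
    # If size is already a multiple, both candidates are the size itself.
    upper = lower if size % stride == 0 else lower + stride
    return lower, upper


print(nearest_multiples(688))   # (672, 704)
print(nearest_multiples(1024))  # (1024, 1024)
```

The width 1024 is already a multiple of 32, so only the height needs to change.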
What is wrong with 1024x688, then? Are the algorithms designed to accept only multiples of 32, or is it only a ‘recommendation’ to improve performance?
Usually it is due to the network architecture. For example, the YOLO network downsamples the input by a factor of 32, so the width and height need to be multiples of 32.
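To make the factor of 32 concrete, here is a rough sketch that assumes the downsampling is done as five successive halvings (2^5 = 32). That breakdown is an assumption for illustration; the exact layers, and how a given framework handles a size that does not divide evenly, depend on the implementation. The function name `halvings` is made up for this example.

```python
def halvings(size, steps=5):
    """List the size after each of `steps` halvings (assumed: 5 steps, 2**5 = 32)."""
    sizes = [size]
    for _ in range(steps):
        sizes.append(sizes[-1] / 2)
    return sizes


print(halvings(688))  # [688, 344.0, 172.0, 86.0, 43.0, 21.5]
print(halvings(704))  # [704, 352.0, 176.0, 88.0, 44.0, 22.0]
```

With a height of 688 the last step would need 21.5 rows, which is not a whole number of grid cells, while 704 ends at exactly 22 rows.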