In the preprocessing part of the augmentation section, there are cropping options such as crop_right and crop_bottom which, according to the TLT documentation, can take values from 0 to the input image width/height. However, whenever I tried crop sizes different from output_image_width/height, I got an error like this:
ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(64, 128, 26, 26), (64, 256, 25, 25)]
So I’m confused about the cropping size: does it have to be the same as output_image_width/height?
Does that hold even for the crop_size? The documentation mentions that the augmentation module will finally crop or pad the image to fit the output_image_size, so as I understand it, the crop_size shouldn’t influence what is given to the network. For YOLO, I always set output_image_size to 416 x 416.
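
To illustrate where a 26 vs. 25 mismatch like the one in the error could come from, here is a rough sketch of the feature-map arithmetic. It is only an assumption about the network, not taken from the TLT code: it assumes five stride-2 downsampling steps with "same" padding (so each step rounds up) and a 2x upsample of the coarsest map before it is concatenated with the stride-16 map. The function names and the 400-pixel size are hypothetical, chosen only for illustration.

```python
import math


def feature_map_sizes(input_size, num_downsamples=5):
    """Spatial size after each stride-2 step, assuming 'same' padding (ceil division)."""
    sizes = []
    size = input_size
    for _ in range(num_downsamples):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


def concat_shapes(input_size):
    """Return (upsampled coarse map size, stride-16 skip map size).

    Assumed layout: the stride-32 map is upsampled by 2 and concatenated
    with the stride-16 map, so the two sizes must be equal.
    """
    sizes = feature_map_sizes(input_size)
    skip = sizes[3]            # stride-16 feature map
    upsampled = sizes[4] * 2   # stride-32 feature map after 2x upsampling
    return upsampled, skip


for size in (416, 400):  # 400 is a hypothetical size that is not a multiple of 32
    up, skip = concat_shapes(size)
    status = "ok" if up == skip else "mismatch"
    print(f"input {size}: upsampled={up}, skip={skip} -> {status}")
```

Under these assumptions, 416 gives matching 26 x 26 maps, while a size that is not a multiple of 32 (such as 400) gives 26 vs. 25, which would suggest that the image actually reaching the network was not 416 x 416 when the crop values were changed.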