Image preprocessing in Hello AI World

In “Collecting your own Classification Datasets”, train.py first normalizes the RGB values.
Training data is randomly cropped with a random scale and aspect ratio, and then randomly flipped horizontally:

#transforms.Resize(224),
transforms.RandomResizedCrop(args.resolution),
transforms.RandomHorizontalFlip(),

Validation data is resized so the short side is 256 pixels (preserving the aspect ratio), then center-cropped along the long side:

transforms.Resize(256),
transforms.CenterCrop(args.resolution),
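
The validation-time geometry described above can be sketched in plain Python (the function names here are mine for illustration, not from train.py):

```python
# Resize so the short side is 256 (aspect ratio preserved), then take a
# centered 224x224 crop -- the geometry of Resize(256) + CenterCrop(224).

def resize_short_side(w, h, target=256):
    """Return (w, h) after scaling so the shorter side equals `target`."""
    if w < h:
        return target, round(h * target / w)
    return round(w * target / h), target

def center_crop_box(w, h, size=224):
    """Return the (left, top, right, bottom) box of a centered crop."""
    left = (w - size) // 2
    top = (h - size) // 2
    return left, top, left + size, top + size

# A 640x480 image: short side 480 -> 256, long side 640 -> 341,
# then a 224x224 window is cut from the middle.
w, h = resize_short_side(640, 480)   # (341, 256)
box = center_crop_box(w, h)          # (58, 16, 282, 240)
```

Anything outside that centered box is discarded, which is why the crop loses part of the long side.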

Is my understanding correct?

When running imagenet.py --model=$NET/resnet18.onnx,
what image preprocessing is performed?
I understand that RGB normalization is applied.

I want to train and run using the entire image, with its varying aspect ratio, without cropping.
How can I do this?

Thanks.

This training code is forked from the PyTorch ImageNet classification example, so the preprocessing is done the same way as there. If you don’t want those crops, just use this as the transforms for the datasets:

transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    normalize,
])
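
One caveat about the snippet above: with a single int, `transforms.Resize(224)` only scales the short side to 224, so the output is generally not square (which also prevents batching). If you want the entire image at a fixed network input size, a `(height, width)` tuple such as `transforms.Resize((224, 224))` avoids cropping at the cost of distorting the aspect ratio. A plain-Python sketch of the resulting sizes (no torchvision needed; function names are mine):

```python
# Output sizes of torchvision's Resize for a w x h input image.

def resize_single_int(w, h, size=224):
    """transforms.Resize(size): scale only the SHORT side to `size`."""
    if w < h:
        return size, round(h * size / w)
    return round(w * size / h), size

def resize_pair(w, h, size=(224, 224)):
    """transforms.Resize((height, width)): fixed size, whole image kept."""
    height, width = size
    return width, height

print(resize_single_int(640, 480))  # (299, 224) -- not square
print(resize_pair(640, 480))        # (224, 224) -- whole image, distorted
```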

Although I recall finding better results when keeping the cropping the way PyTorch does it.

The pre-processing applies the same normalization coefficients as PyTorch, and then converts the data layout from interleaved RGB (HWC) to planar NCHW. This is where the pre-processing is done in jetson-inference for classifiers:
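
A minimal NumPy sketch of that pre-processing, assuming the standard PyTorch ImageNet mean/std coefficients (the same values applied in imageNet.cpp):

```python
import numpy as np

# Per-channel ImageNet normalization coefficients used by PyTorch.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD  = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(rgb_hwc_uint8):
    """(H, W, 3) uint8 RGB image -> (1, 3, H, W) float32 NCHW tensor."""
    x = rgb_hwc_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = (x - MEAN) / STD                           # per-channel normalize
    x = x.transpose(2, 0, 1)                       # interleaved HWC -> planar CHW
    return x[np.newaxis, ...]                      # add batch dim -> NCHW

img = np.zeros((224, 224, 3), dtype=np.uint8)      # dummy black image
out = preprocess(img)
print(out.shape)   # (1, 3, 224, 224)
```

This is only an illustration of the steps; the actual implementation runs on the GPU via CUDA kernels in jetson-inference.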

https://github.com/dusty-nv/jetson-inference/blob/86776674121f071453b64cc5754e628cb6a6d32c/c/imageNet.cpp#L433