Jetson Nano Classification utils - preprocess() - purpose of this function

I am new to ML/DL in general and have a fairly basic question.

The Jetson Nano comes with a sample notebook, “classification_interactive.ipynb”. In the “Live Execution” stage of the setup, there is a call to a local package called utils that exports one function, preprocess().

I understand that this function converts a PIL/numpy image to a tensor and moves it to the CUDA (GPU) device. But before it returns the image, it performs one more operation on the tensor:

image.sub_(mean[:, None, None]).div_(std[:, None, None])

What is the purpose of subtracting the mean from the tensor and dividing it by the standard deviation?

The library code is:

import torch
import torchvision.transforms as transforms
import torch.nn.functional as F
import cv2
import PIL.Image
import numpy as np

# Per-channel (R, G, B) mean and standard deviation, stored on the GPU
mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()
std = torch.Tensor([0.229, 0.224, 0.225]).cuda()

def preprocess(image):
    device = torch.device('cuda')
    image = PIL.Image.fromarray(image)
    # to_tensor() returns a float CHW tensor scaled to [0, 1]
    image = transforms.functional.to_tensor(image).to(device)
    # In-place per-channel normalization: (x - mean) / std
    image.sub_(mean[:, None, None]).div_(std[:, None, None])
    # Add a leading batch dimension
    return image[None, ...]

Hi,

In general, preprocessing bridges the difference between the camera's output format and the input format the deep learning model expects.

For example:
A camera input is in YUV 4:2:0 + INT8 + HWC format.
A network input is in RGB + Float + CHW format.
In that case, preprocessing converts the image from YUV to RGB, converts the data type to float, and rearranges the layout from HWC to CHW.
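
As for the sub_/div_ line specifically: the values 0.485, 0.456, 0.406 and 0.229, 0.224, 0.225 are the per-channel ImageNet mean and standard deviation commonly used with torchvision's pretrained models. Assuming the notebook's model was pretrained on ImageNet, normalizing with them gives the network inputs on the same scale it saw during training. Below is a rough NumPy sketch of the same steps on the CPU. The function name preprocess_numpy and the dummy frame are made up for illustration and are not part of the sample.

```python
import numpy as np

# ImageNet per-channel statistics in RGB order (same values as in the utils code).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_numpy(image_hwc_uint8):
    """Hypothetical CPU mirror of preprocess(); expects an H x W x 3 uint8 RGB array."""
    # uint8 [0, 255] -> float32 [0.0, 1.0], which is what to_tensor() does
    x = image_hwc_uint8.astype(np.float32) / 255.0
    # Rearrange the layout from HWC to CHW
    x = x.transpose(2, 0, 1)
    # [:, None, None] reshapes (3,) to (3, 1, 1) so each channel is
    # normalized with its own mean and std via broadcasting
    x = (x - MEAN[:, None, None]) / STD[:, None, None]
    # Add the batch dimension: (3, H, W) -> (1, 3, H, W)
    return x[None, ...]


# Dummy 4 x 6 mid-gray frame standing in for a camera image
frame = np.full((4, 6, 3), 128, dtype=np.uint8)
batch = preprocess_numpy(frame)
print(batch.shape)        # (1, 3, 4, 6)
print(batch[0, :, 0, 0])  # one normalized pixel, one value per channel
```

After this step each channel is roughly zero-centered with unit spread for typical natural images, rather than sitting in the raw [0, 1] range.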

Thanks.

@AastaLLL - Thank you, that cleared a lot of my questions. Stay Safe!

Stay safe~