Face XY Project

Face XY Project is a project from the DLI course.

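def live(state_widget, model, camera, prediction_widget):
    # dataset and category_widget are assumed to be defined at notebook (global) scope
    global dataset
    while state_widget.value == 'live':
        image = camera.value
        preprocessed = preprocess(image)
        # run the model and flatten its output to a 1-D array: [x0, y0, x1, y1, ...]
        output = model(preprocessed).detach().cpu().numpy().flatten()
        category_index = dataset.categories.index(category_widget.value)
        # each category owns one (x, y) pair in the flattened output
        x = output[2 * category_index]
        y = output[2 * category_index + 1]

        # map [-1, 1] to [0, 1], then scale to pixel coordinates
        x = int(camera.width * (x / 2.0 + 0.5))
        y = int(camera.height * (y / 2.0 + 0.5))

        # draw the predicted point on a copy of the frame and display it
        prediction = image.copy()
        prediction = cv2.circle(prediction, (x, y), 8, (255, 0, 0), 3)
        prediction_widget.value = bgr8_to_jpeg(prediction)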

What does “x / 2.0 + 0.5” mean, and why is this transformation needed?

Hi plainji, the network outputs values in the normalized coordinate space [-1, 1]. The expression x / 2.0 + 0.5 maps those values to [0, 1], and multiplying by camera.width (or camera.height) transforms them back into image space (width, height) so that a circle can be overlaid on the image at the detected position of the face.
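In case a worked example helps, here is a minimal sketch of that mapping on its own. The helper names to_pixel and to_normalized and the 224-pixel width are made up for illustration and are not part of the course code; to_normalized is simply the algebraic inverse of the expression, shown only to make the round trip visible.

```python
def to_pixel(value, size):
    """Map a normalized coordinate in [-1, 1] to a pixel position along an axis of length `size`."""
    # value / 2.0 + 0.5 rescales [-1, 1] to [0, 1]; multiplying by size gives pixels
    return int(size * (value / 2.0 + 0.5))


def to_normalized(pixel, size):
    """Algebraic inverse of to_pixel (hypothetical helper): pixel position back to [-1, 1]."""
    return 2.0 * (pixel / size - 0.5)


width = 224  # assumed image width, for illustration only

print(to_pixel(-1.0, width))  # 0   (left edge)
print(to_pixel(0.0, width))   # 112 (center)
print(to_pixel(0.5, width))   # 168
print(to_pixel(1.0, width))   # 224 (right edge, one past the last pixel index)
print(to_normalized(112, width))  # 0.0
```

Keeping the output in [-1, 1] makes the prediction independent of the image resolution, which is why the scaling by camera.width and camera.height only happens at display time.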

Got it. Thanks~