In a previous article, we showed how to run camera streaming on ROS2 Foxy with rclpy. If you would like to see how it is done, follow the link below:
All you need to know about how to install ROS2 on Jetson Orin – using NileCAM81.
It is a good precursor to this article, in which we run a face detection algorithm on ROS2 Foxy.
For this application, download this file and save it in the same directory as the publisher and subscriber scripts.
Figure 1 – Haar-Cascade Face Detection Algorithm
The Haar-Cascade Face Detection Algorithm is a sliding-window algorithm that detects objects based on their features.
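To make the sliding-window idea concrete, here is a minimal sketch in plain Python. It only enumerates the candidate windows that a detector would examine. The function name sliding_windows and the parameters base, scale, and step are illustrative assumptions, not part of OpenCV or of the scripts used in this article.

```python
from typing import Iterator, Tuple


def sliding_windows(img_w: int, img_h: int, base: int = 24,
                    scale: float = 1.25, step: int = 2
                    ) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for every square window a detector would scan.

    base, scale and step are hypothetical defaults chosen for illustration.
    """
    size = float(base)
    while int(size) <= min(img_w, img_h):
        s = int(size)
        # Slide the window over the image, left to right, top to bottom.
        for y in range(0, img_h - s + 1, step):
            for x in range(0, img_w - s + 1, step):
                yield x, y, s
        # Grow the window so that larger faces can be found as well.
        size *= scale


# Small example: a 48x48 image, windows of 24 and 48 pixels, stride 12.
windows = list(sliding_windows(48, 48, base=24, scale=2.0, step=12))
print(len(windows))
```

Each yielded window would then be passed to the classifier, which decides whether it contains a face.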
Haar Face Features
The Haar-Cascade model relies on the size and location of certain facial features, specifically the nose bridge, the mouth line, and the eyes. It exploits regular brightness patterns: the eye region is darker than the upper-cheek region, and the nose bridge region is brighter than the eye region.
Intel’s ‘haarcascade_frontalface_default.xml’
This XML file contains a pre-trained model that was created through extensive training and uploaded by Rainer Lienhart on behalf of Intel in 2000. Rainer’s model uses the Adaptive Boosting algorithm (AdaBoost) to improve accuracy.
How it works
The model uses the Haar features shown in the image below. They resemble convolutional kernels. Each feature is a single value obtained by subtracting the sum of the pixels under the white rectangle from the sum of the pixels under the black rectangle.
Figure 2 – Haar features of Rainer’s model
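As a rough illustration of how a single feature value is obtained, the sketch below evaluates one two-rectangle feature on a tiny grayscale patch. It assumes the common convention that the value is the pixel sum under the black rectangle minus the pixel sum under the white rectangle. The layout (black rectangle on top, white rectangle below) and the helper names are hypothetical and chosen only for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def rect_sum(img: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum the pixels of the w x h rectangle whose top-left corner is (x, y)."""
    return sum(img[r][c] for r in range(y, y + h) for c in range(x, x + w))


def two_rect_feature(img: Image, x: int, y: int, w: int, h: int) -> int:
    """Two-rectangle feature: black rectangle on top, white one directly below.

    Assumed convention: value = sum(black) - sum(white).
    """
    black = rect_sum(img, x, y, w, h)
    white = rect_sum(img, x, y + h, w, h)
    return black - white


# Dark band (eye-like) above a bright band (cheek-like).
patch = [[50] * 4, [50] * 4, [200] * 4, [200] * 4]
value = two_rect_feature(patch, 0, 0, 4, 2)

# A uniform patch has no contrast, so the feature value is zero.
flat = [[120] * 4 for _ in range(4)]
flat_value = two_rect_feature(flat, 0, 0, 4, 2)
print(value, flat_value)
```

A large magnitude signals strong contrast between the two regions, which is the kind of pattern the eye and cheek regions produce.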
Next, a large number of features are calculated using all possible sizes and positions of each kernel. Consider the amount of computation required: a 24×24 window alone yields more than 160,000 features. Each feature computation requires the sum of the pixels under the white and black rectangles. The integral image was introduced to address this. However large a rectangle is, the integral image reduces its sum-of-pixels computation to an operation involving just four values, which speeds up the entire process.
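The two claims in the paragraph above can be checked with a short sketch. It builds an integral image (a summed-area table), reads a rectangle sum from four table entries, and counts the features in a 24×24 window. The count assumes the five basic feature shapes (2×1, 1×2, 3×1, 1×3, and 2×2 cells). This is an assumption made for illustration, and the feature set in the XML model may differ. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Return ii with one extra zero row and column, where
    ii[y][x] is the sum of all pixels above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Rectangle sum from exactly four lookups, whatever the rectangle size."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def count_features(window: int = 24,
                   shapes: Sequence[Tuple[int, int]] = (
                       (2, 1), (1, 2), (3, 1), (1, 3), (2, 2))) -> int:
    """Count every size and position of each base shape inside the window."""
    total = 0
    for bw, bh in shapes:
        for w in range(bw, window + 1, bw):      # widths: multiples of bw
            for h in range(bh, window + 1, bh):  # heights: multiples of bh
                total += (window - w + 1) * (window - h + 1)
    return total


img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(img)
bottom_right = rect_sum(ii, 1, 1, 2, 2)  # pixels 5, 6, 8 and 9
total_features = count_features()
print(bottom_right, total_features)
```

The integral image is computed once per frame. After that, every rectangle sum costs the same four lookups, which is what makes evaluating this many features practical.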