A 3-Step Process:
We first need to look into face detection before recognition. Face detection involves finding the position of the faces in a given image. Once the faces are found, they are recognized by matching them to faces in a database using face templates, which we will discuss subsequently.
So face detection only answers the question “Where are the faces in the image?” Such algorithms are also used in digital cameras, along with smile detection algorithms. The resulting image looks somewhat like this:
The output image has the faces marked with a green square outline. You may have noticed that Facebook introduced face detection in what it calls its Auto-Tagging feature: when you batch-upload images, your friends are already tagged.
Algorithm – The method described here is called the Viola-Jones method for face detection, or rather, for object detection. In general, the algorithm can be used to find any desired object of interest, which in our case happens to be a face.
Note that there are many different algorithms that do the same job. This is just one of them: a popular, simple and efficient one.
The Viola-Jones method uses detectors (called Haar-like features in the original paper) to find faces in an image. A detector exploits characteristics that faces share in general. Suppose you take a rectangular image of only your eyes, and then another image of the area just below your eyes, the upper region of the cheeks. Now compare the two images in terms of intensity, assuming they are grayscale, i.e. they consist only of shades of grey from black (low pixel values) to white (high pixel values). The eye region is darker than the cheek region, so the sum of the intensities of all pixels of the first image is lower than the sum for the second image.
This fact is used by the first Viola-Jones detector. It has two rectangles, one above the other, and it expects the upper rectangle to be darker (to have a lower sum of intensities) than the lower one. We then move this detector over the image horizontally and vertically until it fits.
That is, until it finds a region where the sum of intensities in the upper rectangle is lower than in the lower rectangle. The place where it fits is a probable face. The first step before using our detector is to convert the image to grayscale.
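The grayscale conversion can be sketched in a few lines of Python. This is a minimal illustration, not part of the Viola-Jones method itself: the weights below are the commonly used luma weights (0.299, 0.587, 0.114), which is an assumption since the text does not say how the conversion is done, and the image is represented as plain nested lists of (r, g, b) tuples.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels (0-255) into rows of grey values (0-255).

    Assumption: the common luma weights 0.299 / 0.587 / 0.114 are used;
    other weightings (or a plain average) would also give a grey image.
    """
    gray = []
    for row in rgb_image:
        gray.append([round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row])
    return gray


# Tiny 2x2 example image: white, black, pure red, pure blue.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(image)
```

With this convention, white maps to the highest value and black to the lowest, so a darker region has a lower sum of pixel values.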
We then apply the detector explained above to find a possible face. If the region under the detector shows the intensity pattern the detector expects, the output of the detector is 1; otherwise the output is 0.
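As a rough sketch of this step, the two-rectangle detector can be slid over a grayscale image as follows. It uses the usual convention that larger pixel values are brighter, so the darker eye band has the smaller sum. The function names, the window size and the `threshold` parameter are all illustrative assumptions; real implementations also use integral images and trained thresholds, which are left out here.

```python
def region_sum(gray, top, left, height, width):
    """Sum of pixel values inside a rectangle of a grayscale image (list of rows)."""
    return sum(sum(row[left:left + width]) for row in gray[top:top + height])


def eye_cheek_detector(gray, top, left, height, width, threshold):
    """Return 1 if the upper half of the window is darker than the lower half
    by more than `threshold` (a hypothetical tuning value), else 0."""
    half = height // 2
    upper = region_sum(gray, top, left, half, width)
    lower = region_sum(gray, top + half, left, half, width)
    return 1 if lower - upper > threshold else 0


def scan(gray, height, width, threshold):
    """Slide the window over every position and collect the (top, left) hits."""
    rows, cols = len(gray), len(gray[0])
    hits = []
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            if eye_cheek_detector(gray, top, left, height, width, threshold):
                hits.append((top, left))
    return hits


# Toy 4x4 image: a dark band (eyes) directly above a bright band (cheeks).
gray = [
    [200, 200, 200, 200],
    [40, 40, 40, 40],
    [220, 220, 220, 220],
    [200, 200, 200, 200],
]
hits = scan(gray, height=2, width=4, threshold=500)
```

Only the window that starts on the dark row fits the pattern, so `hits` contains the single position `(1, 0)`.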
Besides the above detector, the Viola-Jones method uses a number of different detectors. For example, another detector uses the fact that the bridge of the nose is lighter than the eyes. That is, the sum of intensities of the eye regions will be lower than that of the area between the eyes.
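A nose-bridge detector of this kind could look roughly like the sketch below. It splits a window into three side-by-side rectangles (left eye, bridge, right eye) and, with larger pixel values meaning brighter, checks that the middle one is brighter than the average of the outer two. The split into equal thirds and the `threshold` value are simplifying assumptions made for illustration.

```python
def region_sum(gray, top, left, height, width):
    """Sum of pixel values inside a rectangle of a grayscale image (list of rows)."""
    return sum(sum(row[left:left + width]) for row in gray[top:top + height])


def nose_bridge_detector(gray, top, left, height, width, threshold):
    """Return 1 if the middle third of the window (nose bridge) is brighter
    than the average of the two outer thirds (eyes) by more than `threshold`."""
    third = width // 3
    left_eye = region_sum(gray, top, left, height, third)
    bridge = region_sum(gray, top, left + third, height, third)
    right_eye = region_sum(gray, top, left + 2 * third, height, third)
    return 1 if bridge - (left_eye + right_eye) / 2 > threshold else 0


# Toy eye band: dark eyes on both sides, a lighter bridge in the middle.
eye_band = [
    [30, 30, 180, 180, 30, 30],
    [40, 40, 190, 190, 40, 40],
]
# A uniformly grey patch has no such pattern.
flat_patch = [[100] * 6, [100] * 6]

face_like = nose_bridge_detector(eye_band, 0, 0, 2, 6, threshold=300)
not_face_like = nose_bridge_detector(flat_patch, 0, 0, 2, 6, threshold=300)
```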
Only if the first detector gives an output of 1 for an image sub-region and the second detector also gives an output of 1 for the same sub-region does that sub-region go on to the next detector.
This process continues for a number of detectors, and an image sub-region that gives an output of 1 for all the detectors is classified as a face. If at any stage (any detector) the output is 0, the image sub-region is immediately classified as “not a face” and the remaining detectors are skipped. Because most sub-regions are rejected after only a few detectors, this makes Viola-Jones an efficient algorithm for face detection.
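The chain of detectors with early rejection can be sketched as below. The two toy detectors and the tiny windows are hypothetical stand-ins for the trained detectors of a real system; the point is only the control flow, where the first 0 stops the evaluation.

```python
def upper_darker_than_lower(window):
    """Toy detector 1: the top half of the window is darker than the bottom half."""
    half = len(window) // 2
    upper = sum(sum(row) for row in window[:half])
    lower = sum(sum(row) for row in window[half:])
    return 1 if upper < lower else 0


def center_lighter_than_sides(window):
    """Toy detector 2: in the top half, the middle third is lighter than
    the average of the two outer thirds."""
    half = len(window) // 2
    third = len(window[0]) // 3
    sides = sum(sum(row[:third]) + sum(row[2 * third:]) for row in window[:half])
    center = sum(sum(row[third:2 * third]) for row in window[:half])
    return 1 if center > sides / 2 else 0


def run_cascade(window, detectors):
    """Return (is_face, number_of_detectors_evaluated).

    The window is rejected as soon as any detector outputs 0.
    """
    for count, detector in enumerate(detectors, start=1):
        if detector(window) == 0:
            return False, count
    return True, len(detectors)


detectors = [upper_darker_than_lower, center_lighter_than_sides]

face_window = [[30, 120, 30], [200, 200, 200]]    # passes both detectors
wall_window = [[200, 200, 200], [100, 100, 100]]  # fails the first detector
odd_window = [[50, 40, 50], [200, 200, 200]]      # fails the second detector

face_result = run_cascade(face_window, detectors)
wall_result = run_cascade(wall_window, detectors)
odd_result = run_cascade(odd_window, detectors)
```

Note that `wall_window` is rejected after a single detector, which is where the saving comes from when most of an image is background.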
In the past, there have been news reports of face recognition not working for black people, along with accusations that the companies involved were being racist. It is possible, though, that those systems used the Viola-Jones method and that the intensity differences its detectors rely on were too weak in the grayscale images of dark-skinned faces for the detectors to respond. If so, the cause might be a purely technical one (a limitation of the algorithm) and not racism.
Once we have detected the face in the image, face recognition is just a matter of finding the face template that matches the detected face. Face recognition uses pre-defined face templates for a database of people. These templates define major nodal points on the face and the distances between them, since these distances differ from person to person. An example is the distance between the eyes.
An example of a template created using such distances on the face is shown in the image. The face recognition algorithm compares the face in the image under consideration against the face templates in the database. If a template in the database has nodal distances similar to those in the image, then the face is a match for that person. This process can be summarized as follows:
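A minimal sketch of this matching step is given below. It treats a template as a plain list of nodal distances and picks the closest stored template, accepting it only if it is within a tolerance. The names in the database, the distance values, the use of Euclidean distance and the `tolerance` parameter are all hypothetical choices made for illustration; the text does not specify how similarity is measured.

```python
import math


def template_distance(a, b):
    """Euclidean distance between two templates (lists of nodal distances)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(face_template, database, tolerance):
    """Return the name of the closest stored template, or None if even the
    closest one differs by more than `tolerance`."""
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = template_distance(face_template, stored)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tolerance else None


# Hypothetical database: e.g. eye-to-eye, nose-to-mouth and nose-width distances.
database = {
    "alice": [62.0, 48.0, 35.0],
    "bob": [70.0, 52.0, 41.0],
}

known = recognize([61.5, 48.5, 35.0], database, tolerance=2.0)
unknown = recognize([90.0, 90.0, 90.0], database, tolerance=2.0)
```

Here `known` resolves to `"alice"`, while `unknown` is `None` because no stored template is close enough.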
Face recognition is used in Virtual Reality systems like Microsoft Kinect to recognize users and to create avatars of them, in addition to other applications such as security (biometrics).
The next component of Virtual Reality Systems is Motion Detection/Motion Capture.