We now come to the final piece in our understanding of the Viola-Jones face detection framework. Before we move on to cascaded classifiers, which many researchers consider the single most important contribution of the Viola-Jones framework, let us take a moment to recap where we stand in our understanding of the inner workings of face detection.
Let's say we are given an input image (of a reasonable size) and asked to detect faces in it. Common sense will tell us that:
1. Faces may be present at any spatial location within the image.
2. In most cases, a face occupies only a fraction of the total image size.
So, we start with a fixed-size sub-window at one of the corners of our image. This sub-window defines the region that we are investigating at any given moment for the presence or absence of a face. In the spirit of point 1 above, we would ideally want to slide this window across the entire image (just like a filter is moved during the...