Having obtained the facial landmarks, we can attempt to find the direction in which the face is pointing. The 2D face landmark points essentially conform to the shape of the head. So, given a 3D model of a generic human head, we can find approximate 3D points that correspond to a number of the facial landmarks, as shown in the following photo:
From these 2D–3D correspondences, we can calculate the 3D pose (rotation and translation) of the head with respect to the camera by way of the Perspective-n-Point (PnP) algorithm. The details of the algorithm and of object pose detection are beyond the scope of this chapter; however, we can quickly rationalize why just a handful of 2D–3D point correspondences suffice to achieve this. The camera that took the preceding picture is related to the head by a rigid transformation: it sits at a certain distance from the object and is rotated somewhat with respect to it. In very broad terms, we can then write the relationship...