Our combined shape and texture model gives us a way to describe how a face can change not only in shape but also in appearance. Now we want to find the shape parameters *p* and appearance parameters *λ* that bring the model as close as possible to a given input image *I(x)*. We could calculate the error between the instantiated model and the input image in the coordinate frame of *I(x)*, or we could map the image points back to the base mesh frame and calculate the difference there. We are going to use the latter approach, so we want to minimize the following function:

$$\sum_{\mathbf{x} \in S_0} \left[ A_0(\mathbf{x}) + \sum_{i} \lambda_i A_i(\mathbf{x}) - I\big(W(\mathbf{x}; \mathbf{p})\big) \right]^2$$

In the preceding equation, *S0* denotes the set of pixels *x = (x, y)ᵀ* that lie inside the AAM's base mesh, *A0(x)* is the base mesh texture, *Ai(x)* are the appearance images obtained from PCA (one per appearance parameter *λi*), and *W(x;p)* is the warp that takes pixels from the input image back to the base mesh frame.
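To make the error term concrete, here is a minimal Python sketch of how it could be evaluated. It is not the chapter's implementation: the function name `aam_fitting_error` is hypothetical, the textures are assumed to be flattened into plain lists holding only the pixels inside the base mesh, and the input image is assumed to have already been warped back to the base mesh frame by *W(x;p)*.

```python
def aam_fitting_error(base_texture, appearance_images, lambdas, warped_input):
    """Sum of squared differences between the model texture and the warped input.

    base_texture      -- A0(x), one value per pixel inside the base mesh
    appearance_images -- list of Ai(x), each the same length as base_texture
    lambdas           -- appearance parameters, one per appearance image
    warped_input      -- I(W(x;p)), the input image sampled in the base mesh
                         frame (assumed to be computed elsewhere)
    """
    if len(appearance_images) != len(lambdas):
        raise ValueError("need exactly one lambda per appearance image")
    if len(base_texture) != len(warped_input):
        raise ValueError("base texture and warped input must cover the same pixels")

    error = 0.0
    for k, (a0, target) in enumerate(zip(base_texture, warped_input)):
        # Model texture at pixel k: A0 plus the weighted appearance images.
        model_value = a0 + sum(
            lam * image[k] for lam, image in zip(lambdas, appearance_images)
        )
        error += (model_value - target) ** 2
    return error


if __name__ == "__main__":
    a0 = [1.0, 2.0, 3.0]
    a_images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(aam_fitting_error(a0, a_images, [0.5, -1.0], [1.5, 1.0, 5.0]))  # 4.0
```

Fitting then amounts to searching for the *p* and *λ* that make this value as small as possible. Only *λ* appears explicitly here, because *p* enters through the warp that produces `warped_input`.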

Several approaches to this minimization have been proposed over the years. The first idea...