Instant OpenCV for iOS

Overview of this book

Computer vision on mobile devices is becoming more and more popular. Personal gadgets are now powerful enough to process high-resolution images, stitch panoramas, and detect and track objects. OpenCV, with its decent performance and wide range of functionality, can be an extremely useful tool in the hands of iOS developers.

Instant OpenCV for iOS is a practical guide that walks you through every important step of building a computer vision application for the iOS platform. It will help you to port your OpenCV code, profile and optimize it, and wrap it into a GUI application. Each recipe is accompanied by a sample project or an example that helps you focus on a particular aspect of the technology.

Instant OpenCV for iOS starts by creating a simple iOS application and linking OpenCV, before moving on to processing images and videos in real time. It covers the major ways to retrieve images, process them, and view or export the results. Special attention is given to performance issues, as they greatly affect the user experience.

Several computer vision projects are considered throughout the book. These include a couple of photo filters that help you to print a postcard or add a retro effect to your images, as well as a demonstration of a facial feature detection algorithm. In several time-critical cases, the processing speed is measured and optimized using ARM NEON and the Accelerate framework. Instant OpenCV for iOS gives you all the information you need to build a high-performance computer vision application for iOS devices.

Detecting facial features (Advanced)


Many human-computer interaction (HCI) applications require knowledge about the position of a face and its features in a frame. In this recipe, we will learn how OpenCV can be used to detect facial features. Detected faces are decorated with virtual glasses and a mustache, as shown in the following screenshot:

Getting ready

The source code for this recipe is available in the Recipe15_DetectingFacialFeatures folder in the code bundle that accompanies this book. You can't use the Simulator for this recipe, as we're going to use the camera.

How to do it...

The following are the steps required to implement the application for this recipe:

  1. Add a new C++ class to our CvEffects library, called FaceAnimator, together with its resources.

  2. Implement the facial feature detection functionality.

  3. Add some animation, based on the position of detected facial features.

  4. Call this class from the video processing application.

Let's implement the described steps:

  1. First of all, add a new class with the following interface to the CvEffects static library project. You should also add three XML files with cascade classifiers (lbpcascade_frontalface.xml, haarcascade_mcs_eyepair_big.xml, and haarcascade_mcs_mouth.xml), and two images that are going to be used for animation (glasses.png and mustache.png):

    // The interface uses OpenCV's core and object detection modules
    #include <opencv2/core/core.hpp>
    #include <opencv2/objdetect/objdetect.hpp>

    class FaceAnimator
    {
    public:
        struct Parameters
        {
            cv::Mat glasses;
            cv::Mat mustache;
            cv::CascadeClassifier faceCascade;
            cv::CascadeClassifier eyesCascade;
            cv::CascadeClassifier mouthCascade;
        };
    
        FaceAnimator(Parameters params);
        virtual ~FaceAnimator() {};
    
        void detectAndAnimateFaces(cv::Mat& frame);
    
    protected:
        Parameters parameters_;
        cv::Mat maskOrig_;
        cv::Mat maskMust_;
        cv::Mat grayFrame_;
        
        void putImage(cv::Mat& frame, const cv::Mat& image,
                      const cv::Mat& alpha, cv::Rect face,
                      cv::Rect facialFeature, float shift);
        void PreprocessToGray(cv::Mat& frame);
    
        // Members needed for optimization with Accelerate Framework
        void PreprocessToGray_optimized(cv::Mat& frame);
        cv::Mat accBuffer1_;
        cv::Mat accBuffer2_;
    }; 
  2. Next, we need to implement the class's methods. In the following code snippet, we show only the most important method, detectAndAnimateFaces:

    static bool FaceSizeComparer(const Rect& r1, const Rect& r2)
    {
        return r1.area() > r2.area();
    }
    void FaceAnimator::detectAndAnimateFaces(cv::Mat& frame)
    {
        TS(Preprocessing);
        //PreprocessToGray(frame);
        PreprocessToGray_optimized(frame);
        TE(Preprocessing);
        
        // Detect faces
        TS(DetectFaces);
        std::vector<Rect> faces;
        parameters_.faceCascade.detectMultiScale(grayFrame_, faces, 1.1,
                                                  2, 0, Size(100, 100));
        TE(DetectFaces);
        printf("Detected %lu faces\n", faces.size());
    
        // Sort faces by size in descending order
        sort(faces.begin(), faces.end(), FaceSizeComparer);
    
        for ( size_t i = 0; i < faces.size(); i++ )
        {
            Mat faceROI = grayFrame_( faces[i] );
    
            std::vector<Rect> facialFeature;
            if (i % 2 == 0)
            {
                // Detect eyes
                Point origin(0, faces[i].height/4);
                Mat eyesArea = faceROI(Rect(origin,
                            Size(faces[i].width, faces[i].height/4)));
    
                TS(DetectEyes);
                parameters_.eyesCascade.detectMultiScale(eyesArea,
                    facialFeature, 1.1, 2, CV_HAAR_FIND_BIGGEST_OBJECT,
                    Size(faces[i].width * 0.55, faces[i].height * 0.13));
                TE(DetectEyes);
                
                if (facialFeature.size())
                {
                    TS(DrawGlasses);
                    putImage(frame, parameters_.glasses, maskOrig_,
                             faces[i], facialFeature[0] + origin, -0.1f);
                    TE(DrawGlasses);
                }
            }
            else
            {  
                // Detect mouth
                Point origin(0, faces[i].height/2);
                Mat mouthArea = faceROI(Rect(origin,
                    Size(faces[i].width, faces[i].height/2)));
    
                parameters_.mouthCascade.detectMultiScale(
                    mouthArea, facialFeature, 1.1, 2,
                    CV_HAAR_FIND_BIGGEST_OBJECT,
                    Size(faces[i].width * 0.2, faces[i].height * 0.13) );
                
                if (facialFeature.size())
                {
                    putImage(frame, parameters_.mustache, maskMust_,
                             faces[i], facialFeature[0] + origin, 0.3f);
                }
            }
        }
    }
  3. Now it's time to use the FaceAnimator class in our application. First of all, set up the copying of the FaceAnimator.hpp public header file, so that our application will be able to see the class. Then rebuild the library project. After that, add references to the cascade files and images from the CvEffects project, as we did earlier.

  4. Now, FaceAnimator can be used from the Objective-C code, as we did for the RetroFilter class in the Applying effects to live video (Intermediate) recipe. The following is the declaration of our ViewController class:

    @interface ViewController : UIViewController<CvVideoCameraDelegate>
    {
        CvVideoCamera* videoCamera;
        BOOL isCapturing;
        
        FaceAnimator::Parameters parameters;
        cv::Ptr<FaceAnimator> faceAnimator;
    }
  5. We also need to load all the resources in the viewDidLoad method, then create a class instance in the startCaptureButtonPressed method, and apply processing in the processImage method. We don't show these methods, but they are almost identical to what we've written before for the RetroFilter class. You can build and run the application when all of the integration code is added.

How it works...

Let's consider how the detectAndAnimateFaces method works. You can see that the processing time of every step is measured, as the overall processing is quite expensive.
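
The TS and TE macros themselves are not shown in this recipe; they come from the book's code bundle. A minimal sketch of similar time-measurement macros, based on OpenCV's tick counter, could look as follows (this is an assumption, not the bundle's exact code):

    #include <cstdint>
    #include <cstdio>
    #include <opencv2/core/core.hpp>

    // Start a named timer and, later, print the elapsed time in milliseconds
    #define TS(name) int64_t t_##name = cv::getTickCount();
    #define TE(name) printf("TIMER_" #name ": %.2f ms\n", \
        1000.0 * double(cv::getTickCount() - t_##name) / cv::getTickFrequency());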

We are already familiar with detecting objects (and faces in particular) using OpenCV's CascadeClassifier class. Note that we use a different cascade in this example, one based on LBP (Local Binary Patterns) features. It works several times faster than the Haar-based cascade, while the detection quality doesn't differ much. This performance difference is important, because we're going to process live video.
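
Switching to the LBP cascade only requires loading a different XML file into the CascadeClassifier object, roughly as follows (the path here is an assumption; in the application the file is loaded from the app bundle):

    cv::CascadeClassifier faceCascade;
    // LBP-based cascade: several times faster than the Haar-based one
    if (!faceCascade.load("lbpcascade_frontalface.xml"))
        printf("Failed to load the face cascade\n");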

When the detection is completed, we sort the vector of detected faces by size, using the FaceSizeComparer function. The for loop then looks for facial features within every face. We decided to detect eyes in every even face, and a mouth in every odd face.

We use a couple of tricks to improve the quality and minimize the detection time. First of all, we limit the search area, so that eyes are detected only in the upper half of the face rectangle, and the mouth only in the lower half. This not only improves performance, but also helps avoid false detections. Secondly, we search only for the largest object using the CV_HAAR_FIND_BIGGEST_OBJECT flag. Detection stops as soon as the first object is found, so we don't waste time searching for another pair of eyes or another mouth in the same face rectangle; even if we found one, it would be a false detection. Finally, we control the minimum facial feature size. The following are empirically found minimum relative sizes for the eyes and mouth:

Size(faces[i].width * 0.55, faces[i].height * 0.13) //eyes
Size(faces[i].width * 0.20, faces[i].height * 0.13) //mouth

Finally, we put some animation over the detected facial feature, using the alpha blending function from the previous recipes.
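
The blending itself is reused from the previous recipes and is not listed here. For reference, a minimal per-pixel alpha-blending sketch (not the book's exact putImage implementation) might look like this:

    #include <opencv2/core/core.hpp>

    // Blend a 4-channel (for example, BGRA) overlay into a same-sized ROI
    // of the frame, using an 8-bit alpha mask
    static void alphaBlend(cv::Mat& roi, const cv::Mat& overlay, const cv::Mat& alpha)
    {
        CV_Assert(roi.size() == overlay.size() && roi.size() == alpha.size());
        for (int y = 0; y < roi.rows; y++)
        {
            cv::Vec4b* dst = roi.ptr<cv::Vec4b>(y);
            const cv::Vec4b* src = overlay.ptr<cv::Vec4b>(y);
            const uchar* a = alpha.ptr<uchar>(y);
            for (int x = 0; x < roi.cols; x++)
                for (int c = 0; c < 4; c++)
                    dst[x][c] = (a[x] * src[x][c] + (255 - a[x]) * dst[x][c]) / 255;
        }
    }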

There's more...

This sample presents a very basic approach to facial feature detection. It can be significantly improved in both quality and speed. Let's consider some of the opportunities.

Performance

First of all, we need to find the performance bottlenecks and try to avoid them or optimize them with NEON. In our example, the cvtColor function turns out to take a significant percentage of the processing time, which makes it a good candidate for vectorization. Another candidate is the alpha blending in the putImage function.
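
As an illustration of what such an optimization could look like, the following sketch converts a BGRA frame to grayscale with the vImage part of the Accelerate framework instead of cvtColor (the coefficients and channel order are assumptions; the book's PreprocessToGray_optimized implementation may differ):

    #include <Accelerate/Accelerate.h>
    #include <opencv2/core/core.hpp>

    // BGRA -> grayscale using the vImage matrix multiply
    static void bgraToGrayAccelerate(const cv::Mat& bgra, cv::Mat& gray)
    {
        gray.create(bgra.size(), CV_8UC1);

        vImage_Buffer src = { (void*)bgra.data, (vImagePixelCount)bgra.rows,
                              (vImagePixelCount)bgra.cols, (size_t)bgra.step };
        vImage_Buffer dst = { (void*)gray.data, (vImagePixelCount)gray.rows,
                              (vImagePixelCount)gray.cols, (size_t)gray.step };

        // Luma coefficients for the B, G, R, A channel order, scaled by 256
        const int16_t matrix[4] = { 29, 150, 77, 0 };
        vImageMatrixMultiply_ARGB8888ToPlanar8(&src, &dst, matrix, 256,
                                               NULL, 0, kvImageNoFlags);
    }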

Tracking between detections

Another way to optimize performance is to run face and facial feature detection only every k frames, and to run optical tracking in between. One can try the calcOpticalFlowPyrLK function on the points returned by the goodFeaturesToTrack function. If goodFeaturesToTrack itself takes too much time, we can cover the face rectangle with a simple regular grid of points. The median motion vector (after some filtering) gives a hint about the new face position. The Median-Flow tracker can be a good candidate for this task (http://bit.ly/3848_MedianFlowTracker).
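
A rough sketch of this grid-based variant is shown below (prevGray, currGray, and faceRect are assumptions about the surrounding code, and the filtering is reduced to a plain median):

    #include <algorithm>
    #include <vector>
    #include <opencv2/video/tracking.hpp>

    // Estimate how the face rectangle moved between two grayscale frames
    static cv::Point2f trackFaceShift(const cv::Mat& prevGray, const cv::Mat& currGray,
                                      cv::Rect faceRect)
    {
        // Cover the face rectangle with a simple regular grid of points
        std::vector<cv::Point2f> prevPts, currPts;
        for (int y = faceRect.y; y < faceRect.y + faceRect.height; y += 10)
            for (int x = faceRect.x; x < faceRect.x + faceRect.width; x += 10)
                prevPts.push_back(cv::Point2f((float)x, (float)y));

        std::vector<uchar> status;
        std::vector<float> err;
        cv::calcOpticalFlowPyrLK(prevGray, currGray, prevPts, currPts, status, err);

        // The median of the per-point shifts gives a robust motion estimate
        std::vector<float> dx, dy;
        for (size_t i = 0; i < prevPts.size(); i++)
            if (status[i])
            {
                dx.push_back(currPts[i].x - prevPts[i].x);
                dy.push_back(currPts[i].y - prevPts[i].y);
            }
        if (dx.empty())
            return cv::Point2f(0, 0);
        std::nth_element(dx.begin(), dx.begin() + dx.size()/2, dx.end());
        std::nth_element(dy.begin(), dy.begin() + dy.size()/2, dy.end());
        return cv::Point2f(dx[dx.size()/2], dy[dy.size()/2]);
    }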

Active Shape Model

One of the limitations of the cascade classifier approach is that it returns only a bounding box, while some applications need a contour representation of a facial feature. There are approaches that fit a contour model of the entire face to an image. One of the most popular is the Active Shape Model (ASM); several open-source implementations are available.

There are also some other approaches; one of them was developed by Jason Saragih and is covered in detail in the book Mastering OpenCV with Practical Computer Vision Projects, Packt Publishing. The source code is available online at http://bit.ly/3848_FaceTracking.