Mastering OpenCV 3 - Second Edition

By: Jason Saragih

Overview of this book

As we become more capable of handling data of every kind, we are becoming more reliant on visual input and what we can do with it: self-driving cars, face recognition, and even augmented reality applications and games. This is all powered by computer vision. This book will put you straight to work creating powerful and unique computer vision applications. Each chapter is structured around a central project and dives deep into an important aspect of OpenCV, such as facial recognition, image target tracking, augmented reality applications, the 3D visualization framework, and machine learning. You'll learn how to make AI that can remember, and use neural networks to help your applications learn. By the end of the book, you will have created various working prototypes from the projects in the book and will be well versed in the new features of OpenCV 3.

Main camera processing loop for a desktop app


If you want to display a GUI window on the screen using OpenCV, you call the cv::namedWindow() function and then the cv::imshow() function for each image, but you must also call cv::waitKey() once per frame, otherwise your windows will not update at all! Calling cv::waitKey(0) waits until the user hits a key in the window, whereas a positive number such as waitKey(20) waits for at least that many milliseconds.

Put this main loop in the main.cpp file, as the base of your real-time camera app:

    while (true) {
      // Grab the next camera frame.
      cv::Mat cameraFrame;
      camera >> cameraFrame;
      if (cameraFrame.empty()) {
        std::cerr << "ERROR: Couldn't grab a camera frame." << std::endl;
        exit(1);
      }
      // Create a blank output image, that we will draw onto.
      // CV_8UC3 is a macro, so it takes no cv:: prefix.
      cv::Mat displayedFrame(cameraFrame.size(), CV_8UC3);

      // Run the cartoonifier filter on the camera frame.
      cartoonifyImage(cameraFrame, displayedFrame);

      // Display the processed image onto the screen.
      cv::imshow("Cartoonifier", displayedFrame);

      // IMPORTANT: Wait for at least 20 milliseconds,
      // so that the image can be displayed on the screen!
      // Also checks if a key was pressed in the GUI window.
      // Note that it should be a "char" to support Linux.
      char keypress = cv::waitKey(20);  // Needed to see anything!
      if (keypress == 27) {   // Escape Key
        // Quit the program!
        break;
      }
    } //end while
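
The comment about storing the key in a char can be made explicit. The following is a dependency-free sketch rather than part of the book's code: the helper names normalizeKey() and isEscapeKey() are hypothetical, and it assumes that cv::waitKey() returns a negative value when no key was pressed and that, on some platforms, the returned int may carry extra flag bits above the low byte.

```cpp
// Hypothetical helpers for interpreting the int returned by cv::waitKey().
const int KEY_NONE = -1;    // Assumed "no key pressed" result.
const int KEY_ESCAPE = 27;  // ASCII code of the Escape key.

// Keep only the low byte of the key code, or report that no key was pressed.
int normalizeKey(int rawKey) {
    if (rawKey < 0) {
        return KEY_NONE;
    }
    return rawKey & 0xFF;  // Drop any platform-specific flag bits.
}

// True if the (possibly flagged) key code is the Escape key.
bool isEscapeKey(int rawKey) {
    return normalizeKey(rawKey) == KEY_ESCAPE;
}
```

In the main loop, this would be used as isEscapeKey(cv::waitKey(20)), which has the same effect as the cast to char for ordinary ASCII keys.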

 

Generating a black and white sketch

To obtain a sketch (black and white drawing) of the camera frame, we will use an edge detection filter, whereas to obtain a color painting, we will use an edge-preserving filter (Bilateral filter) to further smooth the flat regions while keeping edges intact. By overlaying the sketch drawing on top of the color painting, we obtain a cartoon effect, as shown earlier in the screenshot of the final app.

There are many different edge detection filters, such as the Sobel, Scharr, and Laplacian filters, or the Canny edge detector. We will use a Laplacian edge filter since it produces edges that look most similar to hand sketches compared to Sobel or Scharr, and that are quite consistent compared to a Canny edge detector. Canny produces very clean line drawings but is affected more by random noise in the camera frames, so its line drawings would often change drastically between frames.

Nevertheless, we still need to reduce the noise in the image before we use a Laplacian edge filter. We will use a Median filter because it is good at removing noise while keeping edges sharp, but is not as slow as a Bilateral filter. Since Laplacian filters use grayscale images, we must convert from OpenCV's default BGR format to grayscale. In your empty cartoon.cpp file, put this code at the top so you can access OpenCV and the C++ standard library without typing cv:: and std:: everywhere:

    // Include OpenCV's C++ Interface 
    #include "opencv2/opencv.hpp" 

    using namespace cv; 
    using namespace std;

Put this and all remaining code in a cartoonifyImage() function in your cartoon.cpp file:

    Mat gray; 
    cvtColor(srcColor, gray, CV_BGR2GRAY); 
    const int MEDIAN_BLUR_FILTER_SIZE = 7; 
    medianBlur(gray, gray, MEDIAN_BLUR_FILTER_SIZE); 
    Mat edges; 
    const int LAPLACIAN_FILTER_SIZE = 5; 
    Laplacian(gray, edges, CV_8U, LAPLACIAN_FILTER_SIZE);
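
To see why a Median filter removes noise while keeping edges sharp, here is a minimal one-dimensional sketch in plain C++. It is only an illustration, not how medianBlur() is implemented: the function name medianFilter1D() is hypothetical, and replicating the border pixels is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Median-filter one row of pixel values using an odd window size.
// Indices outside the row are clamped, which replicates the border pixel.
std::vector<int> medianFilter1D(const std::vector<int>& row, int windowSize) {
    assert(windowSize > 0 && windowSize % 2 == 1);
    const int n = static_cast<int>(row.size());
    const int radius = windowSize / 2;
    std::vector<int> out(row.size());
    std::vector<int> window(windowSize);
    for (int i = 0; i < n; ++i) {
        // Gather the neighbourhood around pixel i.
        for (int k = -radius; k <= radius; ++k) {
            const int j = std::min(std::max(i + k, 0), n - 1);
            window[k + radius] = row[j];
        }
        // Partially sort so that the middle element is the median.
        std::nth_element(window.begin(), window.begin() + radius, window.end());
        out[i] = window[radius];
    }
    return out;
}
```

With a window of 3, the row 10, 10, 200, 10, 10, 100, 100, 100 becomes 10, 10, 10, 10, 10, 100, 100, 100: the isolated spike of 200 is removed, while the step from 10 to 100 stays in the same place and is not blurred.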

The Laplacian filter produces edges with varying brightness, so to make the edges look more like a sketch, we apply a binary threshold to make the edges either white or black:

    Mat mask; 
    const int EDGES_THRESHOLD = 80; 
    threshold(edges, mask, EDGES_THRESHOLD, 255, THRESH_BINARY_INV);
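
The effect of THRESH_BINARY_INV on each pixel can be sketched in plain C++ as follows. This is an illustration only (the function name thresholdBinaryInv() is hypothetical): pixels brighter than the threshold become 0 (black edge lines) and all others become the maximum value (white background).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Inverted binary threshold on a row of 8-bit edge strengths:
// strong edges (above thresh) become 0, everything else becomes maxValue.
std::vector<std::uint8_t> thresholdBinaryInv(const std::vector<std::uint8_t>& edges,
                                             std::uint8_t thresh,
                                             std::uint8_t maxValue) {
    std::vector<std::uint8_t> mask(edges.size());
    for (std::size_t i = 0; i < edges.size(); ++i) {
        mask[i] = (edges[i] > thresh) ? static_cast<std::uint8_t>(0) : maxValue;
    }
    return mask;
}
```

With the threshold of 80 used above, edge strengths of 0 and 80 map to 255, whereas 81 and 255 map to 0, so only the stronger edges survive as black lines.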

In the following figure, you see the original image (to the left) and the generated edge mask (to the right) that looks similar to a sketch drawing. After we generate a color painting (explained later), we also put this edge mask on top to have black line drawings:

Generating a color painting and a cartoon

A strong Bilateral filter smooths flat regions while keeping edges sharp, and is therefore great as an automatic cartoonifier or painting filter, except that it is extremely slow (that is, measured in seconds or even minutes, rather than milliseconds!). Therefore, we will use some tricks to obtain a nice cartoonifier that still runs at an acceptable speed. The most important trick is that we can perform Bilateral filtering at a lower resolution and it will still have a similar effect to filtering at full resolution, but run much faster. Let's reduce the total number of pixels by a factor of four (that is, half the width and half the height):

    Size size = srcColor.size(); 
    Size smallSize; 
    smallSize.width = size.width/2; 
    smallSize.height = size.height/2; 
    Mat smallImg = Mat(smallSize, CV_8UC3); 
    resize(srcColor, smallImg, smallSize, 0,0, INTER_LINEAR);

Rather than applying a large Bilateral filter, we will apply many small Bilateral filters, to produce a strong cartoon effect in less time. We will truncate the filter (see the following figure) so that instead of performing a whole filter (for example, a filter size of 21x21, when the bell curve is 21 pixels wide), it just uses the minimum filter size needed for a convincing result (for example, with a filter size of just 9x9 even if the bell curve is 21 pixels wide). This truncated filter will apply the major part of the filter (gray area) without wasting time on the minor part of the filter (white area under the curve), so it will run several times faster:
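
The trade-off behind truncation can be estimated with a little arithmetic. The sketch below is a rough illustration under stated assumptions, not a model of bilateralFilter() itself: it considers only the spatial (positional) Gaussian weights over a square window, ignores the color weights, and leaves the value of sigma to the caller. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Sum of unnormalized 1D Gaussian weights over an odd window centered at 0.
double gaussianWeightSum(int windowSize, double sigma) {
    assert(windowSize > 0 && windowSize % 2 == 1 && sigma > 0.0);
    const int radius = windowSize / 2;
    double sum = 0.0;
    for (int x = -radius; x <= radius; ++x) {
        sum += std::exp(-(x * x) / (2.0 * sigma * sigma));
    }
    return sum;
}

// Fraction of the full window's spatial weight kept by the truncated window.
// A 2D Gaussian over a square window is separable, so the 1D ratio is squared.
double keptWeightFraction(int truncatedSize, int fullSize, double sigma) {
    const double ratio1D =
        gaussianWeightSum(truncatedSize, sigma) / gaussianWeightSum(fullSize, sigma);
    return ratio1D * ratio1D;
}

// How many times fewer neighbours are visited per output pixel.
double costRatio(int fullSize, int truncatedSize) {
    return static_cast<double>(fullSize * fullSize) /
           static_cast<double>(truncatedSize * truncatedSize);
}
```

For the sizes mentioned above, costRatio(21, 9) is 441 / 81, or roughly 5.4, which matches the "several times faster" estimate. keptWeightFraction(9, 21, sigma) shows how much of the spatial weight the 9x9 window retains for a chosen sigma.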

Therefore, we have four parameters that control the Bilateral filter: color strength, positional strength, size, and repetition count. We need a temp Mat since the bilateralFilter() function can't overwrite its input (referred to as in-place processing), but we can apply one filter that stores its output in a temp Mat and another filter that stores its output back in the input:

    Mat tmp = Mat(smallSize, CV_8UC3); 
    int repetitions = 7;  // Repetitions for strong cartoon effect. 
    for (int i=0; i<repetitions; i++) { 
      int ksize = 9;     // Filter size. Has large effect on speed.  
      double sigmaColor = 9;    // Filter color strength. 
      double sigmaSpace = 7;    // Spatial strength. Affects speed. 
      bilateralFilter(smallImg, tmp, ksize, sigmaColor, sigmaSpace);

      bilateralFilter(tmp, smallImg, ksize, sigmaColor, sigmaSpace);
    }

Remember that this was applied to the shrunken image, so we need to expand the image back to the original size. Then we can overlay the edge mask that we found earlier. To overlay the edge mask sketch onto the Bilateral filter painting (left side of the following figure), we can start with a black background and copy the painting pixels that aren't edges in the sketch mask:

    Mat bigImg; 
    resize(smallImg, bigImg, size, 0,0, INTER_LINEAR); 
    dst.setTo(0); 
    bigImg.copyTo(dst, mask);

The result is a cartoon version of the original photo, as shown on the right side of the following figure, where the sketch mask is overlaid on the painting:
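
The overlay step (setting the destination to black and then copying through the mask) can be sketched on a single row of pixels in plain C++. This is an illustration only; the function name copyWithMask() is hypothetical, and a single channel is used for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Start from a black row and copy only the painting pixels whose mask value
// is non-zero. Pixels where the mask is 0 (the sketch edges) stay black.
std::vector<int> copyWithMask(const std::vector<int>& painting,
                              const std::vector<std::uint8_t>& mask) {
    assert(painting.size() == mask.size());
    std::vector<int> dst(painting.size(), 0);  // Black background.
    for (std::size_t i = 0; i < painting.size(); ++i) {
        if (mask[i] != 0) {
            dst[i] = painting[i];
        }
    }
    return dst;
}
```

Because the edge mask was produced with THRESH_BINARY_INV, edges are 0 in the mask, so they are exactly the pixels that remain black in the result.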

Generating an evil mode using edge filters

Cartoons and comics always have both good and bad characters. With the right combination of edge filters, a scary image can be generated from the most innocent-looking people! The trick is to use a small edge filter that will find many edges all over the image, then merge the edges using a small Median filter.

We will perform this on a grayscale image with some noise reduction, so the preceding code for converting the original image to grayscale and applying a 7x7 Median filter should still be used (the first image in the following figure shows the output of the grayscale Median blur). Instead of following it with a Laplacian filter and binary threshold, we can get a scarier look if we apply a 3x3 Scharr gradient filter along x and y (second image in the figure), then a binary threshold with a very low cutoff (third image in the figure), and a 3x3 Median blur, producing the final evil mask (fourth image in the figure):

    Mat gray;
    cvtColor(srcColor, gray, CV_BGR2GRAY);
    const int MEDIAN_BLUR_FILTER_SIZE = 7;
    medianBlur(gray, gray, MEDIAN_BLUR_FILTER_SIZE);
    Mat edges, edges2;
    Scharr(gray, edges, CV_8U, 1, 0);     // Gradient along x.
    Scharr(gray, edges2, CV_8U, 0, 1);    // Gradient along y.
    // Combine the x & y edges together.
    edges += edges2;
    Mat mask;
    const int EVIL_EDGE_THRESHOLD = 12;
    threshold(edges, mask, EVIL_EDGE_THRESHOLD, 255, THRESH_BINARY_INV);
    medianBlur(mask, mask, 3);

Now that we have an evil mask, we can overlay this mask onto the cartoonified painting image like we did with the regular sketch edge mask. The final result is shown on the right side of the following figure:

Generating an alien mode using skin detection

Now that we have a sketch mode, a cartoon mode (painting + sketch mask), and an evil mode (painting + evil mask), for fun, let's try something more complex: an alien mode, by detecting the skin regions of the face and then changing the skin color to green.

Skin detection algorithm

There are many different techniques used for detecting skin regions, from simple color thresholds using RGB (Red-Green-Blue), HSV (Hue-Saturation-Brightness) values, or color histogram calculation and re-projection, to complex machine-learning algorithms of mixture models that need camera calibration in the CIELab color-space and offline training with many sample faces, and so on. But even the complex methods don't necessarily work robustly across various camera and lighting conditions and skin types. Since we want our skin detection to run on an embedded device, without any calibration or training, and we are just using skin detection for a fun image filter, it is sufficient for us to use a simple skin detection method. However, the color responses from the tiny camera sensor in the Raspberry Pi Camera Module tend to vary significantly, and we want to support skin detection for people of any skin color but without any calibration, so we need something more robust than simple color thresholds.

For example, a simple HSV skin detector can treat any pixel as skin if its hue color is fairly red, and saturation is fairly high but not extremely high, and its brightness is not too dark or extremely bright. But cameras in mobile phones or Raspberry Pi Camera Modules often have bad white balancing, therefore a person's skin might look slightly blue instead of red, and so on, and this would be a major problem for simple HSV thresholding.
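
Such a simple HSV test could look like the following sketch in plain C++. Every threshold here is a hypothetical value chosen only to illustrate the idea, not a tuned or recommended setting, and the ranges assume OpenCV's 8-bit HSV convention (hue 0 to 179, saturation and brightness 0 to 255).

```cpp
// One pixel in 8-bit HSV form (assumed ranges: h 0..179, s and v 0..255).
struct HsvPixel {
    int h;
    int s;
    int v;
};

// Hypothetical thresholds, for illustration only.
bool looksLikeSkin(const HsvPixel& p) {
    // Red sits at both ends of the hue circle, so accept either end.
    const bool hueIsReddish = (p.h <= 14) || (p.h >= 165);
    // Fairly high saturation, but not extremely high.
    const bool saturationOk = (p.s >= 40) && (p.s <= 200);
    // Not too dark and not extremely bright.
    const bool brightnessOk = (p.v >= 60) && (p.v <= 245);
    return hueIsReddish && saturationOk && brightnessOk;
}
```

A pixel with a reddish hue such as (8, 120, 180) passes, but the same skin rendered with a bluish hue by a badly white-balanced camera, such as (110, 120, 180), is rejected, which is exactly the weakness described above.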

A more robust solution is to perform face detection with a Haar or LBP cascade classifier (shown in Chapter 6, Face Recognition using Eigenfaces or Fisherfaces), then look at the range of colors for the pixels in the middle of the detected face, since you know that those pixels should be skin pixels of the actual person. You could then scan the whole image or nearby region for pixels of a similar color to the center of the face. This has the advantage that it is very likely to find at least some of the true skin region of any detected person, no matter what their skin color is or even if their skin appears somewhat bluish or reddish in the camera image.
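
The idea of learning the color range from the middle of the face and then looking for similar pixels can be sketched in plain C++ as follows. This is a simplified illustration, not the method used later in the chapter: the type and function names are hypothetical, and using a per-channel minimum and maximum with a fixed tolerance is an assumption of this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

typedef std::array<int, 3> Pixel;  // Three color channels, 0..255 each.

struct ColorRange {
    Pixel lo;  // Per-channel minimum of the samples.
    Pixel hi;  // Per-channel maximum of the samples.
};

// Measure the range of colors in pixels sampled from the center of the face.
ColorRange colorRangeOf(const std::vector<Pixel>& samples) {
    assert(!samples.empty());
    ColorRange range;
    range.lo = samples.front();
    range.hi = samples.front();
    for (const Pixel& p : samples) {
        for (int c = 0; c < 3; ++c) {
            range.lo[c] = std::min(range.lo[c], p[c]);
            range.hi[c] = std::max(range.hi[c], p[c]);
        }
    }
    return range;
}

// True if every channel lies within the sampled range, widened by tolerance.
bool isSimilarColor(const Pixel& p, const ColorRange& range, int tolerance) {
    for (int c = 0; c < 3; ++c) {
        if (p[c] < range.lo[c] - tolerance || p[c] > range.hi[c] + tolerance) {
            return false;
        }
    }
    return true;
}
```

Because the range is measured from the person actually in front of the camera, the test adapts to their skin color and to any color cast in the image, instead of relying on fixed thresholds.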

Unfortunately, face detection using cascade classifiers is quite slow on current embedded devices, so that method might be less ideal for some real-time embedded applications. On the other hand, we can take advantage of the fact that for mobile apps and some embedded systems, it can be expected that the user will be facing the camera directly from a very close distance, so it can be reasonable to ask the user to place their face at a specific location and distance, rather than try to detect the location and size of their face. This is the basis of many mobile phone apps, where the app asks the user to place their face at a certain position or perhaps to manually drag points on the screen to show where the corners of their face are in a photo. So let's simply draw the outline of a face in the center of the screen, and ask the user to move their face to the shown position and size.

Showing the user where to put their face

When the alien mode is first started, we will draw the face outline on top of the camera frame so the user knows where to put their face. We will draw a big ellipse covering 70% of the image height, with a fixed aspect ratio of 0.72, so that the face will not become too skinny or fat depending on the aspect ratio of the camera:

    // Draw the color face onto a black background. 
    Mat faceOutline = Mat::zeros(size, CV_8UC3); 
    Scalar color = CV_RGB(255,255,0);    // Yellow. 
    int thickness = 4; 

    // Use 70% of the screen height as the face height. 
    int sw = size.width; 
    int sh = size.height; 
    int faceH = sh/2 * 70/100;  // "faceH" is radius of the ellipse. 

    // Scale the width to be the same nice shape for any screen width.   
    int faceW = faceH * 72/100; 
    // Draw the face outline. 
    ellipse(faceOutline, Point(sw/2, sh/2), Size(faceW, faceH), 
        0, 0, 360, color, thickness, CV_AA);
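
To check the integer arithmetic above, the same calculation can be isolated into a small plain C++ function. The struct and function names are hypothetical; the formulas are copied from the preceding code.

```cpp
// Center and radii (half-axes) of the face ellipse, in pixels.
struct FaceOutline {
    int centerX;
    int centerY;
    int radiusX;  // Same as faceW in the code above.
    int radiusY;  // Same as faceH in the code above.
};

FaceOutline computeFaceOutline(int screenWidth, int screenHeight) {
    FaceOutline f;
    f.centerX = screenWidth / 2;
    f.centerY = screenHeight / 2;
    // The vertical radius is 70% of half the screen height,
    // so the full ellipse height is 70% of the screen height.
    f.radiusY = screenHeight / 2 * 70 / 100;
    // The horizontal radius is 72% of the vertical one, independent of width.
    f.radiusX = f.radiusY * 72 / 100;
    return f;
}
```

For a 640x480 frame this gives a center of (320, 240) with radii of 120 and 168 pixels, so the ellipse is 336 pixels tall (70% of 480). For a 1280x720 frame the radii are 181 and 252 pixels, which is the same shape despite the wider screen.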

To make it more obvious that it is a face, let's also draw two eye outlines. Rather than drawing an eye as an ellipse, we can give it a bit more realism (see the following figure) by drawing a truncated ellipse for the top of the eye and a truncated ellipse for the bottom of the eye, because we can specify the start and end angles when drawing with the ellipse() function:

    // Draw the eye outlines, as 2 arcs per eye. 
    int eyeW = faceW * 23/100; 
    int eyeH = faceH * 11/100; 
    int eyeX = faceW * 48/100; 
    int eyeY = faceH * 13/100; 
    Size eyeSize = Size(eyeW, eyeH); 

    // Set the angle and shift for the eye half ellipses. 
    int eyeA = 15; // angle in degrees. 
    int eyeYshift = 11; 

    // Draw the top of the right eye. 
    ellipse(faceOutline, Point(sw/2 - eyeX, sh/2 - eyeY), 
      eyeSize, 0, 180+eyeA, 360-eyeA, color, thickness, CV_AA); 

    // Draw the bottom of the right eye. 
    ellipse(faceOutline, Point(sw/2 - eyeX, sh/2 - eyeY - eyeYshift), 
      eyeSize, 0, 0+eyeA, 180-eyeA, color, thickness, CV_AA); 

    // Draw the top of the left eye. 
    ellipse(faceOutline, Point(sw/2 + eyeX, sh/2 - eyeY), 
      eyeSize, 0, 180+eyeA, 360-eyeA, color, thickness, CV_AA); 

    // Draw the bottom of the left eye. 
    ellipse(faceOutline, Point(sw/2 + eyeX, sh/2 - eyeY - eyeYshift), 
      eyeSize, 0, 0+eyeA, 180-eyeA, color, thickness, CV_AA);
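
It may not be obvious why the angles 180 to 360 give the top of an eye and 0 to 180 the bottom. The plain C++ sketch below assumes the usual parametric form of an unrotated ellipse, x = cx + a cos(t) and y = cy + b sin(t), together with image coordinates where y grows downwards. The names are hypothetical and the code does not call OpenCV.

```cpp
#include <cmath>

struct PixelPoint {
    int x;
    int y;
};

// Point on an unrotated ellipse at the given angle, in image coordinates
// (x to the right, y downwards), rounded to the nearest pixel.
PixelPoint pointOnEllipse(PixelPoint center, int halfWidth, int halfHeight,
                          double angleDegrees) {
    const double pi = std::acos(-1.0);
    const double t = angleDegrees * pi / 180.0;
    PixelPoint p;
    p.x = center.x + static_cast<int>(std::lround(halfWidth * std::cos(t)));
    p.y = center.y + static_cast<int>(std::lround(halfHeight * std::sin(t)));
    return p;
}
```

For an ellipse centered at (263, 219) with half-axes 27 and 18, the point at 270 degrees is (263, 201), which is above the center on screen, while the point at 90 degrees is (263, 237), below it. Angles between 180 and 360 therefore trace the upper arc, and angles between 0 and 180 trace the lower arc.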

We can do the same to draw the bottom lip of the mouth:

    // Draw the bottom lip of the mouth. 
    int mouthY = faceH * 48/100; 
    int mouthW = faceW * 45/100; 
    int mouthH = faceH * 6/100; 
    ellipse(faceOutline, Point(sw/2, sh/2 + mouthY), Size(mouthW, 
      mouthH), 0, 0, 180, color, thickness, CV_AA);

To make it even more obvious that the user should put their face where shown, let's write a message on the screen:

    // Draw anti-aliased text. 
    int fontFace = FONT_HERSHEY_COMPLEX; 
    float fontScale = 1.0f; 
    int fontThickness = 2; 
    const char *szMsg = "Put your face here"; 
    putText(faceOutline, szMsg, Point(sw * 23/100, sh * 10/100), 
      fontFace, fontScale, color, fontThickness, CV_AA);

Now that we have the face outline drawn, we can overlay it onto the displayed image by using alpha blending, to combine the cartoonified image with this drawn outline:

    addWeighted(dst, 1.0, faceOutline, 0.7, 0, dst, CV_8UC3);

This results in the outline in the following figure, showing the user where to put their face, so we don't have to detect the face location:
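
The weighted sum performed by addWeighted() can be sketched per channel in plain C++. This is an illustration under the assumption that the result is rounded and clamped to the 8-bit range 0 to 255; the function name blendChannel() is hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// One channel of dst = base * alpha + overlay * beta + gamma,
// rounded to the nearest integer and clamped to 0..255.
int blendChannel(int base, int overlay, double alpha, double beta, double gamma) {
    const double value = base * alpha + overlay * beta + gamma;
    const long rounded = std::lround(value);
    return static_cast<int>(std::min(255L, std::max(0L, rounded)));
}
```

With the weights used above (alpha of 1.0, beta of 0.7, and gamma of 0), a pixel where the outline is black is left unchanged, for example blendChannel(100, 0, 1.0, 0.7, 0) is 100. Where the outline is drawn, its color is added on top at 70% strength, for example blendChannel(50, 200, 1.0, 0.7, 0) is 190, and the result is clamped at 255 if the sum overflows.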