Book Image

Mastering OpenCV 4 - Third Edition

By : Roy Shilkrot, David Millán Escrivá
Book Image

Mastering OpenCV 4 - Third Edition

By: Roy Shilkrot, David Millán Escrivá

Overview of this book

Mastering OpenCV, now in its third edition, targets computer vision engineers taking their first steps toward mastering OpenCV. Keeping the mathematical formulations to a solid but bare minimum, the book delivers complete projects from ideation to running code, targeting current hot topics in computer vision such as face recognition, landmark detection and pose estimation, and number recognition with deep convolutional networks. You’ll learn from experienced OpenCV experts how to implement computer vision products and projects both in academia and industry in a comfortable package. You’ll get acquainted with API functionality and gain insights into design choices in a complete computer vision project. You’ll also go beyond the basics of computer vision to implement solutions for complex image processing projects. By the end of the book, you will have created various working prototypes with the help of projects in the book and be well versed with the new features of OpenCV4.
Table of Contents (12 chapters)

Implementation of the skin color changer

Rather than detecting the skin color and then the region with that skin color, we can use OpenCV's floodFill() function, which is similar to the bucket fill tool in most image editing software. We know that the regions in the middle of the screen should be skin pixels (since we asked the user to put their face in the middle), so to change the whole face to have green skin, we can just apply a green flood fill on the center pixel, which will always color some parts of the face green. In reality, the color, saturation, and brightness are likely to be different in different parts of the face, so a flood fill will rarely cover all the skin pixels of a face unless the threshold is so low that it also covers unwanted pixels outside of the face. So instead of applying a single flood fill in the center of the image, let's apply a flood fill on six different points around the face that should be skin pixels.

A nice feature of OpenCV's floodFill() is that it can draw the flood fill in an external image rather than modify the input image. So, this feature can give us a mask image for adjusting the color of the skin pixels without necessarily changing the brightness or saturation, producing a more realistic image than if all the skin pixels became an identical green pixel (losing significant face detail).

Skin color changing does not work so well in the RGB color space, because you want to allow brightness to vary in the face but not allow skin color to vary much, and RGB does not separate brightness from the color. One solution is to use the HSV color space since it separates the brightness from the color (hue) as well as the colorfulness (Saturation). Unfortunately, HSV wraps the hue value around red, and since the skin is mostly red, it means that you need to work both with hue < 10% and hue > 90% since these are both red. So, instead we will use the Y'CrCb color space (the variant of YUV that is in OpenCV), since it separates brightness from color and only has a single range of values for typical skin color rather than two. Note that most cameras, images, and videos actually use some type of YUV as their color space before conversion to RGB, so in many cases you can get a YUV image free without converting it yourself.

Since we want our alien mode to look like a cartoon, we will apply the alien filter after the image has already been cartoonified. In other words, we have access to the shrunken color image produced by the bilateral filter, and access to the full-sized edge mask. Skin detection often works better at low resolutions, since it is the equivalent of analyzing the average value of each high-resolution pixel's neighbors (or the low-frequency signal instead of the high-frequency noisy signal). So, let's work at the same shrunken scale as the bilateral filter (half-width and half-height). Let's convert the painting image to YUV:

Mat yuv = Mat(smallSize, CV_8UC3);
cvtColor(smallImg, yuv, CV_BGR2YCrCb);

We also need to shrink the edge mask so it is on the same scale as the painting image. There is a complication with OpenCV's floodFill() function, when storing to a separate mask image, in that the mask should have a one-pixel border around the whole image, so if the input image is W x H pixels in size, then the separate mask image should be (W+2) x (H+2) pixels in size. But the floodFill() function also allows us to initialize the mask with edges that the flood fill algorithm will ensure it does not cross. Let's use this feature, in the hope that it helps prevent the flood fill from extending outside of the face. So, we need to provide two mask images: one is the edge mask of W x H in size, and the other image is the exact same edge mask but (W+2) x (H+2) in size because it should include a border around the image. It is possible to have multiple cv::Mat objects (or headers) referencing the same data, or even to have a cv::Mat object that references a sub-region of another cv::Mat image. So, instead of allocating two separate images and copying the edge mask pixels across, let's allocate a single mask image including the border, and create an extra cv::Mat header of W x H (which just references the region of interest in the flood fill mask without the border). In other words, there is just one array of pixels of size (W+2) x (H+2) but two cv::Mat objects, where one is referencing the whole (W+2) x (H+2) image and the other is referencing the W x H region in the middle of that image:

auto sw = smallSize.width;
auto sh = smallSize.height;
Mat mask, maskPlusBorder;
maskPlusBorder = Mat::zeros(sh+2, sw+2, CV_8UC1);
mask = maskPlusBorder(Rect(1,1,sw,sh));
// mask is now in maskPlusBorder.
resize(edges, mask, smallSize); // Put edges in both of them.

The edge mask (shown on the left of the following diagram) is full of both strong and weak edges, but we only want strong edges, so we will apply a binary threshold (resulting in the middle image in the following diagram). To join some gaps between edges, we will then combine the morphological operators dilate() and erode() to remove some gaps (also referred to as the close operator), resulting in the image on the right:

const int EDGES_THRESHOLD = 80;
threshold(mask, mask, EDGES_THRESHOLD, 255, THRESH_BINARY);
dilate(mask, mask, Mat());
erode(mask, mask, Mat());

We can see the result of applying thresholding and morphological operation in the following image, first image is the input edge map, second the thresholding filter, and last image is the dilate and erode morphological filters:

As mentioned earlier, we want to apply flood fills in numerous points around the face, to make sure we include the various colors and shades of the whole face. Let's choose six points around the nose, cheeks, and forehead, as shown on the left-hand side of the following screenshot. Note that these values are dependent on the face outline being drawn earlier:

auto const NUM_SKIN_POINTS = 6;
Point skinPts[NUM_SKIN_POINTS];
skinPts[0] = Point(sw/2, sh/2 - sh/6);
skinPts[1] = Point(sw/2 - sw/11, sh/2 - sh/6);
skinPts[2] = Point(sw/2 + sw/11, sh/2 - sh/6);
skinPts[3] = Point(sw/2, sh/2 + sh/16);
skinPts[4] = Point(sw/2 - sw/9, sh/2 + sh/16);
skinPts[5] = Point(sw/2 + sw/9, sh/2 + sh/16);

Now, we just need to find some good lower and upper bounds for the flood fill. Remember that this is being performed in the Y'CrCb color space, so we basically decide how much the brightness can vary, how much the red component can vary, and how much the blue component can vary. We want to allow the brightness to vary a lot, to include shadows as well as highlights and reflections, but we don't want the colors to vary much at all:

const int LOWER_Y = 60;
const int UPPER_Y = 80;
const int LOWER_Cr = 25;
const int UPPER_Cr = 15;
const int LOWER_Cb = 20;
const int UPPER_Cb = 15;
Scalar lowerDiff = Scalar(LOWER_Y, LOWER_Cr, LOWER_Cb);
Scalar upperDiff = Scalar(UPPER_Y, UPPER_Cr, UPPER_Cb);

We will use the floodFill() function with its default flags, except that we want to store to an external mask, so we must specify FLOODFILL_MASK_ONLY:

const int CONNECTED_COMPONENTS = 4; // To fill diagonally, use 8.
Mat edgeMask = mask.clone(); // Keep a copy of the edge mask.
// "maskPlusBorder" is initialized with edges to block floodFill().
for (int i = 0; i < NUM_SKIN_POINTS; i++) {
floodFill(yuv, maskPlusBorder, skinPts[i], Scalar(), NULL,
lowerDiff, upperDiff, flags);

The following image on the left-hand side shows the six flood fill locations (shown as circles), and the right-hand side of the image shows the external mask that is generated, where the skin is shown as gray and edges are shown as white. Note that the right-hand image was modified for this book so that skin pixels (of value 1) are clearly visible:

The mask image (shown on the right-hand side of the preceding image) now contains the following:

  • Pixels of value 255 for the edge pixels
  • Pixels of value 1 for the skin regions
  • Pixels of value 0 for the rest

Meanwhile, edgeMask just contains edge pixels (as value 255). So to get just the skin pixels, we can remove the edges from it:

mask -= edgeMask;

The mask variable now just contains 1s for skin pixels and 0s for non-skin pixels. To change the skin color and brightness of the original image, we can use the cv::add() function with the skin mask to increase the green component in the original BGR image:

auto Red = 0;
auto Green = 70;
auto Blue = 0;
add(smallImgBGR, CV_RGB(Red, Green, Blue), smallImgBGR, mask);

The following diagram shows the original image on the left and the final alien cartoon image on the right, where at least six parts of the face will now be green!

Notice that we have made the skin look green but also brighter (to look like an alien that glows in the dark). If you want to just change the skin color without making it brighter, you can use other color changing methods, such as adding 70 to green while subtracting 70 from red and blue, or convert to the HSV color space using cvtColor(src, dst, "CV_BGR2HSV_FULL") and adjust the hue and saturation.

Reducing the random pepper noise from the sketch image

Most of the tiny cameras in smartphones, Raspberry Pi Camera Modules, and some webcams have significant image noise. This is normally acceptable, but it has a big effect on our 5 x 5 Laplacian edge filter. The edge mask (shown as the sketch mode) will often have thousands of small blobs of black pixels called pepper noise, made of several black pixels next to each other on a white background. We are already using a median filter, which is usually strong enough to remove pepper noise, but in our case it may not be strong enough. Our edge mask is mostly a pure white background (value of 255) with some black edges (value of 0) and the dots of noise (also value of 0). We could use a standard closing morphological operator, but it will remove a lot of edges. So instead, we will apply a custom filter that removes small black regions that are surrounded completely by white pixels. This will remove a lot of noise while having little effect on actual edges.

We will scan the image for black pixels, and at each black pixel, we'll check the border of the 5 x 5 square around it to see if all the 5 x 5 border pixels are white. If they are all white, then we know we have a small island of black noise, so then we fill the whole block with white pixels to remove the black island. For simplicity in our 5 x 5 filter, we will ignore the two border pixels around the image and leave them as they are.

The following diagram shows the original image from an Android tablet on the left side, with a sketch mode in the center, showing small black dots of pepper noise and the result of our pepper noise removal shown on the right-hand side, where the skin looks cleaner:

The following code can be named the removePepperNoise() function to edit the image in place for simplicity:

void removePepperNoise(Mat &mask)
for (int y=2; y<mask.rows-2; y++) {
// Get access to each of the 5 rows near this pixel.
uchar *pUp2 = mask.ptr(y-2);
uchar *pUp1 = mask.ptr(y-1);
uchar *pThis = mask.ptr(y);
uchar *pDown1 = mask.ptr(y+1);
uchar *pDown2 = mask.ptr(y+2);

// Skip the first (and last) 2 pixels on each row.
pThis += 2;
pUp1 += 2;
pUp2 += 2;
pDown1 += 2;
pDown2 += 2;
for (auto x=2; x<mask.cols-2; x++) {
uchar value = *pThis; // Get pixel value (0 or 255).
// Check if it's a black pixel surrounded bywhite
// pixels (ie: whether it is an "island" of black).
if (value == 0) {
bool above, left, below, right, surroundings;
above = *(pUp2 - 2) && *(pUp2 - 1) && *(pUp2) && *(pUp2 + 1)
&& *(pUp2 + 2);
left = *(pUp1 - 2) && *(pThis - 2) && *(pDown1 - 2);
below = *(pDown2 - 2) && *(pDown2 - 1) && (pDown2) &&
(pDown2 + 1) && *(pDown2 + 2);
right = *(pUp1 + 2) && *(pThis + 2) && *(pDown1 + 2);
surroundings = above && left && below && right;
if (surroundings == true) {
// Fill the whole 5x5 block as white. Since we
// knowthe 5x5 borders are already white, we just
// need tofill the 3x3 inner region.
*(pUp1 - 1) = 255;
*(pUp1 + 0) = 255;
*(pUp1 + 1) = 255;
*(pThis - 1) = 255;
*(pThis + 0) = 255;
*(pThis + 1) = 255;
*(pDown1 - 1) = 255;
*(pDown1 + 0) = 255;
*(pDown1 + 1) = 255;
// Since we just covered the whole 5x5 block with
// white, we know the next 2 pixels won't be
// black,so skip the next 2 pixels on the right.
pThis += 2;
pUp1 += 2;
pUp2 += 2;
pDown1 += 2;
pDown2 += 2;
// Move to the next pixel on the right.

That's all! Run the app in the different modes until you are ready to port it to the embedded device!