
Linear transformations


As mentioned previously, we will be discussing two broad categories of grayscale transformations: linear and logarithmic. We will start with the linear ones.

When it comes to linear transformations, two types are widely discussed:

  • Identity transformation

  • Negative transformation

In theory, you can make up as many arbitrary linear transformations as you want, but for the purposes of this book, we will restrict ourselves to just these two.

Identity transformation

The identity transformation maps each input pixel to itself in the output. In other words:

T(r)=r

Obviously, this does nothing exciting. In fact, this transformation doesn't do anything at all! The output image is the same as the input image, because every pixel gets mapped to itself in the transformation. Nevertheless, we discuss it here for the sake of completeness.

Implementing a lookup table for identity transformations shouldn't be a hassle at all:

vector<uchar> getIdentityLUT() { 
  // One entry for each of the 256 possible 8-bit intensity values
  vector<uchar> LUT(256, 0); 
  for (int i = 0; i < 256; ++i) 
    LUT[i] = (uchar)i;  // identity: each intensity maps to itself
  return LUT; 
} 

The first line of the function declares and initializes the C++ vector that is going to serve as our lookup table; the function then fills it in and returns it. We discussed earlier (while talking about lookup tables) that the size of the LUT will, in practically all cases, be 256, one entry for each of the 256 intensity values in a grayscale image. The for loop traverses the LUT and encodes the transformation. Notice that LUT[i] = i maps every input pixel to itself, thereby implementing the identity transformation.

As stated earlier, one of the secondary benefits of using a lookup table is that it modularizes your code and makes it cleaner. The preceding snippet that we showed for identity transformations only computes and returns the lookup table. You can use a matrix traversal method after this to actually apply the transformation to all the pixels of an image. In fact, we demonstrated this framework in our section on Lookup Tables.

void processImage(Mat& I) { 
  vector<uchar> LUT = getIdentityLUT(); 
  for (int i = 0; i < I.rows; ++i) { 
    for (int j = 0; j < I.cols; ++j) 
      I.at<uchar>(i, j) = LUT[I.at<uchar>(i, j)];  // look up the new intensity
  } 
} 
 
int main() { 
  Mat image = imread("lena.jpg", IMREAD_GRAYSCALE); 
  Mat processed_image = image.clone(); 
  processImage(processed_image); 
 
  imshow("Input image", image); 
  imshow("Processed Image", processed_image); 
  waitKey(0); 
 
  return 0; 
} 

Note that while invoking the processImage() method to do our bidding, we have passed it a clone of the matrix that we just read from the input. This is just so that we are able to compare the changes that the processing has made on our input (not that there are any, in this particular case!). From the next transformation onwards, we are not going to write down the full detailed code (along with the processImage() and main()). We'll focus on the computation of the lookup table because that is what varies from one transformation to the next.
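Before moving on, if you want to convince yourself that the identity transformation really does leave the image untouched, here is a quick sanity check of our own (not part of the book's listing) that you could drop into main() after the call to processImage():

// The identity LUT should leave every pixel unchanged, so the 
// per-pixel absolute difference between the two images must be all zeros. 
Mat diff; 
absdiff(image, processed_image, diff); 
cout << "Pixels changed: " << countNonZero(diff) << endl;  // expect 0 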

Negative transformation

The negative transformation subtracts the input pixel intensity value from 255 and produces the result as the output. Mathematically speaking, the negative transformation can be expressed as follows:

s=T(r)=(255-r)

This means that a value of 0 in the input (black) gets mapped to 255 (white) and vice versa. Similarly, lighter shades of gray will yield the corresponding darker shades from the other end of the grayscale spectrum. If the range of the input values lies between 0 and 255, then the output will also lie within the same range. If you aren't convinced, here is some mathematical proof for you:

0 ≤ r ≤ 255 (assuming the input pixels are from an 8-bit grayscale color space)

-255 ≤ -r ≤ 0 (multiplying by -1)

0 ≤ 255 - r ≤ 255 (adding 255)

Some books prefer to express the negative transformation as follows:

s=T(r)=(N-1-r)

Here, N is the number of grayscale levels in the color space we are dealing with. We have decided to stick with the former definition because all the color spaces we'll be dealing with in this book will involve values between 0 and 255. Hence, we can avoid the unnecessary fastidiousness.
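That said, if you are curious, the general form is just as easy to encode. Here is a quick sketch of our own (assuming N is at most 256); it is not a listing from this book:

// Negative transformation for a color space with N grayscale levels. 
// With N = 256, this reduces to the standard 8-bit negative. 
vector<uchar> getGeneralNegativeLUT(int N) { 
  vector<uchar> LUT(256, 0); 
  for (int i = 0; i < N; ++i) 
    LUT[i] = (uchar)(N - 1 - i); 
  return LUT; 
} 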

The implementation of a lookup table for the negative transform is also fairly straightforward. We just replace the i with (255-i):

vector<uchar> getNegativeLUT() { 
  vector<uchar> LUT(256, 0); 
  for (int i = 0; i < 256; ++i) 
    LUT[i] = (uchar)(255 - i); 
  return LUT; 
} 
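As an aside, OpenCV ships with a built-in LUT() function that applies a 256-entry lookup table to an entire image in a single call, and it is usually faster than a hand-rolled pixel loop. Here is a minimal sketch of how you might wrap it; the applyLUT() helper is our own, not part of the library:

// Copies our vector-based table into the 1 x 256 Mat that cv::LUT 
// expects, then applies it to the whole image in one call. 
void applyLUT(const Mat& src, const vector<uchar>& table, Mat& dst) { 
  Mat lut(1, 256, CV_8U); 
  for (int i = 0; i < 256; ++i) 
    lut.at<uchar>(0, i) = table[i]; 
  LUT(src, lut, dst);  // dst(i, j) = lut(src(i, j)) 
} 

With this in place, applyLUT(image, getNegativeLUT(), processed_image); produces the same result as routing the image through our processImage() loop.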

Let's run the code for negative transformation on some images and check what kind of an effect it produces:

The preceding image serves as our input image, and the output (negative-transformed image) is as follows:

Notice that the darker pixels that make up the woman's hair (and the feathers on her cap) have been transformed to white, and so have the eyes. The shoulders, which were on the fairer side of the spectrum, have been turned darker by the transform. You can now start to appreciate the kind of visual changes that this conceptually simple transformation brings about in images! We reiterate, once again, that image manipulation apps, at their most basic, operate on similar principles.

If you are wondering who the lady in the image is, she goes by the name Lena. Rather surprisingly, Lena's photograph, with her iconic pose, has become the de facto standard in the image processing community. A lot of literature in the field of image processing and computer vision uses this photo as an example to demonstrate the workings of some algorithm or technique. So, this is not the first time you'll be seeing her in this book!

Before we finish the section on linear transformations, there is one more thought that I would like to leave you with. You might find some texts out there that like to visualize these transformations graphically, as a plot between the input and output pixel intensities. Such visualizations are useful because not all linear transformations are as simple as the ones discussed here. We will briefly discuss piecewise linear transformations, where a graphical plot provides a convenient medium to analyze the transformation. But before that, you will find such a plot for both the identity and the negative transformation drawn on the same graph:

Both the linear transformations that we have discussed so far--identity and negative--are fairly trivial ones. You can have much more complicated forms for s=T(r). For example, instead of the transformation function being linear throughout the entire domain (0 to 255), we can make it piecewise linear. That would mean splitting the input domain into multiple, contiguous ranges and defining a linear transformation for each range, something along the lines of this:

If you are wondering what purpose such a transformation achieves, the answer is contrast enhancement. When we say that an image has poor contrast, we mean that some (or all) parts of the image are not clearly distinguishable from their surroundings and neighbors. This happens when the pixels that make up that part of the image all belong to a very narrow band of intensity values. In the graph, consider the portion of the input intensity values that lies around L/2. You will notice that this narrow range of input pixels is mapped to a much wider range of output intensity values. This is made possible by the steep line (greater slope) that defines the linear transformation around that region. As a result, all pixels with intensity values within that range get mapped to the wider output range, thereby improving the contrast of the image.

Now, astute readers might have realized that the shape of the piecewise linear transformation depends on the position of the two points, (r1, s1) and (r2, s2). So, how do we decide the location of these two points? Unfortunately, there is no single, specific answer to this question. As you will learn throughout the course of this book, in most cases there is no globally correct answer for the selection of such parameters in image processing or computer vision. It depends on the kind of data (within the domain of computer vision, data often equates to images) that we are given to work with.

It would be a good exercise to try and implement the lookup table for a piecewise linear transformation. You can control the shape of the curve by varying the positions of the two points, and see what kind of an effect this has on the output image.
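If you would like a starting point for that exercise, here is one possible sketch of our own. It assumes 0 < r1 < r2 < 255 and 0 ≤ s1 ≤ s2 ≤ 255, and the control points (r1, s1) and (r2, s2) are the ones from the preceding discussion:

// Piecewise linear LUT defined by two control points (r1, s1) and (r2, s2). 
// The three segments map [0, r1] to [0, s1], [r1, r2] to [s1, s2], and 
// [r2, 255] to [s2, 255], each with its own slope. 
vector<uchar> getPiecewiseLinearLUT(int r1, int s1, int r2, int s2) { 
  vector<uchar> LUT(256, 0); 
  for (int i = 0; i < 256; ++i) { 
    double s; 
    if (i < r1) 
      s = (double)s1 / r1 * i; 
    else if (i < r2) 
      s = s1 + (double)(s2 - s1) / (r2 - r1) * (i - r1); 
    else 
      s = s2 + (double)(255 - s2) / (255 - r2) * (i - r2); 
    LUT[i] = (uchar)s; 
  } 
  return LUT; 
} 

Note that choosing s1 = r1 and s2 = r2 recovers the identity transformation, while pushing s1 down and s2 up steepens the middle segment, which is exactly what stretches the contrast.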