
Learning OpenCV 3 Application Development

By: Samyak Datta

Overview of this book

Computer vision and machine learning concepts are frequently used in practical computer vision-based projects. If you’re a novice, this book provides the steps to build and deploy an end-to-end application in the domain of computer vision using OpenCV/C++. At the outset, we explain how to install OpenCV and demonstrate how to run some simple programs. You will start with images (the building blocks of image processing applications), and see how they are stored and processed by OpenCV. You’ll get comfortable with OpenCV-specific jargon (Mat, Point, Scalar, and more), and get to know how to traverse images and perform basic pixel-wise operations. Building upon this, we introduce slightly more advanced image processing concepts such as filtering, thresholding, and edge detection. In the latter parts, the book touches upon more complex and ubiquitous concepts such as face detection (using Haar cascade classifiers), interest point detection algorithms, and feature descriptors. You will now begin to appreciate the true power of the library in how it reduces mathematically non-trivial algorithms to a single line of code! The concluding sections touch upon OpenCV’s Machine Learning module. You will witness not only how OpenCV helps you pre-process and extract features from images that are relevant to the problems you are trying to solve, but also how to use Machine Learning algorithms that work on these features to make intelligent predictions from visual data!

Digital image basics


Digital images are composed of a two-dimensional grid of pixels. These pixels can be thought of as the most fundamental and basic building blocks of images. When you view an image, either in its printed form on paper or in its digital format on computer screens, televisions, and mobile phones, what you see is a dense cluster of pixels arranged in a two-dimensional grid of rows and columns. Our eyes are of course not able to differentiate one individual pixel from its neighbor, and hence, images appear continuous to us. But, in reality, every image is composed of thousands and sometimes millions of discrete pixels.

Every single one of these pixels carries some information, and the sum total of all this information makes up the entire image and helps us see the bigger picture. Some of the pixels are light, some are dark, and each is colored with a different hue. There are grayscale images, which are commonly known as black and white images. We will avoid the use of the latter phrase because, in image processing jargon, black and white refers to something else altogether. It does not take an expert to deduce that colored images hold a lot more visual detail than their grayscale counterparts.

So, what pieces of information do these individual, tiny pixels store that enable them to create the images that they are a part of? How does a grayscale image differ from a colored one? Where do the colors come from? How many of them are there? Let's answer all these questions one by one.

Pixel intensities

There are countless sophisticated instruments that aid us in the process of acquiring images from nature. At the most basic level, they work by capturing light rays as they enter through the aperture of the instrument's lens and fall on a photographic plate. Depending on the orientation, illumination, and other parameters of the photo-capturing device, the amount of light that falls on each spatial coordinate of the film differs. This variation in the intensity of light falling on the film is encoded as pixel values when the image is stored in a digital format. Therefore, the information stored by a pixel is nothing more than a quantitative measure of the intensity of light that illuminated that particular spatial coordinate while the image was being captured. What this essentially means is that any image that you see, when represented digitally, is reduced to a two-dimensional grid of values where each pixel in the image is assigned a numerical value that is directly proportional to the intensity of light falling on that pixel in the natural image.

Color depth and color spaces

Now we come to the issue of encoding light intensity in pixel values. If you have studied a programming language before, you might be aware that the range and the type of values that you can store in any data structure are closely linked to the data type. A single bit can represent two values: 0 and 1. Eight bits (also known as a byte) can accommodate 256 different values. Going further along, an int data type (represented using 32 bits on most architectures) has the capacity to represent roughly 4.29 billion different values. Extending the same logic to digital images, the range of values that can be used to represent the pixel intensities depends on the data type we select for storing the image. In the world of image processing, the term color space or color depth is used in place of data type.
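As a quick illustration of the relationship between the number of bits and the number of representable values, here is a minimal C++ sketch (standard library only, no OpenCV) that prints the counts mentioned above:

#include <cstdint>
#include <iostream>
#include <limits>

int main() {
    // 8 bits (a byte) can represent 2^8 = 256 distinct values: 0 to 255.
    std::cout << "8-bit maximum: "
              << static_cast<int>(std::numeric_limits<std::uint8_t>::max())
              << ", number of values: " << (1u << 8) << '\n';

    // 32 bits can represent 2^32 (roughly 4.29 billion) distinct values.
    std::cout << "32-bit maximum: "
              << std::numeric_limits<std::uint32_t>::max() << '\n';
    return 0;
}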

The most common and simplest color space for representing images uses 8 bits to represent the value of each pixel. This means that each pixel can have any value between 0 and 255 (inclusive). Images that use such a color space are called grayscale images. By convention, 0 represents black, 255 represents white, and each of the other values between 0 and 255 stands for a different shade of gray. The following figure demonstrates such an 8-bit color space: as we move from left to right, the grayscale values in the image gradually change from 0 to 255.
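A ramp like the one the figure describes is easy to generate yourself. The following is a minimal sketch (the image dimensions and window title are arbitrary choices) that builds a 256-column, 8-bit grayscale image whose values increase from 0 on the left to 255 on the right:

#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>

int main() {
    // A 100x256, single-channel, 8-bit image (CV_8UC1): one column per gray level.
    cv::Mat ramp(100, 256, CV_8UC1);
    for (int row = 0; row < ramp.rows; ++row) {
        for (int col = 0; col < ramp.cols; ++col) {
            // The column index doubles as the intensity: 0 (black) on the far
            // left, gradually increasing to 255 (white) on the far right.
            ramp.at<uchar>(row, col) = static_cast<uchar>(col);
        }
    }
    cv::imshow("8-bit grayscale ramp", ramp);
    cv::waitKey(0);
    return 0;
}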

So, if we have a grayscale image, such as the following one, then to a digital medium, it is merely a matrix of values, where each element of the matrix is a grayscale value between 0 (black) and 255 (white). This grid of pixel intensity values is shown for a tiny sub-section of the image (a portion of one of the wing mirrors of the car).
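To see this matrix of values for yourself, you can load any image in grayscale mode and print a few of its elements. The following sketch assumes an image file named car.jpg in the working directory (a placeholder; substitute any image on disk):

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>

int main() {
    // "car.jpg" is just a placeholder; point this at any image on disk.
    cv::Mat img = cv::imread("car.jpg", cv::IMREAD_GRAYSCALE);
    if (img.empty()) {
        std::cerr << "Could not load the image\n";
        return 1;
    }
    // Print the top-left 5x5 block: each element is a single 8-bit intensity
    // between 0 (black) and 255 (white).
    for (int row = 0; row < 5 && row < img.rows; ++row) {
        for (int col = 0; col < 5 && col < img.cols; ++col) {
            std::cout << static_cast<int>(img.at<uchar>(row, col)) << ' ';
        }
        std::cout << '\n';
    }
    return 0;
}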

Color channels

We have seen that using 8 bits is sufficient to represent grayscale images in digital media. But how do we represent colors? This brings us to the concept of color channels. A majority of the images that you come across are colored as opposed to grayscale. In the case of the image we just saw, each pixel is associated with a single intensity value (between 0 and 255). For color images, each pixel has three values or components: the red (R), green (G), and blue (B) components. It is a well-known fact that all possible colors can be represented as a combination of the R, G, and B components, and hence, the triplet of intensity values at each pixel is sufficient to represent the entire spectrum of colors in the image. Also, note that each of the three R, G, and B values at every pixel is stored using 8 bits, which makes it 8 x 3 = 24 bits per pixel. This means that the color space now grows from a mere 256 values to more than 16 million colors. This is the reason color images store much more information than their grayscale counterparts.

Conceptually, the color image is not treated as having a triplet of intensity values at each pixel. Rather, a more convenient form of representation is adopted: the image is said to possess three independent color channels, the R, G, and B channels. Now, since we are using 8 bits per pixel per channel, each of the three channels is a grayscale image in itself!
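A minimal sketch of both views follows: reading a single pixel as a triplet (cv::Vec3b) and splitting the image into its three single-channel planes with cv::split. As before, car.jpg is a placeholder file name; note also that OpenCV stores the channels of a loaded color image in B, G, R order rather than R, G, B:

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
#include <vector>

int main() {
    // "car.jpg" is a placeholder; any color image will do.
    cv::Mat color = cv::imread("car.jpg", cv::IMREAD_COLOR);
    if (color.empty()) {
        std::cerr << "Could not load the image\n";
        return 1;
    }

    // View 1: each pixel is a triplet of 8-bit values, stored in B, G, R order.
    cv::Vec3b pixel = color.at<cv::Vec3b>(0, 0);
    std::cout << "Top-left pixel (B, G, R): "
              << static_cast<int>(pixel[0]) << ", "
              << static_cast<int>(pixel[1]) << ", "
              << static_cast<int>(pixel[2]) << '\n';

    // View 2: three independent single-channel planes, each a grayscale
    // image in its own right.
    std::vector<cv::Mat> channels;
    cv::split(color, channels);
    cv::imshow("Blue channel", channels[0]);
    cv::imshow("Green channel", channels[1]);
    cv::imshow("Red channel", channels[2]);
    cv::waitKey(0);
    return 0;
}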