
Learning OpenCV 3 Application Development

By : Samyak Datta

Overview of this book

Computer vision and machine learning concepts are frequently used together in practical, vision-based projects. If you're a novice, this book provides the steps to build and deploy an end-to-end application in the domain of computer vision using OpenCV and C++. At the outset, we explain how to install OpenCV and demonstrate how to run some simple programs. You will start with images (the building blocks of image processing applications) and see how they are stored and processed by OpenCV. You'll get comfortable with OpenCV-specific jargon (Mat, Point, Scalar, and more), and learn how to traverse images and perform basic pixel-wise operations. Building upon this, we introduce slightly more advanced image processing concepts such as filtering, thresholding, and edge detection. In the latter parts, the book touches upon more complex and ubiquitous concepts such as face detection (using Haar cascade classifiers), interest point detection algorithms, and feature descriptors. You will begin to appreciate the true power of the library in how it reduces mathematically non-trivial algorithms to a single line of code! The concluding sections cover OpenCV's machine learning module. You will see not only how OpenCV helps you pre-process and extract features from images that are relevant to the problems you are trying to solve, but also how to use machine learning algorithms that work on these features to make intelligent predictions from visual data!

Exploring the Mat class - declaring Mat objects


We have just witnessed the creation of a Mat object by reading an image from disk. Is loading an existing image the only way to create Mat objects in code? Well, the answer is no. As you might expect, there are several other ways to declare and initialize instances of the Mat class. In the subsequent sections, we will discuss some of these methods in detail. As we move along, we will touch upon the different aspects of digital images that we introduced at the beginning of this chapter. You will see how the concepts of spatial resolution (image dimensions), color spaces (bit depths or data types), and color channels are all elegantly handled by the Mat class.

Let's see a sample line of code that both declares and initializes a Mat object:

Mat M(20, 15, CV_8UC3, Scalar(0,0,255));

Spatial dimensions of an image

The first two arguments define the dimensions of the data matrix, that is, rows and columns, respectively. So the previous example will create a Mat object with a data matrix comprising 20 rows and 15 columns, which means a total of 20 x 15 = 300 elements. Often, you will see Mat declarations where both of these values are combined into a single argument: the Size object. The Size object, more specifically, the Size_ template class, is an OpenCV specific class that allows us to specify sizes for images and rectangles. It has two members: width and height. So, if you are using a Size object to specify the dimensions of a Mat, the height and width correspond to the number of rows and columns, respectively. The same Mat instantiation using a Size object is given as follows:

Mat M(Size(15, 20), CV_8UC3, Scalar(0,0,255)); 

There are a couple of things that are noteworthy regarding the preceding line of code. First, note that the number of rows and columns are in the reverse order with respect to the previous instantiation. This is because the constructor for the Size_ class accepts its arguments in this order: width and height. Second, note that although the class is templatized and named Size_, in the declaration, we simply use Size. This is because OpenCV defines the following aliases:

typedef Size_<int> Size2i; 
typedef Size2i Size; 

This basically means that writing Size is equivalent to saying Size2i, which in turn is the same as Size_<int>.

Color space or color depth

The next argument to the Mat declaration statement discussed earlier is for the type. This parameter defines the type of values that the data matrix of the Mat object would store. The choice of this parameter becomes important because it controls the amount of space needed to store the Mat object in memory. OpenCV has its own types defined. A mapping between the OpenCV types and C++ data types is given in the following table:

Serial No.   OpenCV type   Equivalent C++ type   Range
0            CV_8U         unsigned char         0 to 255
1            CV_8S         char                  -128 to 127
2            CV_16U        unsigned short        0 to 65535
3            CV_16S        short                 -32768 to 32767
4            CV_32S        int                   -2147483648 to 2147483647
5            CV_32F        float                 (floating point)
6            CV_64F        double                (floating point)

The numbers 8, 16, 32, and 64 in the type names represent the number of bits used for storing a value of that data type. U, S, and F stand for unsigned, signed, and float, respectively. Using these two pieces of information, we can easily deduce the range of values for each integer data type, as given in the right-most column of the table.

Color channels

You will notice a C followed by a number in the types used to declare our Mat objects (for example, CV_8UC3). The C here stands for channel, and the integer following it gives you the number of channels in the image. Given a multi-channel color image, OpenCV provides you with a split() function that separates the channels. Keep in mind that OpenCV stores color images in B, G, R order, so the first plane returned by split() is blue. Here is a short code snippet that demonstrates this:

Mat color_image = imread("lena.jpg", IMREAD_COLOR); 
vector<Mat> channels; 
split(color_image, channels); 
 
imshow("Blue", channels[0]); 
imshow("Green", channels[1]); 
imshow("Red", channels[2]); 
waitKey(0);

Image size

By looking at the complete OpenCV type (along with the number of channels) and the Mat object dimensions, we can actually calculate the number of bits that would be required to store all the pixel values in memory. For example, let's say we have a 100 x 100 Mat object of type CV_8UC3. Each pixel value takes 8 bits and there are three such values per pixel (three channels), which comes to 24 bits per pixel. There are 100 x 100 = 10,000 pixels in total, which means a total space of (24 x 10,000) bits = 240,000 bits, or 30,000 bytes (roughly 30 kilobytes). Keep in mind that this is the space used up by the grid of pixel values and does not include the header. The overall size of the Mat object will be higher, but not by a significant amount (the size of the data matrix is substantially larger than the size of a Mat header).

By looking at the range of data types available for declaring Mat objects, it's natural to think about the utility of all the different types. For storing and representing images, only CV_8UC1 and CV_8UC3 make sense, the former for grayscale images and the latter for images in the RGB color space. As stated earlier, in OpenCV, the Mat object is used for much more than an image store. For applications where Mat is best treated as a multidimensional numerical array, the other types make sense. However, irrespective of whether the Mat object serves as an image store or as a data structure, its importance and ubiquity inside the world of OpenCV is undeniable.

Default initialization value

The last argument is the default value for the data matrix of the Mat object. You will have noticed the use of yet another OpenCV-specific data structure: Scalar. The Scalar_ class allows you to store a vector of at most four values. You might be wondering about the utility of restricting the size of a vector to just four. There are several use cases within OpenCV where we need to work with one, two, three, or four values (and not more than that). For example, we have just learnt that each pixel in an RGB image is represented using three values, one each for the R, G, and B channels. In such a scenario, the Scalar object provides a convenient way to pass the group of three values to the Mat constructor, as has been done in the example under consideration.

One important thing to note is that OpenCV stores the color channels in the reverse order: B, G, and R. This means that Scalar(255, 0, 0) refers to blue, whereas Scalar(0, 0, 255) is red. Any combination of the three values then represents one of the 16 million+ colors.

If, at this point, you are wondering about providing default values for a grayscale image, OpenCV allows what is intuitive: a simple Scalar(0) or Scalar(255) will initialize all the pixels of a grayscale image to black or white, respectively. In other words, the constructor for the Scalar object is flexible enough to accept one, two, three, or even four values. As for the discrepancy in the class names, Scalar_ and Scalar, just as with the Size_ class, OpenCV defines the following alias to make our code less verbose:

typedef Scalar_<double> Scalar; 

The initialization method that we discussed here involved passing all three pieces of information as arguments:

  • Dimensions of the image

  • The type of data stored at each pixel location

  • The initial value to be filled in the data matrix

However, the Mat class allows greater flexibility in declaring objects. You do not have to specify all three pieces of information mentioned earlier. The Mat class has some overloaded constructors that allow you to declare objects by specifying any of the following:

  • Nothing at all

  • The dimensions and the type

  • The dimensions, the type, and the initial value

Here are some of the constructor declarations from the implementation of the Mat class:

Mat () 
Mat (int rows, int cols, int type) 
Mat (Size size, int type) 
Mat (int rows, int cols, int type, const Scalar &s) 
Mat (Size size, int type, const Scalar &s) 

Going by the preceding definitions, we present some sample valid Mat object declarations. You will get a chance to see them being used in the programs that we write as part of this book:

Mat I; 
Mat I(100, 80, CV_8UC1); 
Mat I(Size(80, 100), CV_8UC1); 

Before we finish this section on declaring Mat objects, we will discuss one final technique, that is, creating Mat objects as a region of interest (ROI) from inside an existing Mat object. Often, situations arise where we are interested in a subset of the data from the data matrix of an existing Mat object. Putting it another way, we would like to initialize a new Mat object whose data matrix is a submatrix of the existing Mat object. The constructor for such an initialization is given as Mat (const Mat &m, const Rect &roi). A sample statement that invokes such a constructor is given as follows:

Mat roi_image(original_image, Rect(10, 10, 100, 100)); 

This will create a new Mat object named roi_image by taking the data from the matrix belonging to the existing Mat object, original_image. The submatrix will start from the pixel with coordinates (10, 10) as the upper-left corner and will have dimensions of 100 x 100. All the information pertaining to the size of the ROI has been passed via the Rect object, which is yet another OpenCV specific data structure.