Learning OpenCV 3 Application Development

By: Samyak Datta

Overview of this book

Computer vision and machine learning concepts are frequently used together in practical computer vision-based projects. If you’re a novice, this book provides the steps to build and deploy an end-to-end application in the domain of computer vision using OpenCV/C++. At the outset, we explain how to install OpenCV and demonstrate how to run some simple programs. You will start with images (the building blocks of image processing applications), and see how they are stored and processed by OpenCV. You’ll get comfortable with OpenCV-specific jargon (Mat, Point, Scalar, and more), and get to know how to traverse images and perform basic pixel-wise operations. Building upon this, we introduce slightly more advanced image processing concepts such as filtering, thresholding, and edge detection. In the latter parts, the book touches upon more complex and ubiquitous concepts such as face detection (using Haar cascade classifiers), interest point detection algorithms, and feature descriptors. You will begin to appreciate the true power of the library in how it reduces mathematically non-trivial algorithms to a single line of code! The concluding sections touch upon OpenCV’s Machine Learning module. You will witness not only how OpenCV helps you pre-process and extract features from images that are relevant to the problems you are trying to solve, but also how to use machine learning algorithms that work on these features to make intelligent predictions from visual data!

Lookup tables


Consider an image 1000 pixels high and 800 pixels wide. If we are to follow the aforementioned approach of visiting each pixel and performing the transformation T, we will have to perform the computation 1000 × 800 = 800,000 times. This number increases in direct proportion to the size of the image.

At the same time, we also know that the value of each pixel only lies between 0 and 255 (inclusive). What if we could pre-compute and store the transformed values s = T(r) for r ∈ {0, 1, 2, ..., 255}, that is, for all possible values of the input? If we do so, then irrespective of the dimensions (number of pixels) of our input image, we will never need more than 256 computations. Using this strategy, we traverse our matrix and do a simple lookup of the pre-computed values. This is called the lookup table approach (often abbreviated as LUT). Using a LUT affords us yet another benefit with regard to implementing our transformations: the logic/code for image traversal is independent of the logic for the actual computation of the grayscale transformation. This decoupling makes our code more readable, easier to maintain, and easier to scale (we can keep adding transformations to our suite). Let's have a look at an example to elucidate what I'm trying to convey:

#include <opencv2/core.hpp>
#include <vector>

using namespace cv;
using namespace std;

vector<uchar> getLUT() {
  /**
  * This function holds the implementation details of a specific
  * grayscale transformation. As a concrete (and arbitrary) example,
  * we fill it with the image negative, s = 255 - r.
  */
  vector<uchar> lut(256);
  for (int r = 0; r < 256; ++r)
    lut[r] = static_cast<uchar>(255 - r);
  return lut;
}

void processImage(Mat& I) {
  // Fetch the pre-computed table once, then apply it to every pixel.
  vector<uchar> LUT = getLUT();
  for (int i = 0; i < I.rows; ++i) {
    for (int j = 0; j < I.cols; ++j)
      I.at<uchar>(i, j) = LUT[I.at<uchar>(i, j)];
  }
}
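Incidentally, the same traversal can also be written with row pointers obtained from Mat::ptr(), the random-access method we encountered when discussing Mat traversals. A minimal sketch (the name processImageWithPtr is ours, for illustration only):

void processImageWithPtr(Mat& I) {
  vector<uchar> LUT = getLUT();
  for (int i = 0; i < I.rows; ++i) {
    // Raw pointer to the start of row i; valid for 8-bit,
    // single-channel (CV_8UC1) matrices.
    uchar* row = I.ptr<uchar>(i);
    for (int j = 0; j < I.cols; ++j)
      row[j] = LUT[row[j]];
  }
}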

As you can see, we have used a combination of a LUT and random access via Mat::at<uchar>() for matrix traversal to define a framework for implementing grayscale transformations. The getLUT() method returns the lookup table as a C++ vector. The vector is constructed in such a way that the input value r can be used as an index into the LUT, and the value stored at that index is the target value s. This means that if we want to know what value the input intensity 185 is mapped to, we simply look up LUT[185]. Naturally, the LUT will have a size of 256 (so that the indices range from 0 to 255, thereby covering all possible input values). As we traverse the data matrix in the processImage() method, we take the intensity value of each input pixel, query the LUT vector for the desired output value, and assign the new value to the pixel.

If you remember the section where we talked about the internals of Mat, we mentioned that Mat objects are always passed by reference, so the called and the caller functions work with the same underlying data matrix. In the implementation framework presented here, this means the input matrix is modified and overwritten in place. If you want to preserve the original image, you should create a new matrix by cloning (Mat::clone()) and pass the cloned copy to the processImage() function. I guess you might have begun to appreciate the importance of learning about the internal workings of Mat by now!
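To make the cloning point concrete, here is a minimal usage sketch. The file name is a placeholder, and imread() additionally requires the opencv2/imgcodecs.hpp header:

#include <opencv2/imgcodecs.hpp>

int main() {
  // Load the image as 8-bit, single-channel grayscale (CV_8UC1).
  Mat original = imread("input.png", IMREAD_GRAYSCALE);
  // Deep-copy the pixel data so the original survives the in-place edit.
  Mat processed = original.clone();
  processImage(processed);  // only the clone is modified
  return 0;
}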

Let's take a moment to pause and think about what we've accomplished so far and the path that lies ahead. We have learned how to traverse the data matrix of Mat objects using a couple of different approaches. Then, to demonstrate the utility of such traversals, we introduced the concept of grayscale transformations and talked about the design of a framework that allows us to implement such transformation techniques.

Going forward, in the next section, when we discuss these transformations in detail, you will realize that each one of them modifies the image in its own characteristic way. They are meant to act upon certain aspects of the image and bring out the details that they are designed to exploit. That is why these transformations are also referred to as image enhancement techniques. Very soon, we are going to demonstrate the different kinds of enhancements that you can bring about in your images by simply transforming pixel values in accordance with a predefined function. Most of us, at some point, have used a web or mobile-based photo or video editing application. You might recall that such applications have a dedicated section whose purpose is to apply these enhancements to images. In common Internet terminology, these are often referred to as filters (for example, Instagram filters). As we take you through these grayscale transformations, you will realize that, at the most basic level, this is what the fancy filters really are. Of course, designing a full-scale, production-level image filter involves many steps beyond the basic s = T(r), but grayscale transformations act as a good starting point. Without any further ado, let's learn about these transformations while building our own simple (yet cool) set of image filters along the way.
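As a small taste of what lies ahead, here is a sketch of one classic grayscale transformation expressed as a LUT: gamma correction, which maps s = 255 * (r / 255)^gamma. The choice of transformation and the helper name getGammaLUT() are ours, purely for illustration:

#include <cmath>

// Build the lookup table for gamma correction: s = 255 * (r / 255)^gamma.
// gamma < 1 brightens dark regions; gamma > 1 darkens them.
vector<uchar> getGammaLUT(double gamma) {
  vector<uchar> lut(256);
  for (int r = 0; r < 256; ++r)
    lut[r] = static_cast<uchar>(255.0 * std::pow(r / 255.0, gamma) + 0.5);
  return lut;
}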