Learning OpenCV 3 Application Development

By: Samyak Datta

Image enhancement


This section is all about performing some form of computation or processing on each pixel. Since this is the beginning of the book and we are dealing with the basics, we'll keep the computations fairly simple for now; the more complex algorithms are saved for the next chapter. Besides being simple, these computations have another property in common: every pixel undergoes the same transformation, and the transformation function applied at each pixel depends only on the value of that pixel. Put mathematically, such transformation functions can be represented as follows:

s = T(r)

Here, s is the output pixel value and r is the input. The transformation function T, also known as the gray-level or intensity transformation function, can be thought of as a mapping between input and output pixel values. Essentially, the pixel value at position (i, j) in the output image depends only on the pixel value at the same (i, j) position in the input image. This is why the coordinates (i, j) do not appear in the transformation function, only the pixel values s and r. Of course, assuming such a simple pixel-dependency model makes these transformations rather naive: most image processing techniques work with a neighborhood of pixels around (i, j), not with the (i, j) pixel alone. That is precisely what keeps grayscale transformations simple, and it also makes them a good starting point for our journey into image processing.
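
To make this concrete, here is a minimal sketch of applying an arbitrary point transformation T to every pixel by traversing the data matrix. It assumes a single-channel, 8-bit grayscale image (CV_8UC1); the helper name applyPointTransform is our own, not an OpenCV function:

    #include <opencv2/opencv.hpp>

    // A minimal sketch, assuming a single-channel, 8-bit grayscale image
    // (CV_8UC1). The helper name applyPointTransform is our own invention,
    // not an OpenCV function.
    cv::Mat applyPointTransform(const cv::Mat& input, uchar (*T)(uchar)) {
        cv::Mat output = input.clone();
        for (int i = 0; i < output.rows; ++i) {
            for (int j = 0; j < output.cols; ++j) {
                // The output at (i, j) depends only on the input at (i, j).
                output.at<uchar>(i, j) = T(input.at<uchar>(i, j));
            }
        }
        return output;
    }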

Assume that we are dealing with a grayscale image (even in the case of a color image, the R, G, and B channels can be treated separately and independently as three grayscale images). T is applied to each and every pixel in the input image to yield the output. By changing the nature of T, we get different forms of transformations. The transformations that we'll discuss and ultimately implement are listed as follows (sketches of their implementations appear after the list):

  • Linear transformations:

    • Identity

    • Negative

  • Logarithmic transformations:

    • Log

    • Inverse log or exponential
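
To give a flavor of what these look like in code, here are illustrative C++ definitions of the four transformations for 8-bit pixels. This is a sketch under the assumption that gray levels lie in [0, 255]; the function names and the scaling constant c (which stretches the log output to that range) are our own choices:

    #include <cmath>

    // Identity: every pixel maps to itself.
    uchar identity(uchar r) { return r; }

    // Negative: inverts the gray levels, assuming the range [0, 255].
    uchar negativeTransform(uchar r) { return 255 - r; }

    // Log: compresses the dynamic range of bright regions and expands dark
    // ones; c = 255 / ln(256) rescales the output to [0, 255].
    uchar logTransform(uchar r) {
        const double c = 255.0 / std::log(256.0);
        return static_cast<uchar>(c * std::log(1.0 + r));
    }

    // Inverse log (exponential): the mathematical inverse of the log map.
    uchar inverseLogTransform(uchar r) {
        const double c = 255.0 / std::log(256.0);
        return static_cast<uchar>(std::exp(r / c) - 1.0);
    }

Any of these can be plugged into the applyPointTransform helper sketched earlier, for example: cv::Mat negated = applyPointTransform(input, negativeTransform);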

At this point, you can probably see the path laid out in front of you: we traverse the data matrix using the arsenal of techniques developed in the previous section, and we apply the transformation function independently at each pixel to obtain the resultant image. While this approach is perfectly correct, there is still scope for optimization.
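
As a hint of what such an optimization might look like (an assumption on our part about where this is heading, using OpenCV's cv::LUT function; the helper name applyViaLookupTable is our own): an 8-bit pixel can take only 256 distinct values, so T can be evaluated once per gray level and stored in a lookup table, rather than recomputed once per pixel:

    // A sketch of one possible optimization: precompute T for all 256 gray
    // levels, then let cv::LUT apply the table to the whole image at once.
    cv::Mat applyViaLookupTable(const cv::Mat& input, uchar (*T)(uchar)) {
        cv::Mat table(1, 256, CV_8U);
        for (int r = 0; r < 256; ++r) {
            table.at<uchar>(0, r) = T(static_cast<uchar>(r));
        }
        cv::Mat output;
        cv::LUT(input, table, output);  // applies table[pixel] to every pixel
        return output;
    }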