Building Computer Vision Projects with OpenCV 4 and C++

By: David Millán Escrivá, Prateek Joshi, Vinícius G. Mendonça, Roy Shilkrot

Overview of this book

OpenCV is one of the best open source libraries available and can help you focus on constructing complete projects on image processing, motion detection, and image segmentation. This Learning Path is your guide to understanding OpenCV concepts and algorithms through real-world examples and activities. Through various projects, you'll also discover how to use complex computer vision and machine learning algorithms and face detection to extract the maximum amount of information from images and videos. In later chapters, you'll learn to enhance your videos and images with optical flow analysis and background subtraction. Sections in the Learning Path will help you get to grips with text segmentation and recognition, in addition to guiding you through the basics of the new and improved deep learning modules. By the end of this Learning Path, you will have mastered commonly used computer vision techniques to build OpenCV projects from scratch.

This Learning Path includes content from the following Packt books:

  • Mastering OpenCV 4 - Third Edition by Roy Shilkrot and David Millán Escrivá
  • Learn OpenCV 4 By Building Projects - Second Edition by David Millán Escrivá, Vinícius G. Mendonça, and Prateek Joshi

Understanding the human visual system


Before we jump into OpenCV's functionality, we need to understand why those functions were built in the first place. It's important to understand how the human visual system works so that you can develop the right algorithms.

The goal of computer vision algorithms is to understand the content of images and videos. Humans seem to do it effortlessly! So, how do we get machines to do it with the same accuracy?

Let's consider the following diagram:

The human eye captures all the information that reaches it, such as color, shape, and brightness. In the preceding image, the human eye captures all the information about the two main objects and stores it in a certain way. Once we understand how our visual system works, we can take advantage of it to achieve what we want.

For example, here are a few things we need to know:

  • Our visual system is more sensitive to low-frequency content than to high-frequency content. Low-frequency content refers to planar regions where pixel values don't change rapidly, and high-frequency content refers to regions with corners and edges where pixel values fluctuate a lot. We can easily see blotches on a planar surface, but it's difficult to spot them on a highly textured surface.

  • The human eye is more sensitive to changes in brightness than to changes in color.

  • Our visual system is sensitive to motion. We can quickly recognize that something is moving in our field of vision, even when we are not looking directly at it.

  • We tend to make a mental note of salient points in our field of vision. Let's say you look at a white table with four black legs and a red dot at one of the corners of the table surface. When you look at this table, you'll immediately make a mental note that the surface and legs have opposing colors and that there is a red dot on one of the corners. Our brain is really smart that way! We do this automatically so that we can immediately recognize an object if we encounter it again.
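
To make the first three points a little more concrete, here is a minimal sketch in plain C++. It is an illustration under stated assumptions, not code from the book and not part of OpenCV: the function names `meanHorizontalVariation`, `luma`, and `changedFraction` are hypothetical, images are assumed to be row-major 8-bit grayscale buffers, and the brightness weights are the common ITU-R BT.601 coefficients. In a real project you would more likely work with `cv::Mat` and OpenCV functions such as `cv::cvtColor` and `cv::absdiff`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: mean absolute difference between horizontally
// adjacent pixels of a row-major grayscale image. Values near 0 suggest a
// flat (low-frequency) region; large values suggest edges or texture
// (high-frequency content).
double meanHorizontalVariation(const std::vector<std::uint8_t>& pixels,
                               int width, int height) {
    assert(width > 1 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    long long total = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x + 1 < width; ++x) {
            std::size_t i = static_cast<std::size_t>(y) * width + x;
            total += std::abs(static_cast<int>(pixels[i]) -
                              static_cast<int>(pixels[i + 1]));
        }
    }
    return static_cast<double>(total) /
           (static_cast<double>(width - 1) * height);
}

// Hypothetical helper: approximate perceived brightness of an RGB pixel
// using BT.601 weights (an assumption; other weightings exist).
double luma(int r, int g, int b) {
    return 0.299 * r + 0.587 * g + 0.114 * b;
}

// Hypothetical helper: fraction of pixels whose value changed by more than
// `threshold` between two frames of equal size. This is a crude stand-in
// for motion detection by frame differencing.
double changedFraction(const std::vector<std::uint8_t>& previous,
                       const std::vector<std::uint8_t>& current,
                       int threshold) {
    assert(!previous.empty() && previous.size() == current.size());
    std::size_t changed = 0;
    for (std::size_t i = 0; i < previous.size(); ++i) {
        int diff = std::abs(static_cast<int>(previous[i]) -
                            static_cast<int>(current[i]));
        if (diff > threshold) {
            ++changed;
        }
    }
    return static_cast<double>(changed) /
           static_cast<double>(previous.size());
}
```

For instance, a flat 4×2 patch in which every pixel is 100 gives a variation of 0, while rows alternating between 0 and 255 give 255. The `luma` weights give green the largest contribution to brightness and blue the smallest, and `changedFraction` reports 0.25 when one pixel out of four changes by more than the threshold.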

To get an idea of our field of view, let's look at the top view of a human, and the angles at which we see various things:

Our visual system is actually capable of a lot more, but this should be good enough to get us started. You can explore further by reading up on Human Visual System (HVS) models on the web.