Learning OpenCV 4 Computer Vision with Python 3 - Third Edition

By: Joseph Howse, Joe Minichino

Overview of this book

Computer vision is a rapidly evolving science, encompassing diverse applications and techniques. This book will help not only those who are getting started with computer vision but also experts in the domain. You’ll be able to put theory into practice by building apps with OpenCV 4 and Python 3. You’ll start by understanding OpenCV 4 and how to set it up with Python 3 on various platforms. Next, you’ll learn how to perform basic operations such as reading, writing, manipulating, and displaying still images, videos, and camera feeds. The book then takes you through image processing, video analysis, and depth estimation and segmentation, and gives you hands-on practice in building a GUI app. After that, you’ll tackle two popular challenges: face detection and face recognition. You’ll also learn about object classification and machine learning concepts, which will enable you to create and use object detectors and classifiers, and even track objects in movies or video camera feeds. Later, you’ll develop your skills in 3D tracking and augmented reality. Finally, you’ll cover ANNs and DNNs, learning how to develop apps for recognizing handwritten digits and classifying a person’s gender and age. By the end of this book, you’ll have the skills you need to execute real-world computer vision projects.
Table of Contents (13 chapters)

Project Cameo (face tracking and image manipulation)

OpenCV is often studied through a cookbook approach that covers a lot of algorithms, but nothing about high-level application development. To an extent, this approach is understandable because OpenCV's potential applications are so diverse. OpenCV is used in a wide variety of applications, such as photo/video editors, motion-controlled games, a robot's AI, or psychology experiments where we log participants' eye movements. Across these varied use cases, can we truly study a useful set of abstractions?

The book's authors believe we can, and the sooner we start creating abstractions, the better. We will structure many of our OpenCV examples around a single application, but, at each step, we will design a component of this application to be extensible and reusable.

We will develop an interactive application that performs face tracking and image manipulations on camera input in real time. This type of application covers a broad range of OpenCV's functionality and challenges us to create an efficient, effective implementation.

Specifically, our application will merge faces in real time. Given two streams of camera input (or, optionally, prerecorded video input), the application will superimpose faces from one stream atop faces in the other. Filters and distortions will be applied to give this blended scene a unified look and feel. Users should have the experience of being engaged in a live performance where they enter another environment and persona. This type of user experience is popular in amusement parks such as Disneyland.
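The core step described above, copying a face from one stream over a face in the other, can be illustrated with a minimal sketch. This is not the book's implementation: the function name superimpose_region, the (x, y, w, h) rectangle layout, and the toy frames are assumptions made for illustration, and the rectangles are hard-coded rather than produced by a face detector. The sketch uses plain nested lists so that it runs without any libraries; the same row-and-column indexing should also work on the NumPy arrays that OpenCV returns as frames.

```python
def superimpose_region(src, src_rect, dst, dst_rect):
    """Copy the src_rect region of src over the dst_rect region of dst.

    Rectangles are (x, y, w, h) tuples (an assumed layout). Images are
    indexed as image[row][column]. The source region is rescaled to the
    destination size by nearest-neighbor sampling, so the two rectangles
    may differ in size. dst is modified in place and returned.
    """
    sx, sy, sw, sh = src_rect
    dx, dy, dw, dh = dst_rect
    for i in range(dh):
        # Map each destination row to the nearest source row.
        src_row = src[sy + i * sh // dh]
        dst_row = dst[dy + i]
        for j in range(dw):
            # Map each destination column to the nearest source column.
            dst_row[dx + j] = src_row[sx + j * sw // dw]
    return dst


# Toy 4x4 grayscale "frames": stream A holds a bright 2x2 "face",
# stream B is uniformly dark.
frame_a = [[0, 0, 0, 0],
           [0, 9, 8, 0],
           [0, 7, 6, 0],
           [0, 0, 0, 0]]
frame_b = [[1] * 4 for _ in range(4)]

# Hypothetical face rectangles, standing in for a detector's output.
face_in_a = (1, 1, 2, 2)
face_in_b = (0, 0, 2, 2)

superimpose_region(frame_a, face_in_a, frame_b, face_in_b)
# frame_b is now:
# [[9, 8, 1, 1],
#  [7, 6, 1, 1],
#  [1, 1, 1, 1],
#  [1, 1, 1, 1]]
```

A hard rectangular copy like this leaves visible seams, which is one reason the blended scene needs the filters and distortions mentioned above to look unified.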

In such an application, users would immediately notice flaws, such as a low frame rate or inaccurate tracking. To get the best results, we will try several approaches using conventional imaging and depth imaging.
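Since a low frame rate is one of the flaws users would notice, it helps to be able to measure it. The following is a minimal sketch of one way to estimate frames per second, assuming we call a method once for every frame we process. The class name FrameRateEstimator and its methods are hypothetical, chosen only for this illustration. The clock is injectable so that the example can be run with fixed timestamps instead of a real camera.

```python
import time


class FrameRateEstimator:
    """Estimate frames per second from the times at which frames arrive."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock      # callable returning the time in seconds
        self._first = None       # time of the first frame
        self._last = None        # time of the most recent frame
        self._intervals = 0      # number of frame-to-frame intervals seen

    def tick(self):
        """Record the arrival of one frame."""
        now = self._clock()
        if self._first is None:
            self._first = now
        else:
            self._intervals += 1
        self._last = now

    def fps(self):
        """Return the average frame rate so far, or None if unknown."""
        if self._intervals == 0:
            return None
        elapsed = self._last - self._first
        return self._intervals / elapsed if elapsed > 0 else None


# Simulate five frames arriving half a second apart.
fake_times = iter([0.0, 0.5, 1.0, 1.5, 2.0])
estimator = FrameRateEstimator(clock=lambda: next(fake_times))
for _ in range(5):
    estimator.tick()

print(estimator.fps())  # 4 intervals over 2.0 seconds -> 2.0
```

In a real capture loop, tick() would be called once per frame with the default clock, and the estimate could be logged or displayed to compare the approaches we try.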

We will call our application Cameo. A cameo (in jewelry) is a small portrait of a person or (in film) a very brief role played by a celebrity.