Learning OpenCV 5 Computer Vision with Python, Fourth Edition

By : Joseph Howse, Joe Minichino

Overview of this book

Computer vision is a rapidly evolving science in the field of artificial intelligence, encompassing diverse use cases and techniques. This book will help not only those who are getting started with computer vision but also experts in the domain. You'll put theory into practice by building apps with OpenCV 5 and Python 3. You'll start by setting up OpenCV 5 with Python 3 on various platforms. Next, you'll learn how to perform basic operations such as reading, writing, manipulating, and displaying images, videos, and camera feeds. The book then takes you through image processing, video analysis, depth estimation, and segmentation, and gives you hands-on practice in building a GUI app. You'll tackle two popular challenges: face detection and face recognition. You'll also learn about object classification and machine learning, which will enable you to create and use object detectors and even track moving objects in real time. Later, you'll develop your skills in augmented reality and real-world 3D navigation. Finally, you'll cover ANNs and DNNs, learning how to develop apps for recognizing handwritten digits and classifying a person's gender and age, and you'll deploy your solutions to the cloud. By the end of this book, you'll have the skills you need to execute real-world computer vision projects.

Detecting and classifying objects with third-party DNNs

For this demo, we are going to capture frames from a webcam in real time and use a DNN to detect and classify 20 kinds of objects that may be in any given frame. Yes, a single DNN can do all this in real time on a typical laptop that a programmer might use!

Before delving into the code, let's introduce the DNN that we will use. It is a Caffe version of a model called MobileNet-SSD, which uses a hybrid of a framework from Google called MobileNet and another framework called Single Shot Detector (SSD) MultiBox. The latter framework has a GitHub repository at https://github.com/weiliu89/caffe/tree/ssd/. The training technique for the Caffe version of MobileNet-SSD is provided by a project on GitHub at https://github.com/chuanqi305/MobileNet-SSD/. Copies of the following MobileNet-SSD files can be found in this book's repository, in the chapter10/objects_data folder:

    ...
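
Before the full demo, here is a minimal sketch of how a Caffe SSD model could be run on one frame with OpenCV's dnn module and how its raw output might be filtered. It is not this book's implementation. The function names, the file path parameters, and the 0.5 confidence threshold are hypothetical. The sketch also assumes the preprocessing commonly used with MobileNet-SSD (a 300x300 input, a mean of 127.5, and a scale factor of 1/127.5) and the usual SSD output layout, in which each detection row holds an image ID, a class ID, a confidence score, and a normalized bounding box.

```python
def parse_ssd_detections(detections, frame_width, frame_height,
                         min_confidence=0.5):
    """Convert raw SSD output into (class_id, confidence, box) tuples.

    Assumption: `detections` has shape (1, 1, N, 7) and each row is
    [image_id, class_id, confidence, x0, y0, x1, y1], with the box
    coordinates normalized to the range [0, 1].
    """
    results = []
    for row in detections[0][0]:
        confidence = float(row[2])
        if confidence < min_confidence:
            continue  # Skip weak detections.
        class_id = int(row[1])
        # Scale the normalized box to pixel coordinates.
        x0 = int(round(float(row[3]) * frame_width))
        y0 = int(round(float(row[4]) * frame_height))
        x1 = int(round(float(row[5]) * frame_width))
        y1 = int(round(float(row[6]) * frame_height))
        results.append((class_id, confidence, (x0, y0, x1, y1)))
    return results


def detect_objects(frame, prototxt_path, caffemodel_path,
                   min_confidence=0.5):
    """Run a Caffe SSD model on one BGR frame (hypothetical helper).

    The two paths are placeholders for the model's network definition
    and weights. A real-time loop would load the network once, not on
    every frame as this sketch does.
    """
    import cv2  # Imported here so the parser above works without OpenCV.

    net = cv2.dnn.readNetFromCaffe(prototxt_path, caffemodel_path)
    # Assumed preprocessing: resize to 300x300, subtract the mean,
    # then multiply by the scale factor.
    blob = cv2.dnn.blobFromImage(
        frame, scalefactor=1.0 / 127.5, size=(300, 300),
        mean=(127.5, 127.5, 127.5))
    net.setInput(blob)
    detections = net.forward()
    frame_height, frame_width = frame.shape[:2]
    return parse_ssd_detections(
        detections, frame_width, frame_height, min_confidence)
```

The parsing step is kept separate from the inference step so that it can be checked with hand-made data, without a model file or a camera. For example, calling `parse_ssd_detections` on a nested list that contains one row with a confidence of 0.9 and one row with a confidence of 0.3 returns only the first detection when the threshold is 0.5.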