Practical Computer Vision

By: Abhinav Dadhich

Overview of this book

In this book, you will find several recently proposed methods in various domains of computer vision. You will start by setting up the proper Python environment to work on practical applications. This includes setting up libraries such as OpenCV, TensorFlow, and Keras using Anaconda. Using these libraries, you'll start to understand the concepts of image transformation and filtering. You will find a detailed explanation of feature detectors such as FAST and ORB; you'll use them to find similar-looking objects. With an introduction to convolutional neural networks, you will learn how to build a deep neural network using Keras and how to use it to classify the Fashion-MNIST dataset. With regard to object detection, you will learn the implementation of a simple face detector as well as the workings of complex deep-learning-based object detectors such as Faster R-CNN and SSD using TensorFlow. You'll get started with semantic segmentation using FCN models and track objects with Deep SORT. You will also use Visual SLAM techniques such as ORB-SLAM on a standard dataset. By the end of this book, you will have a firm understanding of the different computer vision techniques and how to apply them in your applications.
Table of Contents (12 chapters)

What this book covers

Chapter 1, A Fast Introduction to Computer Vision, gives a brief overview of what constitutes computer vision, its applications in different fields, and the different types of problems it is divided into. The chapter also covers basic image reading with code in OpenCV. There is also an overview of different color spaces and their visualizations.
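
As a rough illustration of what a color space conversion involves, here is a minimal sketch that uses only Python's standard `colorsys` module rather than the book's OpenCV code. The helper names `rgb8_to_hsv` and `rgb8_to_gray` are hypothetical, and the grayscale weights are the commonly used ITU-R BT.601 luma coefficients, chosen here as an assumption.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB (0-255) to HSV with H in degrees and S, V in [0, 1]."""
    # colorsys works on floats in [0, 1], so scale the 8-bit values first.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


def rgb8_to_gray(r, g, b):
    """Approximate grayscale intensity with BT.601 luma weights (an assumption)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    for name, pixel in [("red", (255, 0, 0)), ("blue", (0, 0, 255))]:
        print(name, rgb8_to_hsv(*pixel), rgb8_to_gray(*pixel))
```

Note that OpenCV loads images in BGR channel order, so pixel values read with it would need reordering before being passed to helpers like these.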

Chapter 2, Libraries, Development Platforms, and Datasets, provides detailed instructions on how to set up a development environment and install libraries inside it. The datasets introduced in this chapter include both those used in this book and currently popular datasets for each sub-domain of computer vision. The chapter includes download links for the datasets and wrappers for loading them with libraries such as Keras.

Chapter 3, Image Filtering and Transformations in OpenCV, explains different filtering techniques, including linear and nonlinear filters, and their implementation in OpenCV. This chapter also includes techniques for transforming an image, such as linear translation, rotation around a given axis, and complete affine transformation. The techniques introduced in the chapter help in creating applications across several domains and enhancing image quality.
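
To make the idea of translation, rotation, and general affine transformation concrete, the following is a minimal plain-Python sketch, not the book's OpenCV implementation. Each transform is written as a 2x3 matrix that maps a point (x, y) to a new position; the function names are hypothetical and chosen only for illustration.

```python
import math


def translation_matrix(tx, ty):
    """2x3 affine matrix that shifts points by (tx, ty)."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty]]


def rotation_matrix(angle_deg, cx=0.0, cy=0.0):
    """2x3 affine matrix that rotates points about the center (cx, cy).

    The rotation is counter-clockwise in a y-up coordinate system; in image
    coordinates, where y points down, the same matrix looks clockwise.
    """
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # The last column keeps the center (cx, cy) fixed under the rotation.
    return [[c, -s, cx - c * cx + s * cy],
            [s, c, cy - s * cx - c * cy]]


def apply_affine(m, x, y):
    """Apply a 2x3 affine matrix to a single point."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


if __name__ == "__main__":
    print(apply_affine(translation_matrix(10, 5), 1, 2))
    print(apply_affine(rotation_matrix(90), 1, 0))
```

Warping a whole image amounts to applying such a matrix to every pixel coordinate and interpolating, which is the part a library such as OpenCV handles.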

Chapter 4, What is a Feature?, introduces features and their importance in various computer vision applications. The chapter covers the Harris corner detector as a basic feature, the FAST feature detector, and ORB features, which are both robust and fast. There are also demonstrations in OpenCV of applications that use these features. The applications include matching a template to the original image and matching two images of the same object. There is also a discussion of the black box feature and its necessity.

Chapter 5, Convolutional Neural Networks, begins with an introduction to simple neural networks and their components. The chapter also introduces convolutional neural networks in Keras, with components such as activation, pooling, and fully connected layers. The effect of parameter changes on each component is explained with results that the reader can easily reproduce. This understanding is further strengthened by implementing a simple CNN model on an image dataset. Along with popular CNN architectures such as VGG, Inception, and ResNet, there is an introduction to transfer learning. This leads to a look at state-of-the-art deep learning models for image classification.
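
To give a feel for the components named above, here is a minimal pure-Python sketch of a convolution, a ReLU activation, and max pooling applied to nested lists. It is not Keras code and not the book's implementation; the function names are hypothetical, and the "convolution" is the unflipped sliding-window product (cross-correlation) that deep learning frameworks typically compute.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Take the maximum over non-overlapping size x size windows."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in cols]
            for i in rows]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    kernel = [[-1, 0],
              [0, 1]]  # responds to intensity increasing along the diagonal
    print(max_pool(relu(conv2d_valid(image, kernel))))
```

In a real network the kernel values are learned during training rather than written by hand, and a fully connected layer would follow the pooled feature maps.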

Chapter 6, Feature-Based Object Detection, develops an understanding of the image recognition problem. Detection algorithms, such as face detectors, are explained with OpenCV. You will also see some recent and popular deep learning-based object detection algorithms such as Faster R-CNN, SSD, and others. The effectiveness of each of these is demonstrated on custom images using the TensorFlow Object Detection API.

Chapter 7, Segmentation and Tracking, consists of two parts. The first introduces the image instance recognition problem, with an implementation of a deep learning model for segmentation. The second part begins with an introduction to the MOSSE tracker from OpenCV, which is both efficient and fast. It then introduces deep learning-based tracking of multiple objects.

Chapter 8, 3D Computer Vision, describes analyzing images from a geometrical point of view. Readers will first understand the challenges in computing depth from a single image, and later learn how to solve them using multiple images. The chapter also describes the way to track a camera pose for moving cameras using visual odometry. Lastly, the SLAM problem is introduced, with solutions presented using the visual SLAM technique, which uses only camera images as input.

Appendix A, Mathematics for Computer Vision, introduces the basic concepts required to understand computer vision algorithms. The matrix and vector operations introduced here are accompanied by Python implementations. The appendix also contains an introduction to probability theory with explanations of various distributions.

Appendix B, Machine Learning for Computer Vision, gives an overview of machine learning modeling and the key terms involved. Readers will also learn about the curse of dimensionality and the various preprocessing and postprocessing steps involved. There are also explanations of several evaluation tools and methods for machine learning models, which are used extensively in vision applications.