Mastering Computer Vision with TensorFlow 2.x

By: Krishnendu Kar

3.8 (4)

Overview of this book

Computer vision allows machines to gain a human-level understanding of images and videos so that they can visualize, process, and analyze them. This book focuses on using TensorFlow to help you learn advanced computer vision tasks such as image acquisition, processing, and analysis. You'll start with the key principles of computer vision and deep learning to build a solid foundation, before covering neural network architectures and understanding how they work rather than treating them as black boxes. Next, you'll explore architectures such as VGG, ResNet, Inception, R-CNN, SSD, YOLO, and MobileNet. As you advance, you'll learn to build visual search methods using transfer learning. You'll also cover advanced computer vision concepts such as semantic segmentation, image inpainting with GANs, object tracking, video segmentation, and action recognition. Later, the book focuses on how machine learning and deep learning concepts can be used to perform tasks such as edge detection and face recognition. You'll then discover how to develop powerful neural network models on your PC and on various cloud platforms. Finally, you'll learn to apply model optimization methods to deploy models on edge devices for real-time inference. By the end of this book, you'll have a solid understanding of computer vision and be able to confidently develop models to automate tasks.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Section 1: Introduction to Computer Vision and Neural Networks

Computer Vision and TensorFlow Fundamentals

Technical requirements

Detecting edges using image hashing and filtering

Extracting features from an image

Object detection using Contours and the HOG detector

An overview of TensorFlow, its ecosystem, and installation

Summary

Content Recognition Using Local Binary Patterns

Processing images using LBP

Applying LBP to texture recognition

Matching face color with foundation color – LBP and its limitations

Matching face color with foundation color – color matching technique

Summary

Facial Detection Using OpenCV and CNN

Applying Viola-Jones AdaBoost learning and the Haar cascade classifier for face recognition

Predicting facial key points using a deep neural network

Predicting facial expressions using a CNN

Overview of 3D face detection

Summary

Deep Learning on Images

Understanding CNNs and their parameters

Optimizing CNN parameters

Visualizing the layers of a neural network

Summary

Section 2: Advanced Concepts of Computer Vision with TensorFlow

Neural Network Architecture and Models

Overview of AlexNet

Overview of VGG16

Overview of Inception

Overview of ResNet

Overview of R-CNN

Overview of Fast R-CNN

Overview of Faster R-CNN

Overview of GANs

Overview of GNNs

Overview of Reinforcement Learning

Overview of Transfer Learning

Summary

Visual Search Using Transfer Learning

Coding deep learning models using TensorFlow

Developing a transfer learning model using TensorFlow

Understanding the architecture and applications of visual search

Working with a visual search input pipeline using tf.data

Summary

Object Detection Using YOLO

An overview of YOLO

An introduction to Darknet for object detection

Real-time prediction using Darknet

YOLO versus YOLO v2 versus YOLO v3

When to train a model?

Training your own image set with YOLO v3 to develop a custom model

An overview of the Feature Pyramid Network and RetinaNet

Summary

Semantic Segmentation and Neural Style Transfer

Overview of TensorFlow DeepLab for semantic segmentation

Artificial image generation using DCGANs

Image inpainting using OpenCV

Understanding neural style transfer

Summary

Section 3: Advanced Implementation of Computer Vision with TensorFlow

Action Recognition Using Multitask Deep Learning

Human pose estimation – OpenPose

Human pose estimation – stacked hourglass model

Human pose estimation – PoseNet

Action recognition using various methods

Summary

Object Detection Using R-CNN, SSD, and R-FCN

An overview of SSD

An overview of R-FCN

An overview of the TensorFlow object detection API

Detecting objects using TensorFlow on Google Cloud

Detecting objects using TensorFlow Hub

Training a custom object detector using TensorFlow and Google Colab

An overview of Mask R-CNN and a Google Colab demonstration

Developing an object tracker model to complement the object detector

Summary

Section 4: TensorFlow Implementation at the Edge and on the Cloud

Deep Learning on Edge Devices with CPU/GPU Optimization

Overview of deep learning on edge devices

Techniques used for GPU/CPU optimization

Overview of MobileNet

Image processing with a Raspberry Pi

Model conversion and inference using OpenVINO

Converting a TensorFlow model developed using the TensorFlow Object Detection API

Application of TensorFlow Lite

Object detection on Android phones using TensorFlow Lite

Object detection on Raspberry Pi using TensorFlow Lite

Object detection on iPhone using TensorFlow Lite and Create ML

A summary of various annotation methods

Summary

Cloud Computing Platform for Computer Vision

Training an object detector in GCP

Training an object detector in the AWS SageMaker cloud platform

Training an object detector in the Microsoft Azure cloud platform

Training at scale and packaging

The general idea behind cloud-based visual search

Analyzing images and search mechanisms in various cloud platforms

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Summary

In this chapter, we learned the building blocks of the YOLO object detection method and how it detects objects quickly and accurately compared to other object detection methods. We covered the evolution of YOLO, from the original version through YOLO v2 to YOLO v3, and the differences between them. We used YOLO to detect objects, such as traffic signs, in image and video files.
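One of the building blocks mentioned above is the way a grid-based detector turns raw network outputs into a bounding box. The following is a minimal sketch of the box decoding used by YOLO v2 and YOLO v3, not code taken from the book. The function name `decode_box`, the 13 x 13 grid, and the anchor size are illustrative assumptions, and the anchor is expressed here as a fraction of the image size for simplicity.

```python
import math


def sigmoid(x):
    """Squash a raw value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor, grid_size):
    """Turn raw outputs (tx, ty, tw, th) for one grid cell into a box.

    Returns (center x, center y, width, height) as fractions of the image.
    Hypothetical helper for illustration only.
    """
    tx, ty, tw, th = t
    cx, cy = cell      # column and row index of the responsible grid cell
    pw, ph = anchor    # anchor (prior) width and height, as image fractions

    # The sigmoid keeps the predicted center inside its own grid cell.
    bx = (sigmoid(tx) + cx) / grid_size
    by = (sigmoid(ty) + cy) / grid_size

    # Width and height rescale the anchor by an exponential factor.
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# All-zero outputs place the box at the center of cell (6, 6) on a
# 13 x 13 grid, with exactly the anchor's width and height.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(0.2, 0.3), grid_size=13)
```

Because every cell decodes its boxes in a single forward pass, there is no separate region proposal stage, which is one reason this family of detectors runs quickly.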

We learned how to debug YOLO v3 so that it generates correct outputs without crashing. We saw how to use a pretrained YOLO model for inference, walked through the detailed process of training a new YOLO model on our own custom images, and learned how to tune CNN parameters to generate correct results. This chapter also introduced RetinaNet and how it uses a feature pyramid to detect objects at different scales.
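The feature pyramid idea can be illustrated with a small sketch of its top-down pathway, in which each coarse feature map is upsampled and added to the next finer one. This is a simplified illustration under stated assumptions, not RetinaNet itself: the maps are single-channel nested lists, and the 1 x 1 lateral convolutions and 3 x 3 smoothing convolutions of a real feature pyramid network are omitted. The names `upsample2x`, `merge`, and `top_down_pyramid` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbor upsampling: repeat every value twice along both axes."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def merge(coarse, fine):
    """Top-down step: upsample the coarse map and add it to the finer map."""
    up = upsample2x(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


def top_down_pyramid(levels):
    """Merge maps ordered fine -> coarse; returns merged maps in the same order."""
    merged = [levels[-1]]  # the coarsest level is passed through unchanged
    for fine in reversed(levels[:-1]):
        merged.insert(0, merge(merged[0], fine))
    return merged


# Three toy backbone maps: each level halves the spatial resolution.
c3 = [[1.0] * 4 for _ in range(4)]
c4 = [[2.0] * 2 for _ in range(2)]
c5 = [[3.0]]
p3, p4, p5 = top_down_pyramid([c3, c4, c5])
```

Each merged level combines coarse, semantically strong information with finer spatial detail, so small objects can be predicted from the high-resolution levels and large objects from the low-resolution ones.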

In the next chapter, we will be learning...
