Mastering Computer Vision with TensorFlow 2.x

By: Krishnendu Kar

Overview of this book

Computer vision allows machines to gain human-level understanding to visualize, process, and analyze images and videos. This book focuses on using TensorFlow to help you learn advanced computer vision tasks such as image acquisition, processing, and analysis. You'll start with the key principles of computer vision and deep learning to build a solid foundation, before covering neural network architectures and understanding how they work rather than using them as a black box. Next, you'll explore architectures such as VGG, ResNet, Inception, R-CNN, SSD, YOLO, and MobileNet. As you advance, you'll learn to apply visual search methods using transfer learning. You'll also cover advanced computer vision concepts such as semantic segmentation, image inpainting with GANs, object tracking, video segmentation, and action recognition. Later, the book focuses on how machine learning and deep learning concepts can be used to perform tasks such as edge detection and face recognition. You'll then discover how to develop powerful neural network models on your PC and on various cloud platforms. Finally, you'll learn to apply model optimization methods to deploy models on edge devices for real-time inference. By the end of this book, you'll have a solid understanding of computer vision and be able to confidently develop models to automate tasks.

YOLO versus YOLO v2 versus YOLO v3

A comparison of the three YOLO versions is shown in this table:

Feature	YOLO	YOLO v2	YOLO v3
Input size	224 x 224	448 x 448
Framework	Darknet, trained on ImageNet (1,000 classes).	Darknet-19: 19 convolutional layers and 5 max-pooling layers.	Darknet-53: 53 convolutional layers. For detection, 53 more layers are added, giving a total of 106 layers.
Small object detection	Cannot detect small objects.	Better than YOLO at detecting small objects.	Better than YOLO v2 at detecting small objects.
Key technique		Uses anchor boxes.	Uses residual blocks.
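
To make the anchor box entry more concrete, here is a minimal sketch of how a YOLO v2 style prediction is usually decoded into a bounding box. It assumes the standard formulation, in which the network outputs offsets (tx, ty, tw, th) relative to a grid cell and an anchor (prior) box. The function name `decode_box`, the 13 x 13 grid, and the anchor size are illustrative assumptions, not values taken from the table.

```python
import math


def sigmoid(x):
    """Squash a raw network output into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(raw, cell, anchor, grid_size):
    """Decode one anchor-based prediction into a normalized box.

    raw       -- (tx, ty, tw, th) raw network outputs
    cell      -- (cx, cy) column and row of the grid cell
    anchor    -- (pw, ph) anchor width and height in grid-cell units (assumed)
    grid_size -- number of cells along one side of the grid (assumed 13 here)

    Returns (bx, by, bw, bh): the box center and size as fractions of the image.
    """
    tx, ty, tw, th = raw
    cx, cy = cell
    pw, ph = anchor
    # The sigmoid keeps the predicted center inside its own grid cell.
    bx = (sigmoid(tx) + cx) / grid_size
    by = (sigmoid(ty) + cy) / grid_size
    # Width and height rescale the anchor box exponentially.
    bw = pw * math.exp(tw) / grid_size
    bh = ph * math.exp(th) / grid_size
    return bx, by, bw, bh


# All-zero outputs place the box at the center of cell (6, 6)
# and leave the anchor size unchanged.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(3.0, 2.0), grid_size=13)
print(box)
```

Because the network only predicts small corrections to a fixed set of anchor shapes, it does not have to regress box sizes from scratch, which is one commonly cited reason anchor boxes make training more stable.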

The following diagram compares the architectures of YOLO v2 and YOLO v3:

The basic convolutional layers are similar, but YOLO v3 carries out detection at three separate layers: 82, 94, and 106.
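
The effect of detecting at three layers can be sketched with a little arithmetic. The snippet below assumes the commonly used YOLO v3 configuration: a 416 x 416 input, 80 classes (as in COCO), 3 anchor boxes per scale, and strides of 32, 16, and 8 at layers 82, 94, and 106. These values and the function name `yolo_v3_output_shapes` are assumptions for illustration; adjust them to match your own configuration.

```python
def yolo_v3_output_shapes(input_size=416, num_classes=80, anchors_per_scale=3):
    """Return the output tensor shape at each YOLO v3 detection layer.

    The mapping of detection layer to stride (82 -> 32, 94 -> 16, 106 -> 8)
    is assumed from the standard YOLO v3 configuration.
    """
    layer_strides = {82: 32, 94: 16, 106: 8}
    # Each anchor predicts 4 box values, 1 objectness score, and the class scores.
    depth = anchors_per_scale * (5 + num_classes)
    shapes = {}
    for layer, stride in layer_strides.items():
        grid = input_size // stride
        shapes[layer] = (grid, grid, depth)
    return shapes


def total_boxes(shapes, anchors_per_scale=3):
    """Count the candidate boxes predicted across all detection layers."""
    return sum(h * w * anchors_per_scale for h, w, _ in shapes.values())


shapes = yolo_v3_output_shapes()
for layer, shape in shapes.items():
    print(f"layer {layer}: output shape {shape}")
print("total candidate boxes:", total_boxes(shapes))
```

Under these assumptions, the grid gets finer from layer 82 to layer 106, so the last detection layer has the most cells per image and is the one best suited to small objects.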

The most critical item that you should take from YOLO v3 is its object detection...