Deep Learning for Computer Vision

By: Rajalingappaa Shanmugamani

Overview of this book

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has numerous applications in areas such as robotics and automation. This book shows you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning. You will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

Deep learning for computer vision

Computer vision brings the capabilities of human vision to a computer. The computer may take the form of a smartphone, drone, CCTV camera, MRI scanner, and so on, equipped with various sensors for perception. A sensor produces images in digital form that the computer has to interpret. The basic building block of such interpretation or intelligence is explained in the next section. The different problems that arise in computer vision can be effectively solved using deep learning techniques.

Classification

Image classification is the task of labelling the whole image with an object or concept, along with a confidence score. Applications include gender classification from an image of a person's face, identifying the type of pet, tagging photos, and so on. The following is an output of such a classification task:

Chapter 2, Image Classification, covers in detail the methods that can be used for classification tasks. In Chapter 3, Image Retrieval, we use classification models to visualize deep learning models and to retrieve similar images.
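To make "a label with confidence" concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one raw score per class; the label names and score values are hypothetical, and softmax is used as one common way to turn scores into confidences, not necessarily the exact method used later in the book.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most likely label and its confidence."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical labels and raw scores for a single image.
labels = ["cat", "dog", "rabbit"]
scores = [2.0, 0.5, -1.0]

label, confidence = classify(scores, labels)
print(f"{label}: {confidence:.2f}")
```

In practice the scores would come from the last layer of a trained network rather than a hand-written list.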

Detection or localization and segmentation

Detection or localization is the task of finding an object in an image and localizing it with a bounding box. This task has many applications, such as finding pedestrians and signboards for self-driving vehicles. The following image is an illustration of detection:

Segmentation is the task of classifying every pixel of an image. This gives a fine separation of objects and is useful for processing medical images and satellite imagery. More examples and explanations can be found in Chapter 4, Object Detection, and Chapter 5, Semantic Segmentation.
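The difference between the two outputs can be sketched with a toy example. The code below assumes a hypothetical model that has already produced per-class scores for every pixel of a tiny 2x3 image. Deriving the box from the mask is done here only to contrast the two output formats; it is not how detection algorithms usually work.

```python
def segment(scores):
    """Pixel-wise classification: pick the highest-scoring class for each pixel."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


def bounding_box(mask, class_id):
    """Smallest box (x_min, y_min, x_max, y_max) enclosing all pixels of class_id."""
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, label in enumerate(row)
        if label == class_id
    ]
    if not points:
        return None  # the class does not appear in the image
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical scores: scores[row][col] = [background score, object score].
scores = [
    [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]],
    [[0.6, 0.4], [0.4, 0.6], [0.8, 0.2]],
]

mask = segment(scores)        # one class index per pixel
box = bounding_box(mask, 1)   # one coarse rectangle for the object class
print(mask)
print(box)
```

The mask keeps the exact shape of the object, while the box only records its extent, which is why segmentation is described as giving a finer separation.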

Similarity learning

Similarity learning is the process of learning how similar two images are. A score can be computed between two images based on their semantic meaning, as shown in the following image:

There are several applications of this, from finding similar products to performing facial identification. Chapter 6, Similarity Learning, deals with similarity learning techniques.
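As a rough illustration of scoring two images, the sketch below assumes each image has already been mapped to a feature vector (an embedding) by some model. The vectors are hypothetical, and cosine similarity is just one common choice of score, not necessarily the one used in Chapter 6.

```python
import math


def cosine_similarity(a, b):
    """Score in [-1, 1]; higher means the two vectors point in a more similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for three product photos.
shoe_a = [0.9, 0.1, 0.3]
shoe_b = [0.8, 0.2, 0.4]
chair = [0.1, 0.9, 0.2]

score_similar = cosine_similarity(shoe_a, shoe_b)
score_different = cosine_similarity(shoe_a, chair)
print(f"shoe vs shoe:  {score_similar:.2f}")
print(f"shoe vs chair: {score_different:.2f}")
```

A product search could rank catalogue images by this score against the query image and return the highest-scoring ones.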

Image captioning

Image captioning is the task of describing an image with text, as shown here:

Reproduced with permission from Vinyals et al.

Chapter 7, Image Captioning, goes into detail about image captioning. This is a unique case where techniques of natural language processing (NLP) and computer vision have to be combined.

Generative models

Generative models are interesting because they generate images. The following is an example of a style transfer application, where an image is generated with the content of one image and the style of another:

Reproduced with permission from Gatys et al.

Images can be generated for other purposes, such as new training examples, super-resolution images, and so on. Chapter 8, Generative Models, covers generative models in detail.

Video analysis

Video analysis processes a video as a whole, as opposed to the single images of the previous cases. It has several applications, such as sports tracking, intrusion detection, and surveillance. Chapter 9, Video Classification, deals with video-specific applications. The new dimension of temporal data gives rise to many interesting applications. In the next section, we will see how to set up the development environment.
