Deep Learning for Computer Vision

By: Rajalingappaa Shanmugamani

Overview of this book

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has many applications in areas such as robotics and automation. This book shows you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning. You will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

Chapter 1, Getting Started
  Understanding deep learning
  Deep learning for computer vision
  Development environment setup
  Summary

Chapter 2, Image Classification
  Training the MNIST model in TensorFlow
  Training the MNIST model in Keras
  Other popular image testing datasets
  The bigger deep learning models
  Training a model for cats versus dogs
  Developing real-world applications
  Summary

Chapter 3, Image Retrieval
  Understanding visual features
  Model inference
  Content-based image retrieval
  Summary

Chapter 4, Object Detection
  Detecting objects in an image
  Exploring the datasets
  Localizing algorithms
  Detecting objects
  Object detection API
  The YOLO object detection algorithm
  Summary

Chapter 5, Semantic Segmentation
  Predicting pixels
  Datasets
  Algorithms for semantic segmentation
  Ultra-nerve segmentation
  Segmenting satellite images
  Segmenting instances
  Summary

Chapter 6, Similarity Learning
  Algorithms for similarity learning
  Human face analysis
  Summary

Chapter 7, Image Captioning
  Understanding the problem and datasets
  Understanding natural language processing for image captioning
  Approaches for image captioning and related problems
  Implementing attention-based image captioning
  Summary

Chapter 8, Generative Models
  Applications of generative models
  Neural artistic style transfer
  Generative Adversarial Networks
  Visual dialogue model
  Summary

Chapter 9, Video Classification
  Understanding and classifying videos
  Extending image-based approaches to videos
  Summary

Chapter 10, Deployment
  Performance of models
  Deployment in the cloud
  Deployment of models in devices
  Summary

Preface

Deep Learning for Computer Vision is intended for readers who want to learn deep-learning-based computer vision techniques for various applications. It gives the reader tools and techniques to develop computer-vision-based products, and it includes plenty of practical examples that accompany the theory.

Who this book is for

This book is for readers who want to know how to apply deep learning to computer vision problems such as classification, detection, retrieval, segmentation, generation, captioning, and video classification. It is also for readers who want to understand how to achieve good accuracy under various constraints such as limited data, imbalanced classes, and noise, and who want to know how to deploy trained models on various platforms (AWS, Google Cloud, Raspberry Pi, and mobile phones). After completing this book, the reader should be able to develop code for problems such as person detection, face recognition, product search, medical image segmentation, image generation, image captioning, and video classification.

What this book covers

Chapter 1, Getting Started, introduces the basics of deep learning and familiarizes the readers with the vocabulary. The readers will install the software packages necessary to follow the rest of the chapters.

Chapter 2, Image Classification, talks about the image classification problem, which is labeling an image as a whole. The readers will learn about image classification techniques and train a deep learning model for pet classification. They will also learn methods to improve accuracy and dive deep into various advanced architectures.

Chapter 3, Image Retrieval, covers deep features and image retrieval. The reader will learn about various methods of visualizing models and obtaining visual features, running inference and serving with TensorFlow, and using visual features for product retrieval.

Chapter 4, Object Detection, talks about detecting objects in images. The reader will learn about various techniques of object detection and apply them to pedestrian detection. The TensorFlow API for object detection will be utilized in this chapter.

Chapter 5, Semantic Segmentation, covers pixel-wise segmentation of images. The readers will learn about segmentation techniques and train a model for segmentation of medical images.

Chapter 6, Similarity Learning, talks about similarity learning. The readers will learn about similarity matching and how to train models for face recognition. Training a model to detect facial landmarks is also illustrated.

Chapter 7, Image Captioning, is about generating or selecting captions for images. The readers will learn natural language processing techniques and how to generate captions for images using those techniques.

Chapter 8, Generative Models, talks about generating synthetic images for various purposes. The readers will learn what generative models are and use them for image generation applications, such as style transfer and generating training data.

Chapter 9, Video Classification, covers computer vision techniques for video data. The readers will understand the key differences between solving video versus image problems and implement video classification techniques.

Chapter 10, Deployment, talks about the deployment steps for deep learning models. The reader will learn how to deploy trained models and optimize for speed on various platforms.

To get the most out of this book

The examples covered in this book can be run on Windows, Ubuntu, or Mac. All the installation instructions are covered. Basic knowledge of Python and machine learning is required. GPU hardware is preferable but not necessary.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packtpub.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
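
For readers who prefer to script this step, the extraction can also be done with Python's standard library. The following is a minimal sketch and is not part of the book's code bundle: the function name extract_code_bundle and the archive name code_bundle.zip are hypothetical, and it assumes the downloaded file is a .zip archive.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_code_bundle(archive_path, target_dir):
    """Extract a .zip archive into target_dir and return the extracted paths."""
    archive_path = Path(archive_path)
    target_dir = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as bundle:
        bundle.extractall(target_dir)
        names = bundle.namelist()
    return [target_dir / name for name in names]


if __name__ == "__main__":
    # Build a tiny stand-in archive so the sketch runs without the real download.
    with tempfile.TemporaryDirectory() as workdir:
        demo_zip = Path(workdir) / "code_bundle.zip"  # hypothetical file name
        with zipfile.ZipFile(demo_zip, "w") as bundle:
            bundle.writestr("chapter_01/hello.py", "print('hello')\n")
        for path in extract_code_bundle(demo_zip, Path(workdir) / "extracted"):
            print(path)
```

To use it on a real download, pass the path of the downloaded archive and the folder where the examples should be placed.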

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Deep-Learning-for-Computer-Vision. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Note that the graph is written once with the summary_writer."

A block of code is set as follows:

merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')

Any command-line input or output is written as follows:

wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Once you are done, terminate the instance by clicking Actions|Instance State|Terminate."

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.
