Deep Learning for Computer Vision

By: Rajalingappaa Shanmugamani
Overview of this book

Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and it has numerous applications in areas such as robotics and automation. This book will show you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning. In this book, you will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

Detecting objects


There are several variants of object detection algorithms. A few algorithms that come with the object detection API are discussed here.

Region-based convolutional neural network (R-CNN)

The first work in this series was R-CNN (regions with CNN features), proposed by Girshick et al. (https://arxiv.org/pdf/1311.2524.pdf). The method proposes a set of candidate boxes and checks whether any of them corresponds to the ground truth. Selective search was used to generate these region proposals. Selective search proposes regions by grouping windows of various sizes according to their color and texture. It looks for blob-like structures: it starts with a pixel and produces a blob at a higher scale. It produces around 2,000 region proposals, which is far fewer than the number of all possible sliding windows.
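The two ideas in this paragraph can be illustrated with a small sketch in plain Python. It is not the R-CNN implementation: the image size, window sizes, stride, and example boxes are made-up numbers, the function names are hypothetical, and using intersection over union (IoU) with a 0.5 threshold to decide whether a proposal corresponds to the ground truth is an assumption made here for illustration.

```python
def count_sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Count the positions of one window size that fit inside the image."""
    if win_w > img_w or win_h > img_h:
        return 0
    cols = (img_w - win_w) // stride + 1
    rows = (img_h - win_h) // stride + 1
    return cols * rows


def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matching_proposals(proposals, ground_truth, threshold=0.5):
    """Keep the proposals whose IoU with the ground truth meets the threshold."""
    return [p for p in proposals if iou(p, ground_truth) >= threshold]


# Illustrative numbers only: a 640x480 image, three square window sizes, stride 8.
window_sizes = [(64, 64), (128, 128), (256, 256)]
total_windows = sum(
    count_sliding_windows(640, 480, w, h, stride=8) for w, h in window_sizes
)

# Hypothetical ground-truth box and three hypothetical region proposals.
ground_truth = (50, 50, 150, 150)
proposals = [(60, 60, 160, 160), (0, 0, 40, 40), (55, 45, 145, 155)]
matches = matching_proposals(proposals, ground_truth)

if __name__ == "__main__":
    print("sliding windows:", total_windows)
    print("matching proposals:", matches)
```

Even with only three window sizes and a coarse stride, the sliding-window count in this toy setting already exceeds the roughly 2,000 proposals mentioned above; adding more scales and aspect ratios makes the gap larger.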

The proposals are resized and passed through a standard CNN architecture such as AlexNet, VGG, Inception, or ResNet. The last layer of the CNN is trained with an SVM identifying the object with a no-object...