Practical Convolutional Neural Networks

By: Mohit Sewak, Md. Rezaul Karim, Pradeep Pujari

Overview of this book

Convolutional Neural Networks (CNNs) are revolutionizing several application domains, such as visual recognition systems, self-driving cars, medical discoveries, and innovative eCommerce. You will learn to create innovative solutions around image and video analytics to solve complex machine learning and computer vision problems and to implement real-life CNN models. This book starts with an overview of deep neural networks, using image classification as an example, and walks you through building your first CNN for a human face detector. We will learn to use concepts such as transfer learning with CNNs and autoencoders to build very powerful models, even when little supervised training data of labeled images is available. Later, we build on this to develop advanced vision-related algorithms for object detection, instance segmentation, generative adversarial networks, image captioning, attention mechanisms for vision, and recurrent models for vision. By the end of this book, you should be ready to implement advanced, effective, and efficient CNN models in your professional projects or personal initiatives by working on complex image and video datasets.
Table of Contents (11 chapters)

Using attention to improve visual models


As we saw in the NLP example in the earlier section, Attention Mechanism - Intuition, attention helped us a great deal, both in enabling new use cases that were not practically feasible with conventional NLP and in vastly improving the performance of existing NLP mechanisms. Attention is used in a similar way in CNNs and visual models.

In Chapter 7, Object-Detection & Instance-Segmentation with CNN, we saw how attention-like mechanisms are used as Region Proposal Networks in networks such as Faster R-CNN and Mask R-CNN to greatly enhance and optimize the proposed regions and to enable the generation of segment masks. This corresponds to the first part of the discussion. In this section, we cover the second part, where we use the attention mechanism to improve the performance of our CNNs, even under extreme conditions.

Reasons for sub-optimal performance of visual CNN models

The performance...