
Hands-On Convolutional Neural Networks with TensorFlow

By : Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

Overview of this book

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision applications. This book is an introduction to CNNs through solving real-world problems in deep learning, while teaching you how to implement them in the popular Python library TensorFlow. By the end of the book, you will be training CNNs in no time! We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. You will then use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation. After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks. Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models to large datasets of millions of images.

VGGNet


Created by the Visual Geometry Group (VGG) at Oxford University, VGGNet was one of the first architectures to really introduce the idea of stacking a much larger number of layers together. While AlexNet was considered deep when it first came out with its eight layers, that is now a small number compared to both VGG and other modern architectures.

VGGNet uses only very small filters with a spatial size of 3x3, compared to AlexNet, which had filters up to 11x11. These stacks of 3x3 convolution layers are interspersed with 2x2 max pooling layers (with a stride of 2) that halve the spatial resolution.
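As a rough illustration (plain Python, no TensorFlow required; the helper name is my own), the standard output-size formula, out = (in - k + 2p) / s + 1, shows how this pattern behaves: a 3x3 convolution with padding 1 and stride 1 preserves the spatial size, while a 2x2 max pool with stride 2 halves it:

```python
def out_size(n, k, s=1, p=0):
    """Spatial output size of a conv/pool layer: (n - k + 2p) // s + 1."""
    return (n - k + 2 * p) // s + 1

size = 224  # a typical VGG input resolution (assumed here for illustration)

# Two 3x3 convolutions with padding 1 and stride 1 keep the size unchanged...
for _ in range(2):
    size = out_size(size, k=3, s=1, p=1)
print(size)  # 224

# ...and a 2x2 max pool with stride 2 halves it.
size = out_size(size, k=2, s=2, p=0)
print(size)  # 112
```

This is why a VGG-style network can stack many convolutions without shrinking its feature maps: only the pooling layers reduce the resolution.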

Using such small filters means that the neighborhood of pixels each layer sees is also very small. Initially, this might give the impression that only local information is being taken into account by the model. Interestingly though, stacking small filters one after another gives the same "receptive field" as a single large filter. For example, three stacked 3x3 convolutions have the same receptive field as one 7x7 filter.
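A quick sketch (plain Python; the function name is my own) confirms the arithmetic: with stride 1, each k x k layer widens the receptive field by k - 1 pixels, so n stacked k x k layers see n(k - 1) + 1 pixels. The stacked version also uses fewer weights: for C input and output channels, three 3x3 layers cost 3 x (3 x 3 x C x C) = 27C^2 weights, against 49C^2 for a single 7x7 filter:

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a sequence of stride-1 convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each stride-1 layer widens the field by k - 1 pixels
    return rf

print(stacked_receptive_field([3, 3, 3]))  # 7 -> same as one 7x7 filter

# Weight counts for C input/output channels (biases ignored):
C = 64  # an illustrative channel count
three_3x3 = 3 * (3 * 3 * C * C)  # 27 * C^2
one_7x7 = 7 * 7 * C * C          # 49 * C^2
print(three_3x3 < one_7x7)  # True: the stack is cheaper
```

The stacked layers also apply a non-linearity after each convolution, which is a second advantage over the single large filter.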

This insight of...