Hands-On Convolutional Neural Networks with TensorFlow

By: Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

Overview of this book

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision applications. This book introduces CNNs by solving real-world deep learning problems while teaching you how to implement them in TensorFlow, a popular Python library. By the end of the book, you will be training CNNs in no time! We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation. After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks. Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models to large datasets of millions of images.

Improving generalization by regularizing

So far in this chapter, we have seen how to use TensorFlow to train a convolutional neural network for the task of image classification. After training our model, we evaluated it on the test set, which we set aside at the start, to see how well it performs on data it has never seen before. Evaluating the model on a test set gives us an indication of how well it will generalize when we deploy it. Good generalization is clearly a desirable property, as it allows the model to be used in many situations.
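As a reminder of what "stored away at the start" means in practice, here is a minimal sketch in plain Python. The helper names `hold_out_split` and `accuracy` are hypothetical, chosen for illustration, and are not part of TensorFlow or of the book's code. The sketch assumes the data fits in memory as a simple sequence of samples.

```python
import random


def hold_out_split(samples, test_fraction=0.2, seed=0):
    """Shuffle once, then set aside a test portion that training never sees.

    Hypothetical helper for illustration; returns (train, test) lists.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Fraction of predictions that exactly match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Split 100 sample indices into a training part and a held-out test part.
train_set, test_set = hold_out_split(range(100), test_fraction=0.2)
```

The important property is that the two parts never overlap, so the accuracy measured on `test_set` reflects behaviour on data the model has not been fitted to.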



The choice of CNN architecture is one way we can improve the generalization ability of our model. One simple technique to keep in mind is to start by designing your model as simply as possible, with few layers and filters. Because very small models are likely to underfit your data, you can then slowly add complexity until underfitting stops occurring. If you design your models this way, it limits...
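The "start small and add complexity" procedure can be sketched as a simple search loop. This is only an illustration under stated assumptions: `grow_until_fit`, the candidate configurations, the 0.85 target, and especially `placeholder_train_and_score` are hypothetical. The placeholder returns made-up numbers so the loop can run on its own; in real use it would be replaced by code that builds and trains a CNN with the given configuration and returns its training accuracy.

```python
def grow_until_fit(candidates, train_and_score, target_train_accuracy=0.95):
    """Try configurations from smallest to largest and return the first one
    whose training accuracy reaches the target, i.e. that stops underfitting.

    Returns (chosen_config, history); chosen_config is None if no candidate
    reaches the target.
    """
    history = []
    for config in candidates:
        train_accuracy = train_and_score(config)
        history.append((config, train_accuracy))
        if train_accuracy >= target_train_accuracy:
            return config, history
    return None, history


# Hypothetical candidates, ordered from simplest to most complex.
candidates = [
    {"conv_layers": 1, "filters": 8},
    {"conv_layers": 2, "filters": 16},
    {"conv_layers": 3, "filters": 32},
    {"conv_layers": 4, "filters": 64},
]


def placeholder_train_and_score(config):
    """Stand-in for real training: the values are invented, not measured."""
    score = 0.5 + 0.08 * config["conv_layers"] + 0.004 * config["filters"]
    return min(1.0, score)


chosen, history = grow_until_fit(
    candidates, placeholder_train_and_score, target_train_accuracy=0.85
)
```

Stopping at the first configuration that fits the training data keeps the model no larger than it needs to be. Larger candidates are never trained once a sufficient one is found.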