Hands-On Convolutional Neural Networks with TensorFlow

By: Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

Overview of this book

Convolutional Neural Networks (CNNs) are among the most popular architectures used in computer vision applications. This book introduces CNNs by solving real-world deep learning problems, and teaches you how to implement them in TensorFlow, a popular Python library. By the end of the book, you will be training CNNs in no time! We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation. After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks. Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models to large datasets of millions of images.

Datasets


In this section, we will discuss the most important and well-known recent datasets used in image classification. This is necessary because any exploration of computer vision is likely to encounter them (including in this book!). Before the arrival of convolutional neural networks, the two main datasets used in image classification competitions by the research community were the Caltech and PASCAL datasets.

The Caltech dataset was established by the California Institute of Technology and was released in two versions. Caltech-101 was published in 2003 with 101 categories, each containing about 40 to 800 images, and Caltech-256 followed in 2007 with 256 object categories and a total of 30,607 images. The images were collected from Google Images and PicSearch, and their size was roughly 300x400 pixels.
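
To make figures such as "101 categories of about 40 to 800 images" concrete, the following is a minimal sketch of how you might check the category statistics of a downloaded copy yourself. It assumes the images are stored in one sub-folder per category under a single root directory; the function names, the folder layout, and the list of image extensions are illustrative assumptions, not something defined by the dataset or by TensorFlow.

```python
from collections import Counter
from pathlib import Path

# File extensions counted as images (an assumption for this sketch).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_category(root):
    """Map each category folder name to the number of image files it holds.

    Assumes a hypothetical layout of root/<category>/<image file>.
    """
    counts = Counter()
    for category_dir in sorted(Path(root).iterdir()):
        if not category_dir.is_dir():
            continue  # skip stray files next to the category folders
        counts[category_dir.name] = sum(
            1
            for f in category_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def summarize(counts):
    """Return (categories, total images, smallest category, largest category)."""
    sizes = list(counts.values())
    if not sizes:
        return 0, 0, 0, 0
    return len(sizes), sum(sizes), min(sizes), max(sizes)
```

Calling `summarize(count_images_per_category(path_to_dataset))` on your own copy, where `path_to_dataset` is a placeholder for wherever you extracted the images, returns the number of categories, the total image count, and the smallest and largest category sizes, which you can compare against the figures quoted above.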

The PASCAL Visual Object Classes (VOC) challenge was established in 2005. Organized every year until 2012, it provides a well-known benchmark dataset covering a wide range of natural images for Image...