Book Image

Hands-On Convolutional Neural Networks with TensorFlow

By : Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo
Book Image

Hands-On Convolutional Neural Networks with TensorFlow

By: Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

Overview of this book

Convolutional Neural Networks (CNN) are one of the most popular architectures used in computer vision apps. This book is an introduction to CNNs through solving real-world problems in deep learning while teaching you their implementation in popular Python library - TensorFlow. By the end of the book, you will be training CNNs in no time! We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation. After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks. Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.
Table of Contents (17 chapters)
Title Page
Copyright and Credits
Packt Upsell

Object detection as classification – Sliding window

Object detection is a different problem to localization as we can have a variable number of objects in the image. Consequently it becomes very tricky to handle variable number of outputs if we consider detection as just a simple regression problem like we did for localization. Therefore we consider detection as a classification problem instead.

One very common approach that has been in use for a long time is to do object detection using sliding windows. The idea is to slide a window of fixed size across the input image. What is inside the window at each location is then sent to a classifier that will tell us if the window contains an object of interest or not.


For this purpose, one can first train a CNN classifier with small closely cropped images - resized to the same size as the window - of objects we want to detect e.g. cars. At test time our fixed size window is moved in a sliding fashion across the whole image that we want to detect...