Hands-On Computer Vision with TensorFlow 2

By: Benjamin Planche, Eliot Andres

Overview of this book

Computer vision solutions are becoming increasingly common, making their way into fields such as health, automobile, social media, and robotics. This book will help you explore TensorFlow 2, the brand new version of Google's open source framework for machine learning. You will understand how to benefit from using convolutional neural networks (CNNs) for visual tasks. Hands-On Computer Vision with TensorFlow 2 starts with the fundamentals of computer vision and deep learning, teaching you how to build a neural network from scratch. You will discover the features that have made TensorFlow the most widely used AI library, along with its intuitive Keras interface. You'll then move on to building, training, and deploying CNNs efficiently. Complete with concrete code examples, the book demonstrates how to classify images with modern solutions, such as Inception and ResNet, and extract specific content using You Only Look Once (YOLO), Mask R-CNN, and U-Net. You will also build generative adversarial networks (GANs) and variational autoencoders (VAEs) to create and edit images, and long short-term memory networks (LSTMs) to analyze videos. In the process, you will acquire advanced insights into transfer learning, data augmentation, domain adaptation, and mobile and web deployment, among other key concepts. By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.
Table of Contents (16 chapters)
Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision
Section 2: State-of-the-Art Solutions for Classic Recognition Problems
Section 3: Advanced Concepts and New Frontiers of Computer Vision

To get the most out of this book

The following section contains some information and advice to facilitate the reading of this book and to help readers benefit from its supplementary materials.

Download and run the example code files

Practice makes perfect. Therefore, this book not only provides in-depth explanations of TensorFlow 2 and state-of-the-art computer-vision methods, but it also comes with a number of practical examples and complete implementations for each chapter.

Download the code files

You can download the example code files for this book from your account. If you purchased this book elsewhere, you can register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at
  2. Select the Support tab.
  3. Click on Code Downloads.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux
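
If none of these tools is at hand, the archive can also be extracted with Python's standard library. The following is a minimal sketch, assuming the download is a regular ZIP file. The function name extract_code_bundle and the file names in the usage note are hypothetical placeholders, not the actual names of the book's files.

```python
import zipfile
from pathlib import Path


def extract_code_bundle(archive_path, target_dir):
    """Extract a ZIP archive into target_dir and return the names it contained.

    Both arguments are placeholders chosen for this sketch.
    """
    archive_path = Path(archive_path)
    target_dir = Path(target_dir)
    # Fail early with a clear message if the file is missing or not a ZIP archive.
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a ZIP archive")
    # Create the destination folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

For instance, calling extract_code_bundle("code_bundle.zip", "book_code") (hypothetical names) would unpack the archive into a book_code folder and return the list of extracted entries.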

The code bundle for the book is also hosted on GitHub. If the code is updated, the changes will appear in the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos. Check them out!

Study and run the experiments

Jupyter Notebook is an open source web application for creating and sharing Python scripts, along with textual information, visual results, equations, and more. In this book, the term Jupyter notebooks refers to the documents provided with the book, which contain detailed code, expected results, and supplementary explanations. Each Jupyter notebook is dedicated to a concrete computer vision task. For example, one notebook explains how to train a CNN to detect animals in images, while another details all the steps to build a recognition system for self-driving cars.

As we will see in this section, these documents can either be studied directly, or they can be used as code recipes to run and reproduce the experiments presented in the book.

Study the Jupyter notebooks online

If you simply want to go through the code and results provided, you can directly access them online in the book's GitHub repository. Indeed, GitHub is able to render Jupyter notebooks and to display them as static web pages.

However, the GitHub viewer ignores some style formatting and interactive content. For the best online viewing experience, we recommend using Jupyter nbviewer instead, an official web platform for reading Jupyter notebooks uploaded online. This website can be queried to render notebooks stored in GitHub repositories. Therefore, the Jupyter notebooks provided can also be read at the following address:

Run the Jupyter notebooks on your machine

To read or run these documents on your machine, you should first install Jupyter Notebook. For those who already use Anaconda to manage and deploy their Python environments (as we recommend in this book), Jupyter Notebook should be directly available, as it is installed with Anaconda. For those using other Python distributions and those not familiar with Jupyter Notebook, we recommend having a look at the documentation, which provides installation instructions and tutorials.

Once Jupyter Notebook is installed on your machine, navigate to the directory containing the book's code files, open a terminal, and execute the following command:

$ jupyter notebook

The web interface should open in your default browser. From there, you should be able to navigate the directory and open the Jupyter notebooks provided, either to read, execute, or edit them.
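
If the command is not recognized, a quick check from Python can tell whether the jupyter executable is reachable from the terminal. This is a small sketch using only the standard library. The helper name find_jupyter is an assumption made for this example.

```python
import shutil


def find_jupyter(executable="jupyter"):
    """Return the full path of the executable if it is on the PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_jupyter()
    if path is None:
        print("jupyter was not found on the PATH; install Jupyter Notebook first.")
    else:
        print(f"jupyter found at {path}")
```

A result of None usually means that Jupyter Notebook is not installed in the active Python environment, or that the environment has not been activated in the current terminal.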

Some documents contain advanced experiments that can be extremely compute-intensive (such as the training of recognition algorithms over large datasets). Without the proper acceleration hardware (that is, without compatible NVIDIA GPUs, as explained in Chapter 2, TensorFlow Basics and Training a Model), these scripts can take hours or even days. Even with compatible GPUs, the most advanced examples can take quite some time.
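
To find out whether TensorFlow can see a GPU on your machine before launching a long experiment, a check along the following lines may help. This is a sketch, assuming TensorFlow 2 is installed. tf.config.list_physical_devices is available from TensorFlow 2.1 onward, while TensorFlow 2.0 exposes the same function under tf.config.experimental. The helper name visible_gpus is hypothetical.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None
    # TensorFlow 2.1+ offers tf.config.list_physical_devices;
    # TensorFlow 2.0 only has the tf.config.experimental variant.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return [device.name for device in lister("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed.")
    elif not gpus:
        print("No GPU visible to TensorFlow; heavy notebooks will run on the CPU.")
    else:
        print(f"GPUs visible to TensorFlow: {gpus}")
```

An empty list means that the notebooks will still run, only on the CPU, so the most demanding experiments are better left for a machine with a compatible GPU.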

Run the Jupyter notebooks in Google Colab

For those who wish to run the Jupyter notebooks themselves, or play with new experiments, but do not have access to a powerful enough machine, we recommend using Google Colab, also named Colaboratory. It is a cloud-based Jupyter environment, provided by Google, for people to run compute-intensive scripts on powerful machines. You will find more details regarding this service in the GitHub repository.
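
When the same notebook is meant to run both locally and in Colab, it can be useful to detect the environment at runtime. A common idiom, shown here as a sketch, is to check whether the google.colab package can be found, which is normally only the case inside Colab. The helper name running_in_colab is an assumption and is not part of the book's code.

```python
import importlib.util


def running_in_colab():
    """Return True if the google.colab package can be found, False otherwise."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except ImportError:
        # Raised when the parent package `google` is not installed at all.
        return False


if __name__ == "__main__":
    if running_in_colab():
        print("Running inside Google Colab.")
    else:
        print("Running outside Google Colab.")
```

Such a check could guard Colab-specific steps, so that the rest of the notebook stays identical in both environments.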

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here:

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, folder names, filenames, file extensions, pathnames, dummy URLs, and user input. Here is an example: "The .fit() method of the Model object starts the training procedure."

A block of code is set as follows:

import tensorflow as tf

x1 = tf.constant([[0, 1], [2, 3]])
x2 = tf.constant(10)
x = x1 * x2

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

neural_network = tf.keras.Sequential(
    [tf.keras.layers.Dense(10, activation="softmax")])

Any command-line input or output is written as follows:

$ tensorboard --logdir ./logs

Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "You can observe the performance of your solution on the Scalars page of TensorBoard."

Warnings or important notes appear like this.
Tips and tricks appear like this.