
TensorFlow 2.0 Computer Vision Cookbook

By: Jesús Martínez

Overview of this book

Computer vision is a scientific field that enables machines to identify and process digital images and videos. This book focuses on independent recipes to help you perform various computer vision tasks using TensorFlow. The book begins by taking you through the basics of deep learning for computer vision, along with covering TensorFlow 2.x's key features, such as the Keras and tf.data.Dataset APIs.

You'll then learn about the ins and outs of common computer vision tasks, such as image classification, transfer learning, image enhancing and styling, and object detection. The book also covers autoencoders in domains such as inverse image search indexes and image denoising, while offering insights into various architectures used in the recipes, such as convolutional neural networks (CNNs), region-based CNNs (R-CNNs), VGGNet, and You Only Look Once (YOLO).

Moving on, you'll discover tips and tricks to solve any problems faced while building various computer vision applications. Finally, you'll delve into more advanced topics such as Generative Adversarial Networks (GANs), video processing, and AutoML, concluding with a section focused on techniques to help you boost the performance of your networks. By the end of this TensorFlow book, you'll be able to confidently tackle a wide range of computer vision problems using TensorFlow 2.x.

Implementing an image captioning network

An image captioning architecture comprises an encoder and a decoder. The encoder is a CNN (typically a pre-trained one) that converts input images into numeric feature vectors. These vectors are then passed, along with text sequences, to the decoder, an RNN that learns from these values how to iteratively generate each word of the corresponding caption.
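As a rough illustration of this encoder-decoder pairing, the following is a minimal sketch of a "merge"-style captioner in Keras, where pre-extracted CNN features and a partial caption are combined to predict the next word. The feature dimension (4096, as produced by, for example, VGG16's penultimate dense layer), the embedding size, and the LSTM width are illustrative assumptions, not the exact values used in this recipe:

```python
import tensorflow as tf
from tensorflow.keras.layers import (LSTM, Dense, Dropout, Embedding,
                                     Input, add)
from tensorflow.keras.models import Model


def build_captioner(vocab_size, max_seq_len, feature_dim=4096,
                    embed_dim=256, units=256):
    # Encoder branch: pre-extracted CNN features projected to `units` dims.
    image_input = Input(shape=(feature_dim,))
    x = Dropout(0.5)(image_input)
    x = Dense(units, activation='relu')(x)

    # Decoder branch: the caption generated so far, embedded and fed to an LSTM.
    seq_input = Input(shape=(max_seq_len,))
    y = Embedding(vocab_size, embed_dim, mask_zero=True)(seq_input)
    y = Dropout(0.5)(y)
    y = LSTM(units)(y)

    # Merge both representations and predict the next word in the caption.
    z = add([x, y])
    z = Dense(units, activation='relu')(z)
    output = Dense(vocab_size, activation='softmax')(z)

    model = Model(inputs=[image_input, seq_input], outputs=output)
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model


model = build_captioner(vocab_size=5000, max_seq_len=30)
```

At inference time, a model like this is called repeatedly: the caption is seeded with a start token, the predicted word is appended to the sequence, and the loop stops at an end token or at `max_seq_len`.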

In this recipe, we'll implement an image captioner that's been trained on the Flickr8k dataset. We'll leverage the feature extractor we implemented in the Implementing a reusable image caption feature extractor recipe.

Let's begin, shall we?

Getting ready

The external dependencies we'll be using in this recipe are Pillow, nltk, and tqdm. You can install them all at once with the following command:

$> pip install Pillow nltk tqdm

We will use the Flickr8k dataset, which you can get from Kaggle: https://www.kaggle.com/adityajn105...
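Since Pillow is among the dependencies, a common preparatory step is loading each Flickr8k image and reshaping it into the batch format a CNN feature extractor expects. The helper below is a hypothetical sketch (the function name, the 224x224 target size, and the [0, 1] scaling are assumptions; the actual preprocessing should match whichever pre-trained network the feature extractor uses):

```python
import numpy as np
from PIL import Image


def load_image(path, target_size=(224, 224)):
    """Load an image from disk and prepare it as a batch of one.

    Returns a float32 array of shape (1, height, width, 3) with
    pixel values scaled to the [0, 1] range.
    """
    img = Image.open(path).convert('RGB').resize(target_size)
    arr = np.asarray(img, dtype='float32') / 255.0
    return np.expand_dims(arr, axis=0)
```

The leading batch dimension lets the array be passed directly to `model.predict()` without further reshaping.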