Hands-On Generative Adversarial Networks with Keras

By: Rafael Valle

Overview of this book

Generative Adversarial Networks (GANs) have revolutionized the fields of machine learning and deep learning. This book will be your first step toward understanding GAN architectures and tackling the challenges involved in training them. It opens with an introduction to deep learning and generative models and their applications in artificial intelligence (AI). You will then learn how to build, evaluate, and improve your first GAN with the help of easy-to-follow examples. The next few chapters will guide you through training a GAN model to produce and improve high-resolution images. You will also learn how to implement conditional GANs that enable you to control characteristics of GAN output. You will build on your knowledge further by exploring a new training methodology for progressive growing of GANs. Moving on, you'll gain insights into state-of-the-art models in image synthesis, speech enhancement, and natural language generation using GANs. In addition, you'll be able to identify GAN samples with TequilaGAN. By the end of this book, you will be well versed in the latest advancements in the GAN framework using various examples and datasets, and you will have developed the skills you need to implement GAN architectures for several tasks and domains, including computer vision, natural language processing (NLP), and audio processing.

Foreword by Ting-Chun Wang, Senior Research Scientist, NVIDIA
Table of Contents (14 chapters)

Section 1: Introduction and Environment Setup
Section 2: Training GANs
Section 3: Application of GANs in Computer Vision, Natural Language Processing, and Audio

What this book covers

Chapter 1, Deep Learning Basics and Environment Setup, contains essential knowledge for building and training deep learning models, including GANs. In this chapter, you will also learn how to set up your deep learning Python and Keras environments for the upcoming projects. Finally, you will learn about the importance of using GPUs in deep learning and how to choose the platform that best suits you.

Chapter 2, Introduction to Generative Models, covers the basics of generative models, including GANs, variational autoencoders, autoregressive models, and reversible flow models. You will learn about state-of-the-art applications that use GANs and about the building blocks of GANs, along with their strengths and limitations.

Chapter 3, Implementing Your First GAN, explains the basics of implementing and training a GAN for image synthesis. You will learn how to implement the generator and discriminator in a GAN, how to implement your loss function and use it to train your GAN models, and how to visualize the samples from your first GAN. We will focus on the well-known CIFAR-10 dataset, with 60,000 32 x 32 color images in 10 classes, naturally including dogs and cats.

Chapter 4, Evaluating Your First GAN, covers how to use quantitative and qualitative methods to evaluate the quality and variety of the GAN samples you produced in the previous chapter. You will learn about the challenges involved in evaluating GAN samples. You will learn how to implement metrics for image quality. You will learn about using the birthday paradox to evaluate sample variety.
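
The birthday-paradox idea mentioned for Chapter 4 can be pictured with a small sketch. The assumption here is that generated samples can be checked for duplicates and that the generator draws uniformly from a finite set of distinct outputs. If a batch of s samples contains a duplicate about half of the time, the number of distinct outputs is on the order of s squared. The function names are hypothetical and samples are modelled as plain integers. This illustrates the statistical idea only and is not the book's implementation.

```python
import random

def has_duplicate(batch):
    """Return True if any two items in the batch are identical."""
    return len(set(batch)) < len(batch)

def exact_collision_probability(support_size, batch_size):
    """Probability that a uniform batch contains at least one duplicate."""
    p_unique = 1.0
    for i in range(batch_size):
        # The i-th draw must avoid the i values already seen.
        p_unique *= (support_size - i) / support_size
    return 1.0 - p_unique

def collision_rate(support_size, batch_size, trials=2000, seed=0):
    """Monte Carlo estimate of the same probability.

    Each "sample" is an integer id standing in for one distinct
    generator output (an assumption made for illustration).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        batch = [rng.randrange(support_size) for _ in range(batch_size)]
        hits += has_duplicate(batch)
    return hits / trials
```

With a support of 365 outputs, a batch of 23 already collides about half of the time, so frequent duplicates in small batches suggest low sample variety.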

Chapter 5, Improving Your First GAN, explains the main challenges in training and understanding GANs, and how to solve them. You will learn about vanishing gradients, mode collapse, training instability, and other challenges. You will learn how to solve the challenges that arise when training GANs by using tricks of the trade and improving your GAN architecture and your loss function. You will learn about multiple deep learning model architectures that have been successful with the GAN framework. Furthermore, you will learn how to improve your first GAN by implementing new loss functions and algorithms. We will continue to focus on the CIFAR-10 dataset.
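
To make the vanishing-gradient problem concrete, here is a minimal sketch. It assumes a discriminator that outputs sigmoid(logit) for a generated sample, and compares the gradient of the original minimax generator loss, log(1 - D(G(z))), with that of the widely used non-saturating alternative, -log D(G(z)). The choice of these two losses is an assumption for illustration; the chapter summary does not say which loss functions the book implements.

```python
import math

def sigmoid(a):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-a))

def saturating_grad(logit):
    """Derivative of log(1 - sigmoid(logit)) with respect to the logit."""
    return -sigmoid(logit)

def non_saturating_grad(logit):
    """Derivative of -log(sigmoid(logit)) with respect to the logit."""
    return -(1.0 - sigmoid(logit))

# A strongly negative logit means the discriminator confidently
# rejects the generated sample.
for logit in (-10.0, 0.0, 10.0):
    print(logit, saturating_grad(logit), non_saturating_grad(logit))
```

When the discriminator confidently rejects a sample, the first gradient shrinks toward zero while the second stays close to -1, which is one reason the loss function matters for training stability.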

Chapter 6, Synthesizing and Manipulating Images with GANs, explains how to implement pix2pixHD: a method for high-resolution (such as 2048 x 1024) photo-realistic image-to-image translation. It can be used to turn semantic label maps into photo-realistic images or to synthesize portraits from face label maps. We will use the Cityscapes dataset, which focuses on a semantic understanding of urban street scenes.

Chapter 7, Progressive Growing of GANs, explains how to implement the progressive growing of GANs framework: a new training methodology in which the generator and discriminator are trained progressively. Starting from a low resolution, we will add new layers that model increasingly fine details as training progresses. This speeds up the training process and stabilizes it, allowing us to produce images of unprecedented quality. We will focus on the CelebFaces Attributes Dataset (CelebA): a face attributes dataset with over 200,000 celebrity images.
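
The idea of starting at a low resolution and adding layers can be sketched as a resolution schedule. The starting resolution of 4, the doubling rule, and the linear fade-in blend are assumptions taken from the usual description of progressive growing, and the function names are hypothetical. This is a sketch of the schedule only, not a training loop.

```python
def resolution_schedule(start=4, target=1024):
    """List the resolutions visited when each stage doubles the last."""
    if start <= 0 or target < start:
        raise ValueError("need 0 < start <= target")
    resolutions = []
    res = start
    while res <= target:
        resolutions.append(res)
        res *= 2
    return resolutions

def fade_in(old_value, new_value, alpha):
    """Blend the upsampled output of the existing layers (old_value)
    with the output of the newly added layer (new_value).

    alpha is assumed to rise from 0 to 1 while the new layer is
    introduced, so the new layer takes over gradually.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (1.0 - alpha) * old_value + alpha * new_value

print(resolution_schedule(4, 64))
```

Each entry in the schedule corresponds to one growth stage in which both the generator and the discriminator gain layers for the next resolution.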

Chapter 8, Natural Language Generation with GANs, covers the implementation of adversarial generation of natural language: a model capable of generating sentences in multiple languages from context-free and probabilistic context-free grammars. You will learn how to implement a model that generates sequences character by character and a model that generates sentences word by word. We will focus on the Google 1-billion-word dataset.

Chapter 9, Text-To-Image Synthesis with GANs, explains how to implement generative adversarial text to image synthesis: a model that generates plausible images from detailed text descriptions. You will learn about matching-aware discriminators, interpolations in embedding space and vector arithmetic. We will focus on the Oxford-102 Flowers dataset.
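
Interpolation in embedding space and vector arithmetic can be illustrated with plain lists standing in for text embeddings. The vectors and function names below are hypothetical. In practice the embeddings would come from a text encoder and the results would be fed to the generator.

```python
def interpolate(a, b, t):
    """Linear interpolation between two embeddings; t=0 gives a, t=1 gives b."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise sum of two embeddings."""
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    """Element-wise difference of two embeddings."""
    return [x - y for x, y in zip(a, b)]

# Toy 2-dimensional "embeddings" (made up for illustration).
red_flower = [0.0, 2.0]
blue_flower = [1.0, 4.0]
print(interpolate(red_flower, blue_flower, 0.5))
```

Sweeping t from 0 to 1 and generating an image at each step gives a visual check of how smoothly the model moves between two descriptions.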

Chapter 10, Speech Enhancement with GANs, covers the implementation of a speech enhancement GAN: a framework for audio denoising and speech enhancement using GANs. You will learn how to train the model with multiple speakers and noise conditions. You will learn how to evaluate the model qualitatively and quantitatively. We will focus on the WSJ dataset and a noise dataset.

Chapter 11, TequilaGAN: Identifying GAN Samples, explains how to implement TequilaGAN. You will learn how to identify the underlying characteristics of GAN data and how to differentiate real data from fake data. You will implement strategies to easily identify fake samples that have been generated with the GAN framework. One strategy is based on the statistical analysis and comparison of raw pixel values and features extracted from them. The other strategy learns formal specifications from real data and shows that fake samples violate the specifications of the real data. We focus on the MNIST dataset of handwritten digits, CIFAR-10, and a dataset of Bach Chorales.
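
A statistical comparison of raw pixel values might look like the sketch below, which builds normalised intensity histograms for a real and a generated set of pixels and measures how far apart they are. The use of total variation distance and the function names are assumptions for illustration; the chapter summary does not specify which statistics the book uses.

```python
from collections import Counter

def pixel_histogram(pixels, levels=256):
    """Normalised histogram of integer pixel intensities in [0, levels)."""
    if not pixels:
        raise ValueError("need at least one pixel")
    counts = Counter(pixels)
    total = len(pixels)
    return [counts.get(v, 0) / total for v in range(levels)]

def total_variation(p, q):
    """Total variation distance between two histograms (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Toy data: the "real" pixels sit exactly at the extremes, while the
# "fake" pixels are slightly off (values made up for illustration).
real_pixels = [0, 0, 255, 255]
fake_pixels = [0, 1, 254, 255]
distance = total_variation(pixel_histogram(real_pixels),
                           pixel_histogram(fake_pixels))
print(distance)
```

A distance that is consistently larger between real and generated data than between two real subsets would hint at a detectable statistical signature.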

Chapter 12, What's Next in GANs?, covers recent advances and open questions that relate to GANs. We start with a summary of this book and what it has covered, from the simplest to the state-of-the-art GANs. Then we address important open questions related to GANs. We also consider the artistic use of GANs in the visual and sonic arts. Finally, we take a look at new and yet-to-be-explored domains with GANs.