Hands-On Generative Adversarial Networks with Keras

By: Rafael Valle

Overview of this book

Generative Adversarial Networks (GANs) have revolutionized the fields of machine learning and deep learning. This book will be your first step toward understanding GAN architectures and tackling the challenges involved in training them. It opens with an introduction to deep learning and generative models and their applications in artificial intelligence (AI). You will then learn how to build, evaluate, and improve your first GAN with the help of easy-to-follow examples. The next few chapters will guide you through training a GAN model to produce and improve high-resolution images. You will also learn how to implement conditional GANs that enable you to control characteristics of GAN output. You will build on this knowledge by exploring a new training methodology for progressive growing of GANs. Moving on, you'll gain insights into state-of-the-art models in image synthesis, speech enhancement, and natural language generation using GANs. In addition, you'll be able to identify GAN samples with TequilaGAN. By the end of this book, you will be well versed in the latest advancements in the GAN framework using various examples and datasets, and you will have developed the skills you need to implement GAN architectures for several tasks and domains, including computer vision, natural language processing (NLP), and audio processing.

Foreword by Ting-Chun Wang, Senior Research Scientist, NVIDIA
Table of Contents (14 chapters)
Section 1: Introduction and Environment Setup
Section 2: Training GANs
Section 3: Application of GANs in Computer Vision, Natural Language Processing, and Audio

Improving the baseline model

In this example, we improve the baseline model without modifying its architecture. Instead, the authors propose changing the optimization problem so that the discriminator also has access to mismatched pairs of text embeddings and images.

This approach is called the Matching-Aware Discriminator and is designed to separate the sources of error in this task. During baseline training, the discriminator only has access to real images with proper text and synthetic images with arbitrary text. As a result, it implicitly has to handle two sources of error: images that look real but do not match the text description, and images that look unrealistic regardless of the text.

To make this distinction explicit, the authors also provide the discriminator with pairs of real images and unmatched texts, and find empirically that this helps during training. We'll provide a slice of the...
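As a rough illustration of the idea described above, the following sketch builds the three kinds of input a matching-aware discriminator would see and combines their losses. It is not the book's code. It uses plain NumPy rather than Keras so that it runs on its own, and the function names (`make_matching_aware_batches`, `discriminator_loss`), the trick of rolling the embeddings to create mismatches, and the equal weighting of the two "fake" terms are all assumptions made for this example.

```python
import numpy as np


def make_matching_aware_batches(real_images, text_embeddings, fake_images):
    """Build the three (image, text, label) groups for the discriminator.

    Assumes row i of `text_embeddings` describes row i of `real_images`,
    and that `fake_images` were generated from those same embeddings.
    """
    # Shift the embeddings by one position so that each real image is paired
    # with the description of a different sample in the batch. This is only a
    # heuristic (an assumption of this sketch): two samples that happen to
    # share a description would still match after the shift.
    mismatched_text = np.roll(text_embeddings, shift=1, axis=0)

    batch_size = real_images.shape[0]
    ones = np.ones((batch_size, 1), dtype=np.float32)
    zeros = np.zeros((batch_size, 1), dtype=np.float32)

    return {
        "real_matching": (real_images, text_embeddings, ones),
        "real_mismatched": (real_images, mismatched_text, zeros),
        "fake_matching": (fake_images, text_embeddings, zeros),
    }


def discriminator_loss(s_real, s_wrong, s_fake, eps=1e-7):
    """Binary cross-entropy over the three groups of discriminator scores.

    Scores are probabilities in [0, 1]. The two "fake" terms are averaged so
    that together they weigh as much as the real term; this weighting is an
    assumption of the sketch.
    """
    s_real, s_wrong, s_fake = (
        np.clip(s, eps, 1.0 - eps) for s in (s_real, s_wrong, s_fake)
    )
    real_term = -np.log(s_real).mean()          # real image, matching text
    wrong_term = -np.log(1.0 - s_wrong).mean()  # real image, mismatched text
    fake_term = -np.log(1.0 - s_fake).mean()    # synthetic image, any text
    return real_term + 0.5 * (wrong_term + fake_term)
```

In a Keras training loop, each of the three groups could be passed to the discriminator with `train_on_batch`, using the image and text arrays as inputs and the last element as the target. That is one possible wiring under the assumptions above, not necessarily the one the book uses.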