Generative Adversarial Networks Cookbook

By Josh Kalin

Overview of this book

Developing Generative Adversarial Networks (GANs) is a complex task, and it is often hard to find code that is easy to understand. This book leads you through eight different examples of modern GAN implementations, including CycleGAN, SimGAN, DCGAN, and 2D-image-to-3D-model generation. Each chapter contains useful recipes that build on a common architecture in Python, TensorFlow, and Keras to explore increasingly difficult GAN architectures in an easy-to-read format. The book starts by covering the different types of GAN architecture to help you understand how each model works. It also contains intuitive recipes for use cases involving DCGAN, Pix2Pix, and so on. To understand these complex applications, you will take different real-world datasets and put them to use. By the end of this book, you will be equipped to deal with the challenges and issues that you may face while working with GAN models, thanks to easy-to-follow code solutions that you can implement right away.

GAN pieces come together in different ways


We have explored a few simple GAN structures so far; over the course of this book, we are going to look at seven different styles of GANs. The important thing to realize about the majority of these papers is that the changes are concentrated in the generator and the loss functions.

How to do it...

The generator produces the images or other output, and the loss function drives the training process, determining what the model optimizes for. In practice, what types of variation will there be? Glad you asked. Let's take a brief look at the different architectures, starting with a minimal sketch of how a generator and its adversarial loss fit together.
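The following is a minimal sketch of that pairing in Keras, in the spirit of the recipes later in this book; the layer sizes and names (latent_dim, build_generator, and so on) are illustrative assumptions, not code from any specific paper:

from tensorflow.keras import layers, models, optimizers

latent_dim = 100  # size of the noise vector fed to the generator (assumed)

def build_generator():
    # Maps a noise vector to a 28x28 image scaled to [-1, 1]
    return models.Sequential([
        layers.Dense(256, activation='relu', input_dim=latent_dim),
        layers.Dense(28 * 28, activation='tanh'),
        layers.Reshape((28, 28)),
    ])

def build_discriminator():
    # Maps an image to a single real/fake probability
    model = models.Sequential([
        layers.Flatten(input_shape=(28, 28)),
        layers.Dense(256, activation='relu'),
        layers.Dense(1, activation='sigmoid'),
    ])
    # Binary cross-entropy is the classic adversarial loss; swapping
    # this term is one of the main ways the GAN variants differ
    model.compile(optimizer=optimizers.Adam(1e-4),
                  loss='binary_crossentropy')
    return model

generator = build_generator()
discriminator = build_discriminator()

# The combined model trains the generator to fool a frozen discriminator
discriminator.trainable = False
z = layers.Input(shape=(latent_dim,))
combined = models.Model(z, discriminator(generator(z)))
combined.compile(optimizer=optimizers.Adam(1e-4),
                 loss='binary_crossentropy')

Training then alternates between the discriminator (on real and generated batches) and the combined model (on noise labeled as real). Every architecture in this book varies this same skeleton, mostly by changing the generator and the compiled losses.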

How it works...

Let's discuss the simplest concept to understand with GANs: style transfer. This methodology manifests itself in many different variations, but one of the things I find fascinating is that the architecture of the GAN needs to change based on the specific type of transfer that has to occur. For instance, one of the papers coming out of Adobe Research focuses on makeup application and removal. Can you take the style of makeup seen in one photo and apply it to a photo of another person? Making this happen in a realistic fashion takes a rather advanced architecture, as shown in the following architecture diagram:

This particular architecture is one of the most advanced to date: there are five separate loss functions! One of the interesting things about this architecture is that it learns a makeup application function and a makeup removal function simultaneously. Once the GAN understands how to apply the makeup, it already has a source image from which to learn its removal. Along with the five loss functions, the generator is fairly unique in its construction, as shown in the following diagram:

 

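To make the multi-loss idea concrete, here is a hedged sketch of how a generator objective can combine several terms. The particular terms (adversarial, cycle-consistency, identity) and the weights are illustrative assumptions in the spirit of this architecture, not the paper's exact five losses:

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
mae = tf.keras.losses.MeanAbsoluteError()

def generator_loss(disc_fake, real_x, cycled_x, same_x,
                   lambda_cycle=10.0, lambda_identity=5.0):
    # 1. Adversarial term: the generator wants the discriminator to
    #    label its output as real (all ones)
    adversarial = bce(tf.ones_like(disc_fake), disc_fake)
    # 2. Cycle-consistency term: applying makeup and then removing it
    #    should reconstruct the original face
    cycle = mae(real_x, cycled_x)
    # 3. Identity term: a face already wearing the target makeup should
    #    pass through the generator nearly unchanged
    identity = mae(real_x, same_x)
    # The weights (assumed values here) balance realism against
    # faithfulness to the input
    return (adversarial
            + lambda_cycle * cycle
            + lambda_identity * identity)

Each additional term gives the optimizer another constraint to satisfy, which is what lets a single network learn both directions of the makeup transform.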
So, why does this even matter? One of the recipes we are going to cover is style transfer, and you'll see in that particular recipe that our GAN model won't be this advanced. Why is that? Constructing a realistic application of makeup takes additional loss functions to appropriately tune the model into fooling the discriminator. In the case of transferring a painter's style, it is easier to transfer a single uniform style than the multiple disparate makeup styles you would see in the preceding data distribution, so the objective can be simpler, as in the sketch that follows.
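Under that reasoning, a hedged sketch of the simpler objective might drop the identity term entirely and keep just two terms; again, the function name and weights are illustrative assumptions:

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
mae = tf.keras.losses.MeanAbsoluteError()

def painter_style_loss(disc_fake, real_x, cycled_x, lambda_cycle=10.0):
    # Fool the discriminator into seeing the painter's style as real
    adversarial = bce(tf.ones_like(disc_fake), disc_fake)
    # Keep the content of the original image recoverable
    cycle = mae(real_x, cycled_x)
    return adversarial + lambda_cycle * cycle

This two-term objective is roughly the shape of the standard CycleGAN loss: a uniform style gives the discriminator a consistent target, so these two terms are often enough.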