Exploring Deepfakes

By: Bryan Lyon, Matt Tora

Overview of this book

Applying Deepfakes will allow you to tackle a wide range of scenarios creatively. Learning from experienced authors will help you to intuitively understand what is going on inside the model. You’ll learn what deepfakes are and what makes them different from other machine learning techniques, and understand the entire process from beginning to end, from finding faces to preparing them, training the model, and performing the final swap. We’ll discuss various uses for face replacement before we begin building our own pipeline. Spending some extra time thinking about how you collect your input data can make a huge difference to the quality of the final video. We look at the importance of this data and guide you with simple concepts to understand what your data needs to really be successful. No discussion of deepfakes can avoid discussing the controversial, unethical uses for which the technology initially became known. We’ll go over some potential issues, and talk about the value that deepfakes can bring to a variety of educational and artistic use cases, from video game avatars to filmmaking. By the end of the book, you’ll understand what deepfakes are, how they work at a fundamental level, and how to apply those techniques to your own needs.
Table of Contents (15 chapters)
Part 1: Understanding Deepfakes
Part 2: Getting Hands-On with the Deepfake Process
Part 3: Where to Now?

Assessing the limitations of generative AI

Generative AIs like those used in deepfakes are not a panacea and have some significant limitations. However, once you know about these limitations, you can generally work around or sidestep them with careful design.

Resolution

Deepfakes are limited in the resolution that they can swap. This is a hardware and time limitation: more powerful hardware and more time allow higher-resolution swaps. However, the cost does not grow linearly with resolution. Doubling the resolution (from, say, 64x64 to 128x128) quadruples the number of pixels, which roughly quadruples the amount of required VRAM (that is, the memory that a GPU has direct access to), and the time needed to train grows by a roughly equivalent amount. Because of this, resolution is often a balancing act: you’ll want to use the lowest resolution that does not sacrifice the quality of the results.
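The scaling described above follows from pixel count: a square face crop with side s has s x s pixels, so the cost ratio between two resolutions is roughly the square of the ratio of their sides. The following is a minimal sketch under that assumption only. Real models also carry fixed overhead (weights, optimizer state), so treat the numbers as rough estimates. The helper name `relative_cost` is hypothetical and not part of any deepfake tool.

```python
def relative_cost(base_side: int, target_side: int) -> float:
    """Approximate VRAM/time multiplier when changing square resolutions.

    Assumption: cost is proportional to the number of pixels (side * side).
    """
    if base_side <= 0 or target_side <= 0:
        raise ValueError("resolutions must be positive")
    return (target_side / base_side) ** 2


# Compare a few common square resolutions against a 64x64 baseline.
for side in (64, 128, 256, 512):
    multiplier = relative_cost(64, side)
    print(f"{side}x{side}: about {multiplier:g}x the cost of 64x64")
```

Under this assumption, each further doubling multiplies the cost by four again, which is why high-resolution swaps become expensive so quickly.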

Training required for each face pair

To provide the best results, traditional deepfakes require that you train on every face pair that you wish to swap. This means that if you wanted to swap your own face with those of two friends, you’d have to train two separate models. This is because each model has one encoder and two decoders, which are trained only to swap the faces they were given.
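The one-encoder, two-decoder layout can be sketched structurally as follows. This is a toy illustration, not the code of any real deepfake package: the `PairModel` class, the identity names, and the lambda "networks" are all hypothetical stand-ins for trained neural networks, and faces are represented as plain lists of numbers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Face = List[float]    # toy stand-in for an image
Latent = List[float]  # toy stand-in for the encoder's output


@dataclass
class PairModel:
    """One trained face pair: a shared encoder plus one decoder per identity."""
    encoder: Callable[[Face], Latent]
    decoders: Dict[str, Callable[[Latent], Face]]

    def swap(self, face: Face, target: str) -> Face:
        # A swap encodes the face with the shared encoder, then decodes it
        # with the decoder that was trained on the target identity.
        if target not in self.decoders:
            raise KeyError(f"model was not trained on {target!r}")
        return self.decoders[target](self.encoder(face))


# Hypothetical stand-ins; real encoders and decoders are learned, not written.
model = PairModel(
    encoder=lambda face: [x / 2 for x in face],
    decoders={
        "you": lambda latent: [x * 2 for x in latent],
        "friend_a": lambda latent: [x * 2 + 1 for x in latent],
    },
)

swapped = model.swap([0.0, 1.0], target="friend_a")
print(swapped)
```

Asking this model for `target="friend_b"` raises a `KeyError`, mirroring the limitation in the text: a second friend needs a second trained model.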

There is a workaround for some multi-face swaps: you could write your own version of the model with more than two decoders, one for each additional face. This is an imperfect solution, however, as each decoder takes up a significant amount of VRAM, so you must balance the number of faces carefully.
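The balancing act can be made concrete with a back-of-the-envelope budget. The sketch below assumes, purely for illustration, that VRAM use is a fixed encoder cost plus an equal cost per decoder. It ignores batch size, activations, and optimizer state, and every figure in the usage example is a made-up placeholder rather than a measurement. `max_decoders` is a hypothetical helper name.

```python
def max_decoders(vram_budget_gb: float, encoder_gb: float, decoder_gb: float) -> int:
    """Estimate how many decoders fit beside one shared encoder.

    Assumption: total VRAM = encoder_gb + n * decoder_gb (a deliberate
    oversimplification of real training memory use).
    """
    if decoder_gb <= 0:
        raise ValueError("decoder_gb must be positive")
    remaining = vram_budget_gb - encoder_gb
    if remaining < 0:
        return 0  # not even the encoder fits
    return int(remaining // decoder_gb)


# Placeholder numbers only: an 8 GB card, a 2 GB encoder, 1.5 GB per decoder.
print(max_decoders(vram_budget_gb=8.0, encoder_gb=2.0, decoder_gb=1.5))
```

Measuring the actual memory use of your own model is the only reliable way to fill in these values.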

It may be better to simply train multiple pairs. By splitting the task across multiple computers, you could train multiple models simultaneously, allowing you to create models for many face pairs at once.

Another option is to use a different type of AI face replacement. First Order Model (which is covered in the Looking at existing deepfake software section of this chapter) uses a different technique: instead of a paired approach, it uses AI to animate a single image so that it matches the movements in another video. This solution removes the need to retrain on each face pair, but comes at the cost of greatly reduced swap quality.

Training data

Generative AIs require a significant amount of training data to accomplish their tasks. Sometimes, finding sufficient data, or data of high enough quality, is not possible. For example, how would someone create a deepfake of William Shakespeare when there are no videos or photographs of him? This is a tricky problem, but it can be worked around in several ways. While it is unfortunately impossible to create a proper deepfake of England’s greatest playwright, it would be possible to use an actor who looks like his portraits and then deepfake that actor as Shakespeare.

Tip

We will cover more on how to deal with poor or insufficient data in Chapter 3, Mastering Data.

Finding sufficient data (or clever workarounds) is often one of the most difficult challenges that a data scientist faces. Occasionally, there simply is no way to get sufficient data. This is when you might need to re-examine the video to see whether it can be shot another way to avoid the lack of data, or you might try using other sources of similar data to patch the gaps. Sometimes, just knowing the limitations in advance can prevent a problem; at other times, a last-minute workaround may be enough to save a project from failure.

Everyone should understand the data limitations, but only experts need to understand the limitations of the process itself. If you are only looking to use deepfakes, you’ll probably use existing software. Let’s explore those options next.