Generative AI with Python and TensorFlow 2

By : Joseph Babcock, Raghav Bali

Overview of this book

Machines are excelling at creative human skills such as painting, writing, and composing music. Could you be more creative than generative AI? In this book, you'll explore the evolution of generative models, from restricted Boltzmann machines and deep belief networks to VAEs and GANs. You'll learn how to implement models yourself in TensorFlow and get to grips with the latest research on deep neural networks. There's been an explosion in potential use cases for generative models. You'll look at OpenAI's news generator, deepfakes, and training deep learning agents to navigate a simulated environment. Recreate the code that's under the hood and uncover surprising links between text, image, and music generation.

Deep neural network development and TensorFlow

As we will see in Chapter 3, Building Blocks of Deep Neural Networks, a deep neural network in essence consists of matrix operations (addition, subtraction, multiplication), nonlinear transformations, and gradient-based updates computed by using the derivatives of these components.

In academia, researchers have historically often used efficient prototyping tools such as MATLAB³ to run models and prepare analyses. While this approach allows for rapid experimentation, it lacks elements of industrial software development, such as object-oriented (OO) design, that support reproducibility and the clean software abstractions that large organizations need in order to adopt a tool. These tools also had difficulty scaling to large datasets and could carry heavy licensing fees for such industrial use cases. Prior to 2006, this type of computational tooling was nevertheless sufficient for most use cases. However, as the datasets being tackled with deep neural network algorithms grew, groundbreaking results were achieved, such as:

Image classification on the ImageNet dataset⁴
Large-scale unsupervised discovery of image patterns in YouTube videos⁵
The creation of artificial agents capable of playing Atari video games and the Asian board game Go with human-like skill⁶ ⁷
State-of-the-art language understanding via the BERT model developed by Google⁸

The models developed in these studies exploded in complexity along with the size of the datasets they were applied to (see Table 2.2 to get a sense of the immense scale of some of these models). As industrial use cases required robust and scalable frameworks to develop and deploy new neural networks, several academic groups and large technology companies invested in the development of generic toolkits for the implementation of deep learning models. These software libraries codified common patterns into reusable abstractions, so that even complex models can often be expressed in relatively simple experimental scripts.

Model Name	Year	# Parameters
AlexNet	2012	61M
YouTube CNN	2012	1B
Inception	2014	5M
VGG-16	2014	138M
BERT	2018	340M
GPT-3	2020	175B

Table 2.2: Number of parameters by model by year

Early examples of these frameworks include Theano,⁹ a Python package developed at the University of Montreal; Torch,¹⁰ a library written in the Lua language that was later ported to Python by researchers at Facebook; and TensorFlow, a C++ runtime with Python bindings developed by Google.¹¹

In this book, we will primarily use TensorFlow 2.0, due to its widespread adoption and its convenient high-level interface, Keras, which abstracts much of the repetitive plumbing of defining routine layers and model architectures.

TensorFlow is an open-source version of an internal tool developed at Google called DistBelief.¹² The DistBelief framework consisted of distributed workers (independent computational processes running on a cluster of machines) that would compute forward and backward gradient descent passes on a network (a common way to train neural networks we will discuss in Chapter 3, Building Blocks of Deep Neural Networks), and send the results to a Parameter Server that aggregated the updates. The neural networks in the DistBelief framework were represented as a Directed Acyclic Graph (DAG), terminating in a loss function that yielded a scalar (numerical value) comparing the network predictions with the observed target (such as image class or the probability distribution over a vocabulary representing the most probable next word in a sentence in a translation model).
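
The worker and Parameter Server arrangement described above can be illustrated with a small toy simulation. This is only a sketch in plain Python, not DistBelief code: the class and function names are hypothetical, the "workers" run one after another in a loop rather than in parallel on a cluster, and the updates are applied synchronously for simplicity.

```python
from typing import List, Tuple


class ParameterServer:
    """Hypothetical stand-in: holds the shared weight and aggregates worker gradients."""

    def __init__(self, w: float, lr: float) -> None:
        self.w = w
        self.lr = lr

    def apply(self, grads: List[float]) -> None:
        # Average the gradients sent by the workers, then take one SGD step
        self.w -= self.lr * sum(grads) / len(grads)


def worker_gradient(w: float, shard: List[Tuple[float, float]]) -> float:
    """Forward and backward pass for the model y = w * x on one shard of data."""
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


# Toy data generated from y = 2x, split into two shards (one per worker)
data = [(float(x), 2.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

server = ParameterServer(w=0.0, lr=0.01)
for _ in range(200):
    # In a real cluster these calls would run in parallel on separate machines
    grads = [worker_gradient(server.w, shard) for shard in shards]
    server.apply(grads)

print(round(server.w, 3))
```

The point of the sketch is the division of labor: each worker only sees its own shard and only computes gradients, while the single server object owns the parameters and decides how to update them.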

A DAG is a software data structure consisting of nodes (operations) and edges (data), where information flows in only a single direction along the edges (thus directed) and where there are no loops (hence acyclic).
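
To make the DAG idea concrete, here is a minimal sketch in plain Python that evaluates a tiny graph ending in a scalar loss. The graph contents (one sigmoid unit with a squared-error loss) and the dictionary layout are hypothetical choices made for illustration; the ordering relies on the standard library's graphlib module, which requires Python 3.9 or later.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node is an operation; its listed inputs are the edges carrying data into it.
ops = {
    "x": (lambda: 0.5, []),
    "w": (lambda: 2.0, []),
    "target": (lambda: 1.0, []),
    "mul": (lambda a, b: a * b, ["x", "w"]),
    "sigmoid": (lambda z: 1.0 / (1.0 + math.exp(-z)), ["mul"]),
    "loss": (lambda p, t: (p - t) ** 2, ["sigmoid", "target"]),
}

# TopologicalSorter takes {node: predecessors} and raises CycleError if there is a loop
sorter = TopologicalSorter({name: deps for name, (_, deps) in ops.items()})
order = list(sorter.static_order())

# Evaluate every node once all of its inputs are available
values = {}
for name in order:
    fn, deps = ops[name]
    values[name] = fn(*(values[d] for d in deps))

print(order[-1], values["loss"])
```

Because the graph is acyclic, a valid evaluation order always exists, and the loss node, which nothing else depends on, is necessarily evaluated last.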

While DistBelief allowed Google to productionize several large models, it had limitations:

First, the Python scripting interface was developed with a set of pre-defined layers corresponding to underlying implementations in C++; adding novel layer types required coding in C++, which represented a barrier to productivity.
Secondly, while the system was well adapted for training feed-forward networks using basic Stochastic Gradient Descent (SGD) (an algorithm we will describe in more detail in Chapter 3, Building Blocks of Deep Neural Networks) on large-scale data, it lacked flexibility for accommodating recurrent, reinforcement learning, or adversarial learning paradigms, the last of which is crucial to many of the algorithms we will implement in this book.
Finally, this system was difficult to scale down: running the same job on, for example, a desktop with GPUs as well as on a distributed environment with multiple cores per machine was not straightforward, and deployment also required a different technical stack.

Jointly, these considerations prompted the development of TensorFlow as a generic deep learning computational framework: one that allows scientists to flexibly experiment with new layer architectures or cutting-edge training paradigms, lets this experimentation run with the same tools on both a laptop (for early-stage work) and a computing cluster (to scale up more mature models), and eases the transition between research and development code by providing a common runtime for both.

Both libraries share the concept of the computation graph (networks represented as a graph of operations (nodes) and data (edges)) and a dataflow programming model (where data passes along the directed edges of a graph and has operations applied to it). Unlike DistBelief, however, TensorFlow was designed with the edges of the graph being tensors (n-dimensional arrays) and the nodes of the graph being atomic operations (addition, subtraction, convolution, or queues and other advanced operations) rather than fixed layer operations. This allows for much greater flexibility in defining new computations, and even allows for mutation and stateful updates (these being simply additional nodes in the graph).
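
As a rough illustration of nodes being atomic operations rather than fixed layers, the sketch below assembles a dense layer from three primitive TensorFlow 2 operations and then lists the operation type of every node in the traced graph. The function name and tensor shapes are arbitrary choices for this example, and the exact list of node types printed may vary between TensorFlow versions.

```python
import tensorflow as tf


@tf.function
def dense_sigmoid(x, w, b):
    # Three atomic operations chained together; no predefined "layer" is involved
    return tf.nn.sigmoid(tf.add(tf.matmul(x, w), b))


x = tf.ones((2, 4))   # batch of 2 inputs with 4 features each
w = tf.ones((4, 3))   # weights mapping 4 features to 3 outputs
b = tf.zeros((3,))    # bias for the 3 outputs

# Tracing the function builds the dataflow graph; collect the type of each node
graph = dense_sigmoid.get_concrete_function(x, w, b).graph
op_types = [op.type for op in graph.get_operations()]
print(op_types)

# The tensors flowing along the edges have ordinary shapes and dtypes
y = dense_sigmoid(x, w, b)
print(y.shape)
```

Any new layer type can be assembled in the same way from existing primitives in Python, which is the flexibility that the fixed C++ layers of DistBelief lacked.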

The dataflow graph in essence serves as a "placeholder" where data is slotted into defined variables and can be executed on single or multiple machines. TensorFlow optimizes the constructed dataflow graph in the C++ runtime upon execution, allowing optimization, for example, in issuing commands to the GPU. The different computations of the graph can also be executed across multiple machines and hardware, including CPUs, GPUs, and TPUs (custom tensor processing chips developed by Google and available in the Google Cloud computing environment)¹¹, as the same computations described at a high level in TensorFlow are implemented to execute on multiple backend systems.

Because the dataflow graph allows mutable state, there is no longer a need for a centralized parameter server as was the case for DistBelief (though TensorFlow can also be run in a distributed manner with a parameter server configuration), since nodes that hold state can execute the same operations as any other worker node. Further, control flow operations such as loops allow training on variable-length inputs, as in recurrent networks (see Chapter 3, Building Blocks of Deep Neural Networks). In the context of training neural networks, the gradients of each layer are simply represented as additional operations in the graph, allowing optimizer refinements such as velocity terms (as in the RMSProp or Adam optimizers, described in Chapter 3, Building Blocks of Deep Neural Networks) to be included using the same framework rather than by modifying the parameter server logic. In the context of distributed training, TensorFlow also has several checkpointing and redundancy mechanisms ("backup" workers in case of a single task failure) that make it suited to robust training in distributed environments.
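
The idea that gradients and optimizer state are just more operations on mutable variables can be sketched in TensorFlow 2 as follows. This is a hand-written momentum update on a toy one-parameter objective, chosen for illustration; it is not the RMSProp or Adam algorithm, and the learning rate and momentum values are arbitrary assumptions.

```python
import tensorflow as tf

w = tf.Variable(0.0)         # the parameter being trained
velocity = tf.Variable(0.0)  # optimizer state, stored like any other variable


@tf.function
def train_step():
    with tf.GradientTape() as tape:
        loss = tf.square(w - 3.0)            # toy objective with its minimum at w = 3
    grad = tape.gradient(loss, w)            # the gradient is computed by extra graph ops
    velocity.assign(0.9 * velocity + grad)   # stateful update of the velocity
    w.assign_sub(0.05 * velocity)            # stateful update of the weight
    return loss


for _ in range(200):
    loss = train_step()

final_w = float(w.numpy())
print(final_w)
```

Nothing outside the traced function needs to know about the velocity term: changing the update rule means changing a few operations in the graph, with no separate server logic to modify.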

TensorFlow 2.0

While representing operations in the dataflow graph as primitives allows flexibility in defining new layers within the Python client API, it can also result in a lot of "boilerplate" code and repetitive syntax. For this reason, the Keras API¹⁴ was developed to provide a higher-level abstraction: layers are represented using Python classes, while a particular runtime environment (such as TensorFlow or Theano) is a "backend" that executes the layer, just as the atomic TensorFlow operators can have different underlying implementations on CPUs, GPUs, or TPUs. Although it was developed as a framework-agnostic library, Keras has been included as part of TensorFlow's main release in version 2.0. For readability, we will implement most of our models in this book in Keras, reverting to the underlying TensorFlow 2.0 code where it is necessary to implement particular operations or highlight the underlying logic. Please see Table 2.3 for a comparison of how various neural network concepts are implemented at a low (TensorFlow) or high (Keras) level in these libraries.

Object	TensorFlow implementation	Keras implementation
Neural network layer	Tensor computation	Python layer classes
Gradient calculation	Graph runtime operator	Python optimizer class
Loss function	Tensor computation	Python loss function
Neural network model	Graph runtime session	Python model class instance

Table 2.3: TensorFlow and Keras comparison
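
The first row of Table 2.3 can be illustrated with a minimal sketch in which a Keras layer class wraps low-level tensor computations. The class name SimpleDense and the layer sizes are hypothetical, chosen only for this example; the pattern of defining build and call methods follows the standard way of subclassing tf.keras.layers.Layer.

```python
import tensorflow as tf


class SimpleDense(tf.keras.layers.Layer):
    """Hypothetical example layer: a dense layer with a sigmoid activation."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created lazily, once the input size is known
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        # The Keras class is a thin wrapper around plain tensor computations
        return tf.nn.sigmoid(tf.matmul(inputs, self.w) + self.b)


layer = SimpleDense(3)
out = layer(tf.ones((4, 8)))  # a batch of 4 inputs with 8 features each
print(out.shape, len(layer.trainable_weights))
```

The class bundles the weights and the computation, so the rest of the model code only has to deal with the layer object.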

To show the difference between the abstraction that Keras provides and the lower-level code of TensorFlow 1.0 in implementing basic neural network models, let's look at an example of writing a multilayer perceptron (see Chapter 3, Building Blocks of Deep Neural Networks) using both of these frameworks. In the first case, in TensorFlow 1.0, you can see that a lot of the code involves explicitly specifying variables, functions, and matrix operations, along with the gradient function and runtime session to compute the updates to the network.

This is a multilayer perceptron in TensorFlow 1.0¹⁵:

import numpy as np
import pandas as pd
import tensorflow as tf  # this listing requires the TensorFlow 1.x API

# The data is assumed here to be MNIST (784 pixels per image, 10 classes)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Placeholders are filled with data when the session runs
X = tf.placeholder(dtype=tf.float64)
Y = tf.placeholder(dtype=tf.float64)
num_hidden = 128
# Build a hidden layer
W_hidden = tf.Variable(np.random.randn(784, num_hidden))
b_hidden = tf.Variable(np.random.randn(num_hidden))
p_hidden = tf.nn.sigmoid(tf.add(tf.matmul(X, W_hidden), b_hidden))
# Build another hidden layer
W_hidden2 = tf.Variable(np.random.randn(num_hidden, num_hidden))
b_hidden2 = tf.Variable(np.random.randn(num_hidden))
p_hidden2 = tf.nn.sigmoid(tf.add(tf.matmul(p_hidden, W_hidden2), b_hidden2))
# Build the output layer
W_output = tf.Variable(np.random.randn(num_hidden, 10))
b_output = tf.Variable(np.random.randn(10))
p_output = tf.nn.softmax(tf.add(tf.matmul(p_hidden2, W_output),
           b_output))
loss = tf.reduce_mean(tf.losses.mean_squared_error(
        labels=Y, predictions=p_output))
# A rough proxy derived from the loss, not a true classification accuracy
accuracy = 1 - tf.sqrt(loss)
minimization_op = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
feed_dict = {
    X: x_train.reshape(-1, 784),
    Y: pd.get_dummies(y_train)  # one-hot encode the labels
}
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for step in range(10000):
        if step % 100 == 0:
            # Report progress every 100 steps
            J_value = session.run(loss, feed_dict)
            acc = session.run(accuracy, feed_dict)
            print("Step:", step, " Loss:", J_value, " Accuracy:", acc)
        # Run one optimizer update on every step, not only when logging
        session.run(minimization_op, feed_dict)
    pred00 = session.run([p_output], feed_dict={X: x_test.reshape(-1, 784)})

In contrast, the implementation of the multilayer perceptron in Keras is vastly simplified through the use of abstract concepts embodied in Python classes, such as layers, models, and optimizers. Underlying details of the computation are encapsulated in these classes, making the logic of the code more readable.

Note also that in TensorFlow 2.0 the notion of running sessions (lazy execution, in which the network is only computed when it is explicitly run inside a session) has been dropped in favor of eager execution, in which operations are evaluated as soon as they are called. The network thus behaves like any other Python object, with no need to explicitly create a session scope. The global namespace in which variables declared with tf.Variable() were tracked has also been replaced by Python's default garbage collection mechanism: a variable now lives only as long as a Python reference to it exists.

This is a multilayer perceptron in Keras¹⁵:

import pandas as pd
import tensorflow as tf

# The data is assumed here to be MNIST (784 pixels per image, 10 classes)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

l = tf.keras.layers
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),       # flattened 28x28 images
    l.Dense(128, activation='relu'),
    l.Dense(128, activation='relu'),
    l.Dense(10, activation='softmax')   # one probability per class
])
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()
# One-hot encode the labels to match the categorical cross-entropy loss
model.fit(x_train.reshape(-1, 784),
          pd.get_dummies(y_train).to_numpy(dtype='float32'),
          epochs=15, batch_size=128, verbose=1)
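
The eager execution behavior of TensorFlow 2 noted above can be seen in a few lines. This is a small sketch with arbitrary values; the function name add_one and the trace_modes list are hypothetical names used only to record what happens while the function is traced.

```python
import tensorflow as tf

# Eager execution: the addition is evaluated immediately, with no session
a = tf.constant([1.0, 2.0])
b = a + 1.0
eager_outside = tf.executing_eagerly()
result = b.numpy().tolist()
print(eager_outside, result)

trace_modes = []


@tf.function
def add_one(t):
    # This Python line runs only while the function is being traced into a graph
    trace_modes.append(tf.executing_eagerly())
    return t + 1.0


# The call looks like ordinary Python, but executes the traced graph
graph_result = add_one(a).numpy().tolist()
print(trace_modes, graph_result)
```

Decorating a function with tf.function recovers the graph-based execution of TensorFlow 1.0 where it is useful for performance, while the calling code stays free of sessions and placeholders.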

Now that we have covered some of the details of what the TensorFlow library is and why it is well-suited to the development of deep neural network models (including the generative models we will implement in this book), let's get started building up our research environment. While we could simply use a Python package manager such as pip to install TensorFlow on our laptop, we want to make sure our process is as robust and reproducible as possible – this will make it easier to package our code to run on different machines, or keep our computations consistent by specifying the exact versions of each Python library we use in an experiment. We will start by installing an Integrated Development Environment (IDE) that will make our research easier – VSCode.
