Deep Learning Quick Reference

By: Mike Bernico

Overview of this book

Deep learning has become a necessity for anyone entering the world of artificial intelligence. This book makes deep learning techniques more accessible, practical, and relevant to practicing data scientists. It moves deep learning from academia to the real world through practical examples. You will learn how TensorBoard is used to monitor the training of deep neural networks and how to solve binary classification problems using deep learning. You will then learn to optimize hyperparameters in your deep learning models. The book then takes you through the practical implementation of training CNNs, RNNs, and LSTMs with word embeddings and seq2seq models from scratch. Later, the book explores advanced topics such as using a Deep Q Network to solve an autonomous agent problem and using two adversarial networks to generate artificial images that appear real. For implementation purposes, we look at popular Python-based deep learning frameworks such as Keras and TensorFlow. Each chapter provides best practices and safe choices to help readers make the right decisions while training deep neural networks. By the end of this book, you will be able to solve real-world problems quickly with deep neural networks.

Deep learning frameworks

While it's most certainly possible to build and train deep neural networks from scratch using just Python and NumPy, that would take a great deal of time and code. It's far more practical, in almost every case, to use a deep learning framework.

Throughout this book we will be using TensorFlow and Keras to make developing deep neural networks much easier and faster.

What is TensorFlow?

TensorFlow is a library that can be used to quickly build deep neural networks. In TensorFlow, the mathematical operations that we've covered thus far are expressed as nodes in a graph. The edges between these nodes are tensors, or multidimensional data arrays. Given a neural network defined as a graph and a loss function, TensorFlow can automatically compute gradients for the network and optimize the graph to minimize the loss function.
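To make the graph idea concrete, here is a toy sketch in plain Python. It is not TensorFlow's implementation: the Node class and the add, mul, and backward helpers are hypothetical names invented for this illustration. The sketch builds a small graph for a squared-error loss, walks the graph backwards to compute gradients, and takes one gradient descent step.

```python
# Toy computation graph with reverse-mode differentiation.
# Illustration only: Node, add, mul and backward are made-up names,
# not part of TensorFlow.

class Node:
    """A value in the graph plus links to the nodes it was computed from."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent node, d(self)/d(parent))
        self.grad = 0.0         # d(output)/d(self), filled in by backward()


def add(a, b):
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before their consumers

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # consumers first, then their parents
        for parent, local_grad in node.parents:
            parent.grad += local_grad * node.grad


# Graph for loss = (w * x + b - y) ** 2
w, x, b, y = Node(2.0), Node(3.0), Node(1.0), Node(10.0)
prediction = add(mul(w, x), b)
error = add(prediction, Node(-y.value))
loss = mul(error, error)

backward(loss)

# One gradient descent step on the trainable values w and b
learning_rate = 0.01
new_w = w.value - learning_rate * w.grad
new_b = b.value - learning_rate * b.grad
print(loss.value, w.grad, b.grad)
```

With these inputs the error is -3, so the loss is 9.0 and the gradients are -18.0 for w and -6.0 for b, which agrees with the hand-derived values 2 * error * x and 2 * error. A framework performs the same bookkeeping for graphs with millions of parameters.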

TensorFlow was released as an open source project by Google in 2015. Since then, it has gained a very large following and enjoys a large user community. While TensorFlow provides APIs in Java, C++, Go, and Python, we will only be covering the Python API. We use the Python API in this book because it is both the most widely used and the one most commonly used to develop new models.

TensorFlow can greatly accelerate computation by performing those calculations on one or more Graphics Processing Units (GPUs). The acceleration that GPU computation provides has become a necessity in modern deep learning.

What is Keras?

While building deep neural networks in TensorFlow is far easier than doing it from scratch, TensorFlow is still a very low-level API. Keras is a high-level API that allows us to use TensorFlow (or alternatively Theano or Microsoft's CNTK) to rapidly build deep learning networks.

Models built in Keras and TensorFlow are portable and can be trained or served in native TensorFlow as well. Models constructed in TensorFlow can be loaded into Keras and used there as well.

Popular alternatives to TensorFlow

There are many other great deep learning frameworks out there. We chose Keras and TensorFlow primarily because of popularity, ease of use, availability for support, and readiness for production deployments. There are undoubtedly other worthy alternatives.

Some of my favorite alternatives to TensorFlow include:

Apache MXNet: A very high performance framework with a great new imperative interface called Gluon (https://mxnet.apache.org/)
PyTorch: A very new and promising framework originally developed by Facebook (http://pytorch.org/)
CNTK: Microsoft's deep learning framework that can also be used with Keras (https://www.microsoft.com/en-us/cognitive-toolkit/)

While I do strongly believe that Keras and TensorFlow are the correct choices for this book, I also want to acknowledge these great frameworks and the contributions to the field that each project has made.

GPU requirements for TensorFlow and Keras

For the remainder of the book, we will be using Keras and TensorFlow. Most of the examples we will be exploring require a GPU for acceleration. Most modern deep learning frameworks, including TensorFlow, use GPUs to greatly accelerate the vast number of calculations required during network training. Without a GPU, the training time of most of the models we discuss will be unreasonably long.

If you don't have a computer with a GPU installed, GPU-based compute instances can be rented by the second from a variety of cloud providers, including Amazon Web Services and Google Cloud Platform. For the examples in this book, we will be using a p2.xlarge instance in Amazon EC2 running Ubuntu Server 16.04. The p2.xlarge instance provides an NVIDIA Tesla K80 GPU with 2,496 CUDA cores, which will make running the models we show in this book much faster than what is achievable on even very high-end desktop computers.

Installing Nvidia CUDA Toolkit and cuDNN

Since you'll likely be using a cloud-based solution for your deep learning work, I've included instructions that will get you up and running fast on Ubuntu Linux, which is commonly available across cloud providers. It's also possible to install TensorFlow and Keras on Windows. As of TensorFlow v1.2, TensorFlow unfortunately does not support GPUs on OS X.

Before we can utilize the GPU, the NVIDIA CUDA Toolkit and cuDNN must be installed. We will be installing CUDA Toolkit 8.0 and cuDNN v6.0, which are recommended for use with TensorFlow v1.4. There is a good chance that a new version will be released before you finish reading this paragraph, so check www.tensorflow.org for the latest required versions.

We will start by installing the build-essential package on Ubuntu, which contains most of what we need to compile C++ programs. The code is given here:

sudo apt-get update
sudo apt-get install build-essential

Next, we can download and install the CUDA Toolkit. As previously mentioned, we will be installing version 8.0 and its associated patch. You can find the CUDA Toolkit that is right for you at https://developer.nvidia.com/cuda-zone.

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
sudo sh cuda_8.0.61_375.26_linux-run # Accept the EULA and choose defaults
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/patches/2/cuda_8.0.61.2_linux-run
sudo sh cuda_8.0.61.2_linux-run # Accept the EULA and choose defaults

The CUDA Toolkit should now be installed in /usr/local/cuda. You'll need to add a few environment variables so that TensorFlow can find it. Consider adding these environment variables to ~/.bash_profile so that they're set at every login, as shown in the following code:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME="/usr/local/cuda"
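If you want to confirm that a new shell actually picked up these variables, a small check along the following lines can help. This is only a sketch: check_cuda_env is a hypothetical helper written for this illustration, and it assumes the default /usr/local/cuda install path used above.

```python
import os


def check_cuda_env(env, cuda_home="/usr/local/cuda"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME is not set to " + cuda_home)
    lib_dir = cuda_home + "/lib64"
    # LD_LIBRARY_PATH is a colon-separated list of directories
    search_path = env.get("LD_LIBRARY_PATH", "").split(":")
    if lib_dir not in search_path:
        problems.append(lib_dir + " is missing from LD_LIBRARY_PATH")
    return problems


# Check the environment of the current process
for problem in check_cuda_env(os.environ):
    print("WARNING:", problem)
```

The function takes the environment as an argument rather than reading os.environ directly, so it can also be tried against a plain dictionary.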

At this point, you can test that everything is working by executing the following command: nvidia-smi. The output should look similar to this:

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   41C    P0    57W / 149W |      0MiB / 11439MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
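The same information can be read from a script by asking nvidia-smi for CSV output through its --query-gpu and --format options. The following is a minimal sketch: parse_gpu_csv and list_gpus are hypothetical helper names, and the choice of the name, driver_version, and memory.total fields is an assumption made for this example.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Turn 'name, driver, memory' CSV lines into a list of dicts."""
    gpus = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, driver, memory = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def list_gpus():
    """Query nvidia-smi; return [] if the tool is missing or reports an error."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True)
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


print(list_gpus())
```

An empty list means that either nvidia-smi is not on the PATH or the driver could not be queried, which is a hint to revisit the CUDA installation steps.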

Lastly, we need to install cuDNN, which is the NVIDIA CUDA Deep Neural Network library.

First, download cuDNN to your local computer. To do so, you will need to register as a developer in the NVIDIA Developer Network. You can find cuDNN on its homepage at https://developer.nvidia.com/cuDNN. Once you have downloaded it to your local computer, you can use scp to move it to your EC2 instance. While exact instructions will vary by cloud provider, you can find additional information about connecting to AWS EC2 via SSH/SCP at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html.

Once you've moved cuDNN to your EC2 instance, you can unpack the file using the following code:

tar -xzvf cudnn-8.0-linux-x64-v6.0.tgz

Finally, copy the unpacked files to their appropriate locations, using the following code:

sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/* /usr/local/cuda/lib64
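A quick way to confirm that the copy landed where expected is to look for the header and the library files. The sketch below assumes the /usr/local/cuda layout used in this section; find_cudnn is a hypothetical helper name, not part of any NVIDIA tooling.

```python
from pathlib import Path


def find_cudnn(cuda_root="/usr/local/cuda"):
    """Return (header_found, library_file_names) for a CUDA install directory."""
    root = Path(cuda_root)
    header_found = (root / "include" / "cudnn.h").is_file()
    lib_dir = root / "lib64"
    if lib_dir.is_dir():
        libraries = sorted(p.name for p in lib_dir.glob("libcudnn*"))
    else:
        libraries = []
    return header_found, libraries


header_found, libraries = find_cudnn()
print("cudnn.h present:", header_found)
print("cuDNN libraries:", libraries or "none found")
```

If the header is missing or no libcudnn files are listed, repeat the two cp commands above before moving on.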

It's unclear to me why CUDA and cuDNN are distributed separately and why cuDNN requires registration. The overly complicated download process and manual installation of cuDNN are really among the greatest mysteries in deep learning.

Installing Python

We will be using virtualenv to create an isolated Python virtual environment. While this isn't strictly necessary, it's an excellent practice. By doing so, we will keep all our Python libraries for this project in a separate isolated environment that won't interfere with the system Python installation. Additionally, virtualenv environments will make it easier to package and deploy our deep neural networks later on.

Let's start by installing Python, pip, and virtualenv using Ubuntu's apt package manager. The following is the code:

sudo apt-get install python3-pip python3-dev python-virtualenv

Now we can create a virtual environment for our work. We will be keeping all our virtual environment files in a folder called ~/deep-learn. You are free to choose any name you wish for this virtual environment. The following code shows how to create a virtual environment:

# --no-site-packages has been the default since virtualenv 1.7 and was later removed, so it is omitted
virtualenv -p python3 ~/deep-learn

If you're an experienced Python developer, you might have noticed that I've set up the environment to default to Python 3.x. That's most certainly not required, as both TensorFlow and Keras support Python 2.7. That said, I feel a moral obligation to the Python community to support modern versions of Python.

Now that the virtual environment has been created, you can activate it as follows:

$ source ~/deep-learn/bin/activate
(deep-learn)$ # notice the shell prompt changes to indicate the virtualenv

From this point on, you will need to activate the virtual environment you want to work in every time you log in. If you would like to always enter the virtual environment you just created, you can add the source command to ~/.bash_profile.
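If you are ever unsure whether the environment is active, the interpreter itself can tell you. The following sketch uses only the standard library; in_virtualenv is a hypothetical helper name chosen for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtualenv active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

When the environment created above is active, the reported prefix should point at the ~/deep-learn folder rather than at the system Python installation.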

Now that we've configured our virtual environment, we can add Python packages as required within it. To start, let's make sure we have the latest version of pip, the Python package manager:

pip install --upgrade pip

Lastly, I recommend installing IPython, which is an interactive Python shell that makes development much easier.

pip install ipython

And that's it. Now we're ready to install TensorFlow and Keras.

Installing TensorFlow and Keras

After everything we've just been through together, you'll be pleased to see how straightforward installing TensorFlow and Keras now is.

Let's start by installing TensorFlow, which can be done using the following code:

pip install --upgrade tensorflow-gpu

Be sure to pip install tensorflow-gpu. If you pip install tensorflow (without -gpu), you will install the CPU-only version.

Before we install Keras, let's test our TensorFlow installation. To do this, I'll be using some sample code from the TensorFlow website and the IPython interpreter.

Start the IPython interpreter by typing ipython at the bash prompt. Once IPython has started, let's attempt to import TensorFlow. The output should look like the following:

In [1]: import tensorflow as tf
In [2]:

If importing TensorFlow results in an error, troubleshoot the steps you have followed so far. When TensorFlow cannot be imported, the most common cause is that CUDA or cuDNN is not installed correctly.

Now that we've successfully installed TensorFlow, we will run a tiny bit of code in IPython that will verify we can run computations on the GPU:

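# TensorFlow 1.x API; assumes "import tensorflow as tf" has already been run
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# log_device_placement=True prints the device each operation is assigned to
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))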
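As a sanity check on the value that sess.run(c) should print, the same product can be worked out in plain Python, with no TensorFlow or GPU involved. The helper names reshape and matmul below are hypothetical, written only for this illustration.

```python
def reshape(values, rows, cols):
    """Lay a flat list out row by row, as shape=[rows, cols] does above."""
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a = reshape(values, 2, 3)   # [[1, 2, 3], [4, 5, 6]]
b = reshape(values, 3, 2)   # [[1, 2], [3, 4], [5, 6]]
c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

For example, the top-left entry is 1*1 + 2*3 + 3*5 = 22. The TensorFlow session above should print the same 2 x 2 matrix.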

If everything goes as we hope, we will see lots of indications that our GPU is being used. I have included some output here. Your output will likely be different based on your hardware, but you should see evidence similar to that shown here:

/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
I tensorflow/core/common_runtime/placer.cc:874] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
I tensorflow/core/common_runtime/placer.cc:874] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
I tensorflow/core/common_runtime/placer.cc:874] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[ 22. 28.]
 [ 49. 64.]]

In the preceding output, we can see that tensors a and b, as well as the matrix multiplication operation, were assigned to the GPU. If there were a problem with accessing the GPU, the output might look as follows:

I tensorflow/core/common_runtime/placer.cc:874] b_1: (Const)/job:localhost/replica:0/task:0/device:CPU:0
a_1: (Const): /job:localhost/replica:0/task:0/device:CPU:0
I tensorflow/core/common_runtime/placer.cc:874] a_1: (Const)/job:localhost/replica:0/task:0/device:CPU:0

Here, we can see that the tensors b_1 and a_1 were assigned to the CPU rather than the GPU. If this happens, there is a problem with your installation of TensorFlow, CUDA, or cuDNN.
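With larger graphs, scanning this log by eye gets tedious. The following sketch pulls the device assignments out of placement lines of the form shown above. The regular expression and the parse_placements name are assumptions made for this example, and the sample text mixes lines from both listings purely for illustration.

```python
import re

# Matches lines such as "MatMul: (MatMul): /job:.../device:GPU:0"
PLACEMENT = re.compile(r"^\s*(\w+): \((\w+)\): \S*/device:(CPU|GPU):(\d+)")


def parse_placements(log_text):
    """Map each op name to the device type ('CPU' or 'GPU') it was placed on."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT.match(line)
        if match:
            op_name, _op_type, device, _index = match.groups()
            placements[op_name] = device
    return placements


sample_log = """\
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
a_1: (Const): /job:localhost/replica:0/task:0/device:CPU:0
"""

placements = parse_placements(sample_log)
on_cpu = sorted(name for name, device in placements.items() if device == "CPU")
print(placements)
print("ops that fell back to the CPU:", on_cpu)
```

Any op name reported in the CPU list is a sign that TensorFlow could not reach the GPU for that operation.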

If you've made it this far, you have a working installation of TensorFlow. The only remaining task is to install Keras.

The installation of Keras can be done with the help of the following code:

pip install keras

And that's it! Now we're ready to build deep neural networks in Keras and TensorFlow.

This might be a great time to create a snapshot or even an AMI of your EC2 instance, so that you don't have to go through this installation again.
