Index
A
- AdaDelta / Optimization and other update rules
- Adagrad / Optimization and other update rules
- Adam / Optimization and other update rules
- AlphaGo / Q-learning
- analogical reasoning / Evaluating embeddings – analogical reasoning
- architecture
- designing, for model / Designing the architecture for the model
- vector representations, of words / Vector representations of words
- sentence representation, using bi-LSTM / Sentence representation using bi-LSTM
- outputting probabilities, with softmax classifier / Outputting probabilities with the softmax classifier
- artificial intelligence
- Association for Computational Linguistics (ACL) / Seq2seq for translation
- asynchronous gradient descent / Training stability
- attention mechanism
- differentiable / Differentiable mechanism of attention
- about / Differentiable mechanism of attention
- translations / Better translations with attention mechanism
- annotate images / Better annotate images with attention mechanism
- autoencoder / Deep belief nets
- automatic differentiation / Functions and automatic differentiation
B
- backpropagation / Backpropagation and stochastic gradient descent
- Backpropagation Through Time (BPTT) / Simple recurrent network
- Basic Linear Algebra Subprograms (BLAS) / Theano Op in Python for the GPU
- batch normalization / Batch normalization
- batch normalization layer / Batch normalization
- Bayesian network theory / Dropout for RNN
- beam search algorithm / Improving efficiency of sequence-to-sequence network
- broadcasting / Elementwise operators
C
- Character Error Rate (CER) / Metrics for natural language performance
- coalesced transpose
- via shared memory / Coalesced transpose via shared memory, NVIDIA parallel for all
- via NVIDIA parallel / Coalesced transpose via shared memory, NVIDIA parallel for all
- model conversions / Model conversions
- Conditional Random Fields (CRF) / Deconvolutions for images
- Continuous Bag of Words (CBOW) / Encoding and embedding
- Continuous Bag of Words model / Continuous Bag of Words model
- controller / Store and retrieve information in Neural Turing Machines
- Convolutional Neural Network (CNN) / Encoding and embedding
- convolutions / Convolutions and max layers
- cost function / Cost function and errors
- CUDA
- URL, for downloading / GPU drivers and libraries
D
- data augmentation / Data augmentation
- dataset / Dataset
- for natural language / A dataset for natural language
- character level / A dataset for natural language
- word level / A dataset for natural language
- DeConvNet / Deconvolutions for images
- deep belief nets / Deep belief nets
- deep belief network (DBN) / Deep belief nets
- Deeplearning.net Theano
- references / Related articles
- DeepMask network / Deconvolutions for images
- DeepMind algorithm / Q-learning
- Deep Q-network / Deep Q-network
- deep transition network / Deep approaches for RNN
- deep transition recurrent network / Deep transition recurrent network
- dense connections / Dense connections
- dimension manipulation operators / Dimension manipulation operators
- dropout / Dropout
E
- elementwise operators / Elementwise operators
- embedding / Encoding and embedding
- encoding / Encoding and embedding
- episodic memory
- with dynamic memory networks / Episodic memory with dynamic memory networks
- errors / Cost function and errors
- external memory bank / Store and retrieve information in Neural Turing Machines
F
- functions / Functions and automatic differentiation
G
- gated recurrent network / Gated recurrent network
- General Matrix to Matrix Multiplication (GEMM) / Theano Op in Python for the GPU
- General Matrix to Vector Multiplication (GEMV) / Theano Op in Python for the GPU
- generative adversarial networks (GANs)
- about / Generative adversarial networks
- improving / Improve GANs
- generative model
- about / Generative models
- Restricted Boltzmann Machine / Restricted Boltzmann Machines
- deep belief nets / Deep belief nets
- generative adversarial networks (GANs) / Generative adversarial networks
- global average pooling / Global average pooling
- graphs / Graphs and symbolic computing
- greedy approach / Deep Q-network
H
- highway networks design principle / Highway networks design principle
- host code / Theano Op in C for GPU
I
- identity connection / Residual connections
- identity connections / Highway networks design principle
- images
- deconvolutions / Deconvolutions for images
- Inceptionism / Deconvolutions for images
- Independent Component Analysis (ICA) / Visualizing the learned embeddings
- inference / Inference
- internal covariate shift / Batch normalization
- Intersection over Union (IOU) / Region-based localization networks
K
- Keras
- installing / Installing and configuring Keras
- configuring / Installing and configuring Keras
- programming / Programming with Keras
- SemEval 2013 dataset / SemEval 2013 dataset
- model, training / Compiling and training the model
- model, compiling / Compiling and training the model
- kernel / Theano Op in C for GPU
L
- Lasagne
- MNIST CNN model / MNIST CNN model with Lasagne
- Latent Semantic Analysis / Indexing (LSA / LSI) / Visualizing the learned embeddings
- layer input normalization / Batch normalization
- learned embeddings
- visualizing / Visualizing the learned embeddings
- linear algebra operators / Linear algebra operators
- Linear Discriminant Analysis (LDA) / Visualizing the learned embeddings
- localization network
- about / A localization network
- recurrent neural net, applied to images / Recurrent neural net applied to images
- Locally Linear Embedding (LLE) / Visualizing the learned embeddings
- Long Short-Term Memory (LSTM) / Sentence representation using bi-LSTM
- loops
- in symbolic computing / Loops in symbolic computing
- loss comparison
- training / Training loss comparison
- loss function
- classification / Classification loss function
- LSTM network / LSTM network
M
- max layers / Convolutions and max layers
- memory / Memory and variables
- memory networks
- about / Memory networks
- episodic memory, with dynamic memory networks / Episodic memory with dynamic memory networks
- MNIST CNN model
- with Lasagne / MNIST CNN model with Lasagne
- MNIST dataset / The MNIST dataset
- model
- training / Training the model
- compiling, in Keras / Compiling and training the model
- training, in Keras / Compiling and training the model
- evaluating / Evaluating the model
- loading / Saving and loading the model
- saving / Saving and loading the model
- example, executing / Running the example
- mode collapse / Improve GANs
- Modified National Institute of Standards and Technology (MNIST) / The MNIST dataset
- momentum / Optimization and other update rules
- Monte Carlo Tree Search (MCTS) / Q-learning
- multi-GPU / Multi-GPU
- multi-layer perceptron (MLP) / Multiple layer model
- Multidimensional Scaling (MDS) / Visualizing the learned embeddings
- multimodal deep learning / Multimodal deep learning
- multiple layer model / Multiple layer model
N
- natural image datasets
- about / Natural image datasets
- batch normalization / Batch normalization
- global average pooling / Global average pooling
- natural language performance
- metrics for / Metrics for natural language performance
- Natural Language Processing (NLP) / Sequence-to-sequence networks for natural language processing
- negative particles / Restricted Boltzmann Machines
- Nesterov Accelerated Gradient / Optimization and other update rules
- network input normalization / Batch normalization
- Neural Machine Translation (NMT) / Weight tying
- Neural Network Language Models (NNLM) / Weight tying
- Neural Turing Machines (NTM)
- retrieve information in / Store and retrieve information in Neural Turing Machines
- store information in / Store and retrieve information in Neural Turing Machines
- about / Store and retrieve information in Neural Turing Machines
O
- off-policy training / Training stability
- online training / Training stability
- OpenAI Gym
- about / Simulation environments
- URL / Simulation environments
- optimal state value function v(s) / Q-learning
- optimization / Optimization and other update rules
- out-of-vocabulary (OOV) / Preprocessing text data
P
- Part of Speech (POS) / Applications of RNN
- Platoon
- reference link / Multi-GPU
- policy gradients (PG)
- about / Policy gradients with REINFORCE algorithms
- with REINFORCE algorithms / Policy gradients with REINFORCE algorithms
- policy network / Policy gradients with REINFORCE algorithms
- positive and negative phases / Restricted Boltzmann Machines
- predictions
- example / Example of predictions
- Principal Component Analysis (PCA) / Visualizing the learned embeddings
Q
- Q-learning / Q-learning
- quantitative analysis / Evaluating embeddings – quantitative analysis
R
- recurrent highway networks (RHN) / Recurrent Highway Networks
- Recurrent Neural Network (RNN) / Encoding and embedding
- recurrent neural networks (RNN)
- need for / Need for RNN
- about / Need for RNN
- applications / Applications of RNN
- reduction operators / Reduction operators
- region-based localization networks / Region-based localization networks
- Region Proposal Network (RPN) / Region-based localization networks
- reinforcement learning tasks / Reinforcement learning tasks
- replay memory / Training stability
- residual block / Residual connections
- residual connections / Residual connections
- residuals / Residual connections
- Restricted Boltzmann Machine / Restricted Boltzmann Machines
- RMSProp / Optimization and other update rules
- RNN
- dropout / Dropout for RNN
- deep approaches / Deep approaches for RNN
S
- SegNet network / Deconvolutions for images
- semi-supervised learning / Semi-supervised learning
- sequence-to-sequence (Seq2seq) network
- for natural language processing / Sequence-to-sequence networks for natural language processing
- about / Sequence-to-sequence networks for natural language processing, Seq2seq for translation
- for translation / Seq2seq for translation
- for chatbots / Seq2seq for chatbots
- efficiency, improving / Improving efficiency of sequence-to-sequence network
- SharpMask / Deconvolutions for images
- simple recurrent network
- about / Simple recurrent network
- LSTM network / LSTM network
- gated recurrent network / Gated recurrent network
- simulation environments / Simulation environments
- single-layer linear model / Single-layer linear model
- Single Instruction Multiple Data (SIMD) / Theano Op in C for GPU
- spatial transformer networks (STN) / A localization network
- stability
- training / Training stability
- stacked recurrent networks / Stacked recurrent networks
- state-action value network / Deep Q-network
- state value network / Policy gradients with REINFORCE algorithms
- state values / Q-learning
- stochastic depth / Stochastic depth
- stochastic gradient descent (SGD) / Backpropagation and stochastic gradient descent, Optimization and other update rules
- Streaming Multiprocessors (SM) / Theano Op in C for GPU
- symbolic computing / Graphs and symbolic computing
- loops in / Loops in symbolic computing
T
- t-distributed Stochastic Neighbor Embedding (t-SNE) / Visualizing the learned embeddings
- Tensor Processing Units (TPU) / Model conversions
- tensors
- need for / The need for tensors
- about / Tensors
- operations on / Operations on tensors
- dimension manipulation operators / Dimension manipulation operators
- elementwise operators / Elementwise operators
- reduction operators / Reduction operators
- linear algebra operators / Linear algebra operators
- text data
- preprocessing / Preprocessing text data
- Theano
- installing / Installing and loading Theano
- loading / Installing and loading Theano
- conda package / Conda package and environment manager
- environment manager / Conda package and environment manager
- installing, on CPU / Installing and running Theano on CPU
- executing, on CPU / Installing and running Theano on CPU
- GPU drivers / GPU drivers and libraries
- GPU libraries / GPU drivers and libraries
- installing, on GPU / Installing and running Theano on GPU
- executing, on GPU / Installing and running Theano on GPU
- debugging / Configuration, profiling and debugging
- profiling / Configuration, profiling and debugging
- configuration / Configuration, profiling and debugging
- Theano Op
- in Python, for CPU / Theano Op in Python for CPU
- in Python, for GPU / Theano Op in Python for the GPU
- in C, for CPU / Theano Op in C for CPU
- in C for GPU / Theano Op in C for GPU
- TORCS
- URL / Simulation environments
- training program
- structure / Structure of a training program
- script environment, setting up / Structure of a training program
- data, loading / Structure of a training program
- data, preprocessing / Structure of a training program
- model, building / Structure of a training program
- training / Structure of a training program
U
- unsupervised learning
- with co-localization / Unsupervised learning with co-localization
- update rules / Optimization and other update rules
V
- validation dataset
- training / Training
- variables / Memory and variables
- variational RNN / Dropout for RNN
W
- weight tying (WT) / Weight tying
- word embeddings
- application / Application of word embeddings
- Word Error Rate (WER) / Metrics for natural language performance
Y
- You Only Look Once (YOLO) architecture / Region-based localization networks