Natural Language Processing with TensorFlow

By: Motaz Saad, Thushan Ganegedara

Overview of this book

Natural language processing (NLP) supplies the majority of data available to deep learning applications, while TensorFlow is the most important deep learning framework currently available. Natural Language Processing with TensorFlow brings TensorFlow and NLP together to give you invaluable tools for working with the immense volume of unstructured data in today’s data streams, and shows you how to apply these tools to specific NLP tasks. Thushan Ganegedara starts by giving you a grounding in NLP and TensorFlow basics. You'll then learn how to use Word2vec, including advanced extensions, to create word embeddings that turn sequences of words into vectors accessible to deep learning algorithms. Chapters on classical deep learning algorithms, like convolutional neural networks (CNNs) and recurrent neural networks (RNNs), demonstrate important NLP tasks such as sentence classification and language generation. You will learn how to apply high-performance RNN models, like long short-term memory (LSTM) cells, to NLP tasks. You will also explore neural machine translation and implement a neural machine translator. After reading this book, you will have an understanding of NLP and the skills to apply TensorFlow in deep learning NLP applications and to perform specific NLP tasks.
Table of Contents (16 chapters)
Natural Language Processing with TensorFlow
Contributors
Preface
Index

Index

A

  • Amazon Web Services (AWS)
    • URL / Installing TensorFlow
  • Anaconda
    • references / Installing Python and scikit-learn
  • Application Programming Interface (API)
    • reference / What is TensorFlow?
  • arguments, loss function
    • logits / Using the TensorFlow RNN API
    • targets / Using the TensorFlow RNN API
    • weights / Using the TensorFlow RNN API
    • average_across_timesteps / Using the TensorFlow RNN API
    • average_across_batch / Using the TensorFlow RNN API
  • Artificial General Intelligence (AGI)
    • about / Towards Artificial General Intelligence
    • MultiModel / One Model to Learn Them All
    • joint many-task model / A joint many-task model – growing a neural network for multiple NLP tasks
  • attention matrix
    • about / Visualizing attention for source and target sentences
  • attention mechanism
    • about / Attention
    • context vector bottleneck, breaking / Breaking the context vector bottleneck
    • implementation / The attention mechanism in detail, Implementing the attention mechanism
    • weights, defining / Defining weights
    • attention, computing / Computing attention
    • translation results / Some translation results – NMT with attention
    • attention, visualizing for source and target sentences / Visualizing attention for source and target sentences
  • Automatic Language Processing Advisory Committee (ALPAC)
    • about / Rule-based translation

B

  • Backpropagation Through Time (BPTT) / How LSTMs solve the vanishing gradient problem
    • about / Backpropagation Through Time
    • working / How backpropagation works
    • direct use, limitations / Why we cannot use BP directly for RNNs
    • RNNs, training / Backpropagation Through Time – training RNNs
  • Bag-of-Words (BOW) representation / Input representation
  • Bayes' rule / Bayes' rule
  • beam search
    • about / Beam search, Improving LSTMs – beam search
    • implementing / Implementing beam search
    • examples generated / Examples generated with beam search
  • bidirectional LSTMs (BiLSTM)
    • about / Bidirectional LSTMs (BiLSTM)
  • Bigger Analogy Test Set (BATS)
    • reference / Performance comparison
  • BLEU-4
    • about / BLEU-4 over time for our model
  • BLEU score
    • about / The BLEU score – evaluating the machine translation systems
    • modified precision / Modified precision
    • brevity penalty / Brevity penalty
    • calculating / The final BLEU score

C

  • captions
    • preparing, for feeding into LSTMs / Preparing captions for feeding into LSTMs
    • generating, for test images / Captions generated for test images
  • CBOW
    • comparing / Comparing the CBOW and its extensions
    • extending / More recent algorithms extending skip-gram and CBOW
  • CBOW (Unigram) / Comparing the CBOW and its extensions
  • CBOW (Unigram+Subsampling) / Comparing the CBOW and its extensions
  • Central Processing Units (CPUs) / History of deep learning
  • chatbot
    • about / Other applications of Seq2Seq models – chatbots
    • training / Training a chatbot
    • evaluating / Evaluating chatbots – Turing test
  • CNNs
    • used, for extracting image features / Extracting image features with CNNs
  • co-occurrence matrix / Co-occurrence matrix
  • comparison operations / Comparison operations
  • Compute Unified Device Architecture (CUDA) / The roadmap – beyond this chapter, What is TensorFlow?
  • concept / GloVe – Global Vectors representation
  • conditional probability / Conditional probability
  • Consensus-based Image Description Evaluation (CIDEr)
    • about / CIDEr
  • Continuous Bag-of-Words (CBOW) model / Generating text with Word2vec
    • about / Learning word embeddings
  • Continuous Bag-of-Words algorithm / The Continuous Bag-of-Words algorithm
    • implementing, in TensorFlow / Implementing CBOW in TensorFlow
  • continuous random variables / Continuous random variables
  • continuous window model / The continuous window model
  • Convolution Neural Network (CNN) / The current state of deep learning and NLP
  • Convolution Neural Networks (CNN)
    • about / Introducing Convolution Neural Networks, Understanding Convolution Neural Networks
    • fundamentals / CNN fundamentals
    • importance / The power of Convolution Neural Networks
    • filter size / Understanding Convolution Neural Networks
    • stride / Understanding Convolution Neural Networks
    • padding / Understanding Convolution Neural Networks
    • operation / Convolution operation
    • fully connected layers / Fully connected layers
    • summarizing / Putting everything together
    • used, for image classification on MNIST / Exercise – image classification on MNIST with CNN, About the data
    • MNIST dataset / About the data
    • implementing / Implementing the CNN
    • produced predictions / Analyzing the predictions produced with a CNN
    • used, for sentence classification / Using CNNs for sentence classification
  • Convolution Neural Networks (CNN) structure
    • about / CNN structure
    • data transformation / Data transformation
    • convolution operation / The convolution operation
  • Convolution Neural Networks (CNNs) / Document classification with Word2vec
  • convolution operation
    • about / Standard convolution operation
    • stride, using / Convolving with stride
    • padding, using / Convolving with padding
    • transposed convolution / Transposed convolution
  • current trends, in NLP
    • word embeddings / Word embeddings
    • Neural Machine Translation (NMT) / Neural Machine Translation (NMT)

D

  • data
    • preloading, as tensors / Preloading and storing data as tensors
    • storing, as tensors / Preloading and storing data as tensors
    • about / Our data
    • preprocessing / Preprocessing data
    • generating, for LSTMs / Generating data for LSTMs
  • data preparation, NMT system
    • about / Preparing data for the NMT system
    • training data / At training time
    • source sentence, reversing / Reversing the source sentence
    • testing time / At testing time
  • dataset
    • about / About the dataset
    • text snippet / About the dataset
  • data structures
    • scalar / Scalar
    • vectors / Vectors
    • matrix / Matrices
  • deconvolution / Transposed convolution
  • deep learning approach
    • to Natural Language Processing (NLP) / The deep learning approach to Natural Language Processing
    • history / History of deep learning
    • about / The current state of deep learning and NLP
  • diagonal matrix / Diagonal matrix
  • Dilated Recurrent Neural Networks (DRNNs)
    • about / Newer machine learning models, Dilated Recurrent Neural Networks (DRNNs)
  • discrete random variables / Discrete random variables
  • document classification, with Word2vec
    • about / Document classification with Word2vec
    • dataset / Dataset
  • documents
    • classifying, with word embeddings / Classifying documents with word embeddings
  • Dynamic-Series Time Structure (DSTS) / Detecting rumors in social media

E

  • embedded documents
    • document clustering / Document clustering and t-SNE visualization of embedded documents
    • t-SNE visualization / Document clustering and t-SNE visualization of embedded documents
  • ensemble embedding
    • about / Ensemble embedding
  • EOS / Preparing captions for feeding into LSTMs

F

  • feed-forward neural networks
    • problem / The problem with feed-forward neural networks
  • frame nodes
    • about / Language grounding
  • Fully Connected Neural Network (FCNN) / Understanding a simple deep model – a Fully-Connected Neural Network

G

  • Gated Recurrent Units (GRUs)
    • about / Gated Recurrent Units (GRUs)
    • review / Review
    • code / The code
    • example generated text / Example generated text
  • gather operations / Scatter and gather operations
  • Gaussian Integral
    • reference / The probability mass/density function
  • Generative Adversarial Models (GANs) / Hybrid MT models
  • Generative Adversarial Networks, for NLP
    • about / Generative Adversarial Networks for NLP
  • GloVe
    • about / GloVe – Global Vectors representation
    • example / Understanding GloVe
    • implementing / Implementing GloVe
  • GloVe word vectors
    • loading / Loading GloVe word vectors, Cleaning data
    • URL / Loading GloVe word vectors
  • Google analogy dataset
    • reference / Performance comparison
  • Google Cloud Platform (GCP)
    • URL / Installing TensorFlow
  • Google Neural Machine Translation (GNMT) system / Improving NMTs
  • Graphics Processing Units (GPUs) / History of deep learning
  • Graphical User Interface (GUI)
    • reference / Tour of WordNet
  • greedy sampling
    • about / Greedy sampling
  • Group Method of Data Handling (GMDH) / History of deep learning
  • GRUs
    • about / Gated Recurrent Units

H

  • Hidden Markov Model (HMM) / Example – generating football game summaries
  • hierarchical softmax / Hierarchical softmax
  • hierarchy
    • learning / Learning the hierarchy
    • initializing / Learning the hierarchy
    • WordNet, determining / Learning the hierarchy
  • history, machine translation (MT)
    • about / A brief historical tour of machine translation
    • rule-based translation / Rule-based translation
    • Statistical Machine Translation (SMT) / Statistical Machine Translation (SMT)
    • Neural Machine Translation (NMT) / Neural Machine Translation (NMT)
  • holonyms / Tour of WordNet
  • hypernyms / Tour of WordNet
  • hyperparameters
    • defining / Defining hyperparameters
    • num_nodes / Defining hyperparameters, Defining the encoder and the decoder
    • batch_size / Defining hyperparameters, Defining the encoder and the decoder
    • num_unrollings / Defining hyperparameters
    • dropout / Defining hyperparameters
    • dec_num_unrollings / Defining the encoder and the decoder
    • embedding_size / Defining the encoder and the decoder
  • hyponyms / Tour of WordNet

I

  • identity matrix / Identity matrix
  • ILSVRC ImageNet dataset
    • URL / Getting to know the data
    • about / ILSVRC ImageNet dataset
  • image caption generation
    • machine learning pipeline / The machine learning pipeline for image caption generation
  • image caption generation pipeline
    • about / The machine learning pipeline for image caption generation
  • image features
    • extracting, with CNNs / Extracting image features with CNNs
  • ImageNet Large Scale Visual Recognition Challenge (ILSVRC) / The power of Convolution Neural Networks
  • improved skip-gram algorithm
    • versus original skip-gram algorithm / Comparing the original skip-gram with the improved skip-gram
  • inferring
    • about / Inferring VGG-16
  • information technology (IT)
    • about / Topic embedding
  • input and output placeholders
    • is_train_text / Defining inputs and outputs
    • train_inputs / Defining inputs and outputs
    • train_labels / Defining inputs and outputs
  • input gate parameters
    • ix / Defining parameters
    • im / Defining parameters
    • ib / Defining parameters
  • inputs
    • about / Inputs, variables, outputs, and operations
    • defining / Defining inputs in TensorFlow
    • data, feeding with Python code / Feeding data with Python code
    • pipeline, building / Building an input pipeline
  • insertion phase
    • about / Statistical Machine Translation (SMT)

J

  • joint many-task model
    • about / A joint many-task model – growing a neural network for multiple NLP tasks, Third level – semantic-level tasks
    • word-based tasks / First level – word-based tasks
    • syntactic tasks / Second level – syntactic tasks
    • semantic-level tasks / Third level – semantic-level tasks
  • joint probability / Joint probability
  • Jupyter Notebook
    • URL, for installing / Installing Jupyter Notebook

K

  • K-means
    • documents, clustering / Implementation – clustering/classification of documents with K-means
    • documents, classifying / Implementation – clustering/classification of documents with K-means
  • Keras
    • about / Introduction to Keras

L

  • language grounding
    • about / Language grounding
  • Large Scale Visual Recognition Challenge (LSVRC) / History of deep learning
  • Latent Dirichlet Allocation (LDA)
    • about / Topic embedding
  • Latent Semantic Analysis (LSA) / GloVe – Global Vectors representation
  • learning model
    • optimizing / Optimizing the learning model
  • lemmas / Tour of WordNet
  • Long Short-Term Memory (LSTM) / Document classification with Word2vec
  • loss function / The loss function
    • formulating / Formulating a practical loss function
    • approximating / Efficiently approximating the loss function
  • LSTM-Word2vec
    • examples generated with / Examples generated with LSTM-Word2vec and beam search
  • LSTM cell
    • defining / Defining the LSTM
  • LSTM implementation
    • about / Implementing an LSTM
    • hyperparameters, defining / Defining hyperparameters
    • parameters, defining / Defining parameters
    • LSTM cell operations, defining / Defining an LSTM cell and its operations
    • inputs and labels, defining / Defining inputs and labels
    • sequential calculations, defining / Defining sequential calculations required to process sequential data
    • optimizer, defining / Defining the optimizer
    • predictions, making / Making predictions
    • perplexity, calculating / Calculating perplexity (loss)
    • states, resetting / Resetting states
    • greedy sampling, for breaking unimodality / Greedy sampling to break unimodality
    • new text, generating / Generating new text
    • example generated text / Example generated text
  • LSTMs
    • about / Understanding Long Short-Term Memory Networks, What is an LSTM?, LSTMs in more detail
    • cell state / What is an LSTM?
    • hidden state / What is an LSTM?
    • input gate / What is an LSTM?, LSTMs in more detail
    • forget gate / What is an LSTM?, LSTMs in more detail
    • output gate / What is an LSTM?
    • actual mechanism / LSTMs in more detail
    • exploring / LSTMs in more detail
    • write gate / LSTMs in more detail
    • output / LSTMs in more detail
    • comparing, with standard RNNs / How LSTMs differ from standard RNNs
    • vanishing gradient problem, solving / How LSTMs solve the vanishing gradient problem
    • improving / Improving LSTMs
    • greedy sampling / Greedy sampling
    • beam search / Beam search, Improving LSTMs – beam search
    • word vectors, using / Using word vectors
    • BiLSTM / Bidirectional LSTMs (BiLSTM)
    • variants / Other variants of LSTMs
    • comparing / Comparing LSTMs to LSTMs with peephole connections and GRUs
    • standard LSTM / Standard LSTM
    • Gated Recurrent Units (GRUs) / Gated Recurrent Units (GRUs)
    • with peepholes / LSTMs with peepholes
    • perplexity over time / Training and validation perplexities over time
    • beam search, implementing / Implementing beam search
    • text generation, with words / Improving LSTMs – generating text with words instead of n-grams
  • LSTMs, with peepholes
    • about / LSTMs with peepholes
    • review / Review
    • code / The code
    • example generated text / Example generated text

M

  • machine translation (MT)
    • about / Machine translation
    • history / A brief historical tour of machine translation
  • machine translation systems
    • evaluating / The BLEU score – evaluating the machine translation systems
  • many-to-many RNNs / Many-to-many RNNs
  • many-to-one RNNs / Many-to-one RNNs
  • marginal probability / Marginal probability
  • mathematical operations / Mathematical operations
  • Matplotlib
    • URL, for installing / Installing Python and scikit-learn
  • matrix
    • about / Matrices
    • indexing / Indexing of a matrix
    • identity matrix / Identity matrix
    • diagonal matrix / Diagonal matrix
    • tensors / Tensors
  • matrix operations
    • multiplication / Multiplication
    • element-wise multiplication / Element-wise multiplication
    • inverse / Inverse
    • matrix inverse, finding / Finding the matrix inverse – Singular Value Decomposition (SVD)
    • norms / Norms
    • determinant / Determinant
  • max pooling operation
    • about / Max pooling
    • stride, using / Max pooling with stride
  • meronyms / Tour of WordNet
  • Metric for Evaluation of Translation with Explicit Ordering (METEOR)
    • about / METEOR
  • MNIST dataset
    • reference / Implementing our first neural network
  • MS-COCO dataset
    • URL / Getting to know the data
    • about / The MS-COCO dataset
  • MultiModel
    • about / One Model to Learn Them All
    • convolutional block / One Model to Learn Them All
    • attention block / One Model to Learn Them All
    • mixture of experts block / One Model to Learn Them All
    • tasks, performing / One Model to Learn Them All
  • MultiWordNet (MWN) / Problems with WordNet

N

  • n-table
    • about / Statistical Machine Translation (SMT)
  • Natural Language Processing (NLP)
    • about / What is Natural Language Processing?, The current state of deep learning and NLP, The roadmap – beyond this chapter
    • tasks / Tasks of Natural Language Processing
    • Tokenization / Tasks of Natural Language Processing
    • Word-sense Disambiguation (WSD) / Tasks of Natural Language Processing
    • Named Entity Recognition (NER) / Tasks of Natural Language Processing
    • Part-of-Speech (PoS) tagging / Tasks of Natural Language Processing
    • Sentence/Synopsis classification / Tasks of Natural Language Processing
    • language generation / Tasks of Natural Language Processing
    • Question Answering (QA) / Tasks of Natural Language Processing
    • Machine Translation (MT) / Tasks of Natural Language Processing
    • traditional approach / The traditional approach to Natural Language Processing, Understanding the traditional approach
    • example / Example – generating football game summaries
    • preprocessing / Example – generating football game summaries
    • tokenization / Example – generating football game summaries
    • feature engineering / Example – generating football game summaries
    • bag-of-words / Example – generating football game summaries
    • n-gram / Example – generating football game summaries
    • traditional approach, drawbacks / Drawbacks of the traditional approach
    • deep learning approach / The deep learning approach to Natural Language Processing
  • negative sampling
    • unigram distribution, using for / Using the unigram distribution for negative sampling
  • Neural Machine Translation (NMT)
    • about / Neural Machine Translation (NMT), Understanding Neural Machine Translation, Neural Machine Translation (NMT)
    • intuition / Intuition behind NMT
    • architecture / NMT architecture
    • encoder / NMT architecture
    • decoder / NMT architecture
    • training / Training the NMT
    • inference, performing / Inference with NMT
    • attention mechanism, improving / Improving the attention mechanism
    • hybrid MT models / Hybrid MT models
  • neural network
    • implementing / Implementing our first neural network
    • data, preparing / Preparing the data
    • TensorFlow graph, defining / Defining the TensorFlow graph
    • executing / Running the neural network
    • word embeddings, learning / Learning the word embeddings with a neural network
  • neural network-related operations
    • about / Neural network-related operations
    • nonlinear activations / Nonlinear activations used by neural networks
    • convolution operation / The convolution operation
    • pooling operation / The pooling operation
    • loss, defining / Defining loss
    • neural networks, optimization / Optimization of neural networks
    • control flow operations / The control flow operations
  • newer machine learning models
    • about / Newer machine learning models
    • Phased LSTM / Phased LSTM
    • Dilated Recurrent Neural Networks (DRNNs) / Dilated Recurrent Neural Networks (DRNNs)
  • NLP
    • current trends / Current trends in NLP
  • NLP, for social media
    • about / NLP for social media
    • rumors, detecting in social media / Detecting rumors in social media
    • emotions, detecting in social media / Detecting emotions in social media
    • political framing, analyzing in tweets / Analyzing political framing in tweets
  • NLP, with computer vision
    • combining / Combining NLP with computer vision
    • Visual Question Answering (VQA) / Visual Question Answering (VQA)
    • caption generation for images, with attention / Caption generation for images with attention
  • NLTK
    • reference / Tour of WordNet
  • NMT, jointly with word embeddings
    • training / Training an NMT jointly with word embeddings
    • matchings between dataset vocabulary and pretrained embeddings, maximizing / Maximizing matchings between the dataset vocabulary and the pretrained embeddings
    • embeddings layer, defining as TensorFlow variable / Defining the embeddings layer as a TensorFlow variable
  • NMT architecture
    • about / NMT architecture
    • embedding layer / The embedding layer
    • encoder / The encoder
    • context vector / The context vector
    • decoder / The decoder
  • NMT implementation
    • performing, from scratch / Implementing an NMT from scratch – a German to English translator
    • word embeddings / Learning word embeddings
    • encoder, defining / Defining the encoder and the decoder
    • decoder, defining / Defining the encoder and the decoder
    • end-to-end output calculation, defining / Defining the end-to-end output calculation
    • translation results / Some translation results
  • NMTs, improving
    • about / Improving NMTs
    • teacher forcing / Teacher forcing
    • deep LSTMs / Deep LSTMs
  • NMT system
    • data, preparing / Preparing data for the NMT system
  • node / TensorFlow architecture – what happens when you execute the client?
  • Noise-Contrastive Estimation (NCE) / Negative sampling of the softmax layer

O

  • object-pair nodes
    • about / Language grounding
  • one-hot encoded representation / One-hot encoded representation
  • one-hot encoding / Classical approaches to learning word representation
  • one-to-many RNNs / One-to-many RNNs
  • one-to-one RNNs / One-to-one RNNs
  • operations
    • about / Inputs, variables, outputs, and operations
    • defining / Defining TensorFlow operations
    • comparison operations / Comparison operations
    • mathematical operations / Mathematical operations
    • scatter operations / Scatter and gather operations
    • gather operations / Scatter and gather operations
    • neural network-related operations / Neural network-related operations
  • original skip-gram algorithm
    • about / The original skip-gram algorithm
    • implementing / Implementing the original skip-gram algorithm
    • versus improved skip-gram algorithm / Comparing the original skip-gram with the improved skip-gram
  • outliers
    • inspecting / Inspecting several outliers
  • outputs
    • about / Inputs, variables, outputs, and operations
    • defining / Defining TensorFlow outputs

P

  • parameters
    • defining / Defining parameters
  • parameters, TensorFlow RNN API
    • cell / Using the TensorFlow RNN API
    • input_keep_prob / Using the TensorFlow RNN API
    • output_keep_prob / Using the TensorFlow RNN API
    • state_keep_prob / Using the TensorFlow RNN API
    • variational_recurrent / Using the TensorFlow RNN API
  • peephole connections
    • about / Peephole connections
  • perplexity / Perplexity – measuring the quality of the text result
  • perplexity over time
    • about / Training and validation perplexities over time
  • Phased LSTM
    • about / Phased LSTM
  • placeholder / Feeding data with Python code
  • pooling operation
    • about / Pooling operation
    • max pooling / Max pooling
    • average pooling / Average pooling
  • pretrained embeddings, using with TensorFlow RNN API
    • about / Using pretrained embeddings with TensorFlow RNN API
    • pretrained embedding layer, defining / Defining the pretrained embedding layer and the adaptation layer
    • adaptation layer, defining / Defining the pretrained embedding layer and the adaptation layer
    • LSTM cell, defining / Defining the LSTM cell and softmax layer
    • softmax layer, defining / Defining the LSTM cell and softmax layer
    • inputs and outputs, defining / Defining inputs and outputs
    • images and text, processing differently / Processing images and text differently
    • LSTM output calculation, defining / Defining the LSTM output calculation
    • logits and predictions, defining / Defining the logits and predictions
    • sequence loss, defining / Defining the sequence loss
    • optimizer, defining / Defining the optimizer
  • pretrained GloVe word vectors
    • TensorFlow RNN API, using with / Using TensorFlow RNN API with pretrained GloVe word vectors
  • pretrained models / Extracting image features with CNNs
  • Principal Component Analysis (PCA) / Finding the matrix inverse – Singular Value Decomposition (SVD)
  • probabilistic word embedding
    • about / Probabilistic word embedding
  • probability
    • about / Probability
    • random variables / Random variables
    • discrete random variables / Discrete random variables
    • continuous random variables / Continuous random variables
    • mass/density function / The probability mass/density function
    • conditional probability / Conditional probability
    • joint probability / Joint probability
    • marginal probability / Marginal probability
    • Bayes' rule / Bayes' rule
  • probability density function (PDF) / The probability mass/density function
  • probability mass function (PMF) / The probability mass/density function
  • PSDVec
    • about / Probabilistic word embedding

R

  • r-table
    • about / Statistical Machine Translation (SMT)
  • random variables / Random variables
  • raw text
    • to structured data / From raw text to structured data
  • Rectified Linear Units (ReLUs) / History of deep learning
  • Recurrent Neural Network (RNN) / The current state of deep learning and NLP
  • Recurrent Neural Networks (RNNs)
    • about / Understanding Recurrent Neural Networks
    • modeling / Modeling with Recurrent Neural Networks
    • technical description / Technical description of a Recurrent Neural Network
    • applications / Applications of RNNs
    • used, for text generation / Generating text with RNNs
    • text results output, evaluating / Evaluating text results output from the RNN
    • with Context Features / Recurrent Neural Networks with Context Features – RNNs with longer memory
  • Recurrent Neural Networks (RNNs), applications
    • one-to-one RNNs / One-to-one RNNs
    • one-to-many RNNs / One-to-many RNNs
    • many-to-one RNNs / Many-to-one RNNs
    • many-to-many RNNs / Many-to-many RNNs
  • region embedding
    • about / Region embedding
    • input representation / Input representation
    • learning / Learning region embeddings
    • implementing / Implementation – region embeddings
    • classification accuracy / Classification accuracy
  • reinforcement learning (RL)
    • about / Reinforcement learning
    • unique language for communication, teaching to agents / Teaching agents to communicate using their own language
    • dialogue agents / Dialogue agents with reinforcement learning
  • reordering phase
    • about / Statistical Machine Translation (SMT)
  • research fields
    • penetration into / Penetration into other research fields
    • NLP, combining with computer vision / Combining NLP with computer vision
    • reinforcement learning / Reinforcement learning
    • Generative Adversarial Networks / Generative Adversarial Networks for NLP
  • RNNs with Context Features (RNN-CF)
    • about / Evaluating text results output from the RNN
    • technical description / Technical description of the RNN-CF
    • implementing / Implementing the RNN-CF
    • hyperparameters, defining / Defining the RNN-CF hyperparameters
    • weights, defining / Defining weights of the RNN-CF, Variables and operations for maintaining hidden and context states
    • output, calculating / Calculating output
    • validation output, calculating / Calculating validation output
    • gradients, optimizing / Computing the gradients and optimizing
    • gradients, calculating / Computing the gradients and optimizing
    • text generated / Text generated with the RNN-CF
  • rule-based translation
    • about / Rule-based translation

S

  • sarcasm
    • about / Detecting sarcasm
    • detecting / Detecting sarcasm
  • scalar / Scalar
  • scatter operation / Scatter and gather operations
  • scoping / Reusing variables with scoping
  • sentence classification, with CNN
    • about / Using CNNs for sentence classification
    • pooling over time / Pooling over time
    • implementation / Implementation – sentence classification with CNNs
  • Seq2Seq models
    • chatbots / Other applications of Seq2Seq models – chatbots
  • sequence one-hot-encoded vector / Input representation
  • Singular Value Decomposition (SVD) / Finding the matrix inverse – Singular Value Decomposition (SVD)
  • skimming text, with LSTMs
    • about / Skimming text with LSTMs
  • skip-gram, versus CBOW
    • about / Comparing skip-gram with CBOW, Which is the winner, skip-gram or CBOW?
    • performance comparison / Performance comparison
  • skip-gram algorithm / The skip-gram algorithm
    • raw text, to structured data / From raw text to structured data
    • word embeddings, learning with neural network / Learning the word embeddings with a neural network
    • implementing, with TensorFlow / Implementing skip-gram with TensorFlow
    • implementing / Implementing the original skip-gram algorithm
    • extending / More recent algorithms extending skip-gram and CBOW
    • limitation / A limitation of the skip-gram algorithm
  • softmax layer
    • negative sampling / Negative sampling of the softmax layer
  • SOS / Preparing captions for feeding into LSTMs
  • standard LSTMs
    • about / Standard LSTM
    • reviewing / Review
    • example generated text / Example generated text
  • Statistical Machine Translation (SMT)
    • about / Statistical Machine Translation (SMT)
  • structured data
    • from raw text / From raw text to structured data
  • structured skip-gram algorithm / The structured skip-gram algorithm
  • subsampling
    • about / Subsampling – probabilistically ignoring the common words
    • implementing / Implementing subsampling
  • synset / Tour of WordNet

T

  • t-Distributed Stochastic Neighbor Embedding (t-SNE) / Performance comparison
  • tasks emerging
    • about / New tasks emerging
    • sarcasm, detecting / Detecting sarcasm
    • language grounding / Language grounding
    • skimming text, with LSTMs / Skimming text with LSTMs
  • teacher forcing
    • about / Teacher forcing
  • technical tools
    • about / Introduction to the technical tools
    • describing / Description of the tools
    • Python, installing / Installing Python and scikit-learn
    • scikit-learn, installing / Installing Python and scikit-learn
    • Jupyter Notebook, installing / Installing Jupyter Notebook
    • TensorFlow, installing / Installing TensorFlow
  • tensor / Getting started with TensorFlow, Tensors
  • TensorBoard
    • word embeddings, visualizing / Visualizing word embeddings with TensorBoard
    • starting with / Starting TensorBoard
    • word embeddings, saving / Saving word embeddings and visualizing via TensorBoard
    • visualizing / Saving word embeddings and visualizing via TensorBoard
  • TensorFlow
    • URL, for installing / Installing TensorFlow
    • about / What is TensorFlow?
    • reference / What is TensorFlow?
    • using / Getting started with TensorFlow, Building an input pipeline
    • architecture / TensorFlow architecture – what happens when you execute the client?
    • architecture, reference / TensorFlow architecture – what happens when you execute the client?
    • Cafe Le TensorFlow / Cafe Le TensorFlow – understanding TensorFlow with an analogy
    • Continuous Bag-Of-Words algorithm, implementing / Implementing CBOW in TensorFlow
  • TensorFlow client / TensorFlow client in detail
  • TensorFlow implementation
    • URL / One Model to Learn Them All
  • TensorFlow placeholders
    • enc_train_inputs / Defining the encoder and the decoder
    • dec_train_inputs / Defining the encoder and the decoder
    • dec_train_labels / Defining the encoder and the decoder
    • dec_train_masks / Defining the encoder and the decoder
  • TensorFlow Research Cloud (TFRC)
    • URL / Installing TensorFlow
  • TensorFlow RNN API
    • using / Using the TensorFlow RNN API
    • using, with pretrained GloVe word vectors / Using TensorFlow RNN API with pretrained GloVe word vectors
    • pretrained embeddings, using with / Using pretrained embeddings with TensorFlow RNN API
  • TensorFlow seq2seq library
    • about / Introduction to the TensorFlow seq2seq library
    • embeddings, defining for encoder and decoder / Defining embeddings for the encoder and decoder
    • encoder, defining / Defining the encoder
    • decoder, defining / Defining the decoder
  • Term Frequency-Inverse Document Frequency (TF-IDF) / Classical approaches to learning word representation
  • text generation, with RNNs
    • about / Generating text with RNNs, Defining hyperparameters
    • inputs over time, unrolling for Truncated BPTT / Unrolling the inputs over time for Truncated BPTT
    • validation dataset, defining / Defining the validation dataset
    • weights and biases, defining / Defining weights and biases
    • state persisting variables, defining / Defining state persisting variables
    • hidden states and outputs, calculating with unrolled inputs / Calculating the hidden states and outputs with unrolled inputs
    • loss, calculating / Calculating the loss
    • validation output, calculating / Calculating validation output
    • gradients, calculating / Calculating gradients and optimizing
    • optimizing / Calculating gradients and optimizing
    • generated chunk of text, outputting / Outputting a freshly generated chunk of text
  • text generation, with words in LSTMs
    • about / Improving LSTMs – generating text with words instead of n-grams
    • curse of dimensionality / The curse of dimensionality
    • Word2vec / Word2vec to the rescue
    • text, generating with Word2vec / Generating text with Word2vec
    • perplexity over time / Perplexity over time
  • text result
    • output, evaluating from RNN / Evaluating text results output from the RNN
    • quality, measuring / Perplexity – measuring the quality of the text result
  • TF-IDF method / The TF-IDF method
  • Topical Word Embeddings (TWE)
    • about / Topic embedding
  • topic embedding
    • about / Topic embedding
  • translation phase
    • about / Statistical Machine Translation (SMT)
  • transpose
    • about / Transpose
    • example / Transpose
  • transposed convolution / Transposed convolution
  • Truncated Backpropagation Through Time (TBPTT)
    • about / Backpropagation Through Time – training RNNs
    • RNNs, training / Truncated BPTT – training RNNs efficiently
    • limitations / Limitations of BPTT – vanishing and exploding gradients
    • exploding gradient / Limitations of BPTT – vanishing and exploding gradients
    • vanishing gradient / Limitations of BPTT – vanishing and exploding gradients
  • Turing test
    • about / Evaluating chatbots – Turing test
  • tv-embedding
    • about / Region embedding

U

  • unigram-based negative sampling
    • implementing / Implementing unigram-based negative sampling
  • unigram distribution
    • using, for negative sampling / Using the unigram distribution for negative sampling

V

  • vanishing gradients phenomenon / History of deep learning
  • variables
    • about / Inputs, variables, outputs, and operations
    • defining / Defining variables in TensorFlow
    • reusing, with scoping / Reusing variables with scoping
  • variants, LSTMs
    • peephole connections / Peephole connections
    • GRUs / Gated Recurrent Units
  • variational inference
    • about / Probabilistic word embedding
  • vectors / Vectors
  • velocity term / Limitations of BPTT – vanishing and exploding gradients
  • VGG-16
    • predicting with / Predicting class probabilities with VGG-16
  • VGG-16 inferring
    • about / Inferring VGG-16
  • VGG CNN
    • URL / Extracting image features with CNNs
  • Virtual Assistants (VAs)
    • about / What is Natural Language Processing?
  • Visual Question Answering (VQA)
    • about / Visual Question Answering (VQA)

W

  • weights loading, CNN
    • implementation / Implementation – loading weights and inferencing with VGG-
    • variables, building / Building and updating variables
    • variables, updating / Building and updating variables
    • inputs, preprocessing / Preprocessing inputs
    • vectorized representations of images, extracting / Extracting vectorized representations of images
  • whitening / Preparing the data
  • Word2vec
    • about / Word2vec – a neural network-based approach to learning word representation, Word2vec to the rescue
    • exercise / Exercise: is queen = king – he + she?
    • loss function, designing for learning word embeddings / Designing a loss function for learning word embeddings
    • document classification / Document classification with Word2vec
    • text, generating with / Generating text with Word2vec
  • word alignment problem / Tasks of Natural Language Processing
  • word embeddings
    • learning, with neural network / Learning the word embeddings with a neural network
    • documents, classifying with / Classifying documents with word embeddings
    • learning / Implementation – learning word embeddings
    • to document embeddings / Implementation – word embeddings to document embeddings
    • about / Learning word embeddings, Word embeddings
    • region embedding / Region embedding
    • probabilistic word embedding / Probabilistic word embedding
    • ensemble embedding / Ensemble embedding
    • visualizing, with TensorBoard / Visualizing word embeddings with TensorBoard
  • word embeddings algorithms
    • about / Extensions to the word embeddings algorithms
  • word meaning / What is a word representation or meaning?
  • WordNet
    • about / WordNet – using an external lexical knowledge base for learning word representations, Tour of WordNet
    • reference / Tour of WordNet
    • issues / Problems with WordNet
  • word representation / What is a word representation or meaning?
    • learning, classical approaches / Classical approaches to learning word representation
    • one-hot encoded representation / One-hot encoded representation
    • TF-IDF method / The TF-IDF method
    • co-occurrence matrix / Co-occurrence matrix
  • word vectors
    • using / Using word vectors

X

  • Xavier initialization / Defining parameters, Defining the encoder and the decoder