# Index

## A

- Amazon Web Services (AWS)
- URL / Installing TensorFlow

- Anaconda
- references / Installing Python and scikit-learn

- Application Programming Interface (API)
- reference / What is TensorFlow?

- arguments, loss function
- logits / Using the TensorFlow RNN API
- targets / Using the TensorFlow RNN API
- weights / Using the TensorFlow RNN API
- average_across_timesteps / Using the TensorFlow RNN API
- average_across_batch / Using the TensorFlow RNN API

- Artificial General Intelligence (AGI)
- about / Towards Artificial General Intelligence
- MultiModel / One Model to Learn Them All
- joint many-task model / A joint many-task model – growing a neural network for multiple NLP tasks

- attention matrix
- about / Visualizing attention for source and target sentences

- attention mechanism
- about / Attention
- context vector bottleneck, breaking / Breaking the context vector bottleneck
- implementation / The attention mechanism in detail, Implementing the attention mechanism
- weights, defining / Defining weights
- attention, computing / Computing attention
- translation results / Some translation results – NMT with attention
- attention, visualizing for source and target sentences / Visualizing attention for source and target sentences

- Automatic Language Processing Advisory Committee (ALPAC)
- about / Rule-based translation

## B

- Backpropagation Through Time (BPTT)
- about / Backpropagation Through Time
- working / How backpropagation works
- direct use, limitations / Why we cannot use BP directly for RNNs
- RNNs, training / Backpropagation Through Time – training RNNs

- Bag-of-Words (BOW) representation / Input representation
- Bayes' rule / Bayes' rule
- beam search
- about / Beam search, Improving LSTMs – beam search
- implementing / Implementing beam search
- examples generated / Examples generated with beam search

- bidirectional LSTMs (BiLSTM)
- about / Bidirectional LSTMs (BiLSTM)

- Bigger Analogy Test Set (BATS)
- reference / Performance comparison

- BLEU-4
- about / BLEU-4 over time for our model

- BLEU score
- about / The BLEU score – evaluating the machine translation systems
- modified precision / Modified precision
- brevity penalty / Brevity penalty
- calculating / The final BLEU score

## C

- captions
- preparing, for feeding into LSTMs / Preparing captions for feeding into LSTMs
- generating, for test images / Captions generated for test images

- CBOW
- comparing / Comparing the CBOW and its extensions
- extending / More recent algorithms extending skip-gram and CBOW

- CBOW (Unigram) / Comparing the CBOW and its extensions
- CBOW (Unigram+Subsampling) / Comparing the CBOW and its extensions
- Central Processing Units (CPUs) / History of deep learning
- chatbot
- about / Other applications of Seq2Seq models – chatbots
- training / Training a chatbot
- evaluating / Evaluating chatbots – Turing test

- CNNs
- used, for extracting image features / Extracting image features with CNNs

- co-occurrence matrix / Co-occurrence matrix
- comparison operations / Comparison operations
- Compute Unified Device Architecture (CUDA) / The roadmap – beyond this chapter, What is TensorFlow?
- concept / GloVe – Global Vectors representation
- conditional probability / Conditional probability
- Consensus-based Image Description Evaluation (CIDEr)
- about / CIDEr

- Continuous Bag-of-Words (CBOW) model / Generating text with Word2vec
- about / Learning word embeddings

- Continuous Bag-of-Words algorithm / The Continuous Bag-of-Words algorithm
- implementing, in TensorFlow / Implementing CBOW in TensorFlow

- continuous random variables / Continuous random variables
- continuous window model / The continuous window model
- Convolution Neural Network (CNN) / The current state of deep learning and NLP
- Convolution Neural Networks (CNN)
- about / Introducing Convolution Neural Networks, Understanding Convolution Neural Networks
- fundamentals / CNN fundamentals
- importance / The power of Convolution Neural Networks
- filter size / Understanding Convolution Neural Networks
- stride / Understanding Convolution Neural Networks
- padding / Understanding Convolution Neural Networks
- operation / Convolution operation
- fully connected layers / Fully connected layers
- summarizing / Putting everything together
- used, for image classification on MNIST / Exercise – image classification on MNIST with CNN, About the data
- MNIST dataset / About the data
- implementing / Implementing the CNN
- produced predictions / Analyzing the predictions produced with a CNN
- used, for sentence classification / Using CNNs for sentence classification

- Convolution Neural Networks (CNN) structure
- about / CNN structure
- data transformation / Data transformation
- convolution operation / The convolution operation

- Convolution Neural Networks (CNNs) / Document classification with Word2vec
- convolution operation
- about / Standard convolution operation
- stride, using / Convolving with stride
- padding, using / Convolving with padding
- transposed convolution / Transposed convolution

- current trends, in NLP
- word embeddings / Word embeddings
- Neural Machine Translation (NMT) / Neural Machine Translation (NMT)

## D

- data
- preloading, as tensors / Preloading and storing data as tensors
- storing, as tensors / Preloading and storing data as tensors
- about / Our data
- preprocessing / Preprocessing data
- generating, for LSTMs / Generating data for LSTMs

- data preparation, NMT system
- about / Preparing data for the NMT system
- training data / At training time
- source sentence, reversing / Reversing the source sentence
- testing time / At testing time

- dataset
- about / About the dataset
- text snippet / About the dataset

- data structures
- scalar / Scalar
- vectors / Vectors
- matrix / Matrices

- deconvolution / Transposed convolution
- deep learning approach
- to Natural Language Processing (NLP) / The deep learning approach to Natural Language Processing
- history / History of deep learning
- about / The current state of deep learning and NLP

- diagonal matrix / Diagonal matrix
- Dilated Recurrent Neural Networks (DRNNs)
- about / Newer machine learning models, Dilated Recurrent Neural Networks (DRNNs)

- discrete random variables / Discrete random variables
- document classification, with Word2vec
- about / Document classification with Word2vec
- dataset / Dataset

- documents
- classifying, with word embeddings / Classifying documents with word embeddings

- Dynamic-Series Time Structure (DSTS) / Detecting rumors in social media

## E

- embedded documents
- document clustering / Document clustering and t-SNE visualization of embedded documents
- t-SNE visualization / Document clustering and t-SNE visualization of embedded documents

- ensemble embedding
- about / Ensemble embedding

- EOS / Preparing captions for feeding into LSTMs

## F

- feed-forward neural networks
- problem / The problem with feed-forward neural networks

- frame nodes
- about / Language grounding

- Fully Connected Neural Network (FCNN) / Understanding a simple deep model – a Fully-Connected Neural Network

## G

- Gated Recurrent Units (GRUs)
- about / Gated Recurrent Units (GRUs)
- review / Review
- code / The code
- example generated text / Example generated text

- gather operations / Scatter and gather operations
- Gaussian Integral
- reference / The probability mass/density function

- Generative Adversarial Models (GANs) / Hybrid MT models
- Generative Adversarial Networks, for NLP
- about / Generative Adversarial Networks for NLP

- GloVe
- about / GloVe – Global Vectors representation
- example / Understanding GloVe
- implementing / Implementing GloVe

- GloVe word vectors
- loading / Loading GloVe word vectors, Cleaning data
- URL / Loading GloVe word vectors

- Google analogy dataset
- reference / Performance comparison

- Google Cloud Platform (GCP)
- URL / Installing TensorFlow

- Google Neural Machine Translation (GNMT) system / Improving NMTs
- Graphics Processing Units (GPUs) / History of deep learning
- Graphical User Interface (GUI)
- reference / Tour of WordNet

- greedy sampling
- about / Greedy sampling

- Group Method of Data Handling (GMDH) / History of deep learning
- GRUs
- about / Gated Recurrent Units

## H

- Hidden Markov Model (HMM) / Example – generating football game summaries
- hierarchical softmax / Hierarchical softmax
- hierarchy
- learning / Learning the hierarchy
- initializing / Learning the hierarchy
- WordNet, determining / Learning the hierarchy

- history, machine translation (MT)
- about / A brief historical tour of machine translation
- rule-based translation / Rule-based translation
- Statistical Machine Translation (SMT) / Statistical Machine Translation (SMT)
- Neural Machine Translation (NMT) / Neural Machine Translation (NMT)

- holonyms / Tour of WordNet
- hypernyms / Tour of WordNet
- hyperparameters
- defining / Defining hyperparameters
- num_nodes / Defining hyperparameters, Defining the encoder and the decoder
- batch_size / Defining hyperparameters, Defining the encoder and the decoder
- num_unrollings / Defining hyperparameters
- dropout / Defining hyperparameters
- dec_num_unrollings / Defining the encoder and the decoder
- embedding_size / Defining the encoder and the decoder

- hyponyms / Tour of WordNet

## I

- identity matrix / Identity matrix
- ILSVRC ImageNet dataset
- URL / Getting to know the data
- about / ILSVRC ImageNet dataset

- image caption generation
- machine learning pipeline / The machine learning pipeline for image caption generation

- image caption generation pipeline
- about / The machine learning pipeline for image caption generation

- image features
- extracting, with CNNs / Extracting image features with CNNs

- ImageNet Large Scale Visual Recognition Challenge (ILSVRC) / The power of Convolution Neural Networks
- improved skip-gram algorithm
- versus original skip-gram algorithm / Comparing the original skip-gram with the improved skip-gram

- inferring
- about / Inferring VGG-16

- information technology (IT)
- about / Topic embedding

- input and output placeholders
- is_train_text / Defining inputs and outputs
- train_inputs / Defining inputs and outputs
- train_labels / Defining inputs and outputs

- input gate parameters
- ix / Defining parameters
- im / Defining parameters
- ib / Defining parameters

- inputs
- about / Inputs, variables, outputs, and operations
- defining / Defining inputs in TensorFlow
- data, feeding with Python code / Feeding data with Python code
- pipeline, building / Building an input pipeline

- insertion phase
- about / Statistical Machine Translation (SMT)

## J

- joint many-task model
- about / A joint many-task model – growing a neural network for multiple NLP tasks, Third level – semantic-level tasks
- word-based tasks / First level – word-based tasks
- syntactic tasks / Second level – syntactic tasks
- semantic-level tasks / Third level – semantic-level tasks

- joint probability / Joint probability
- Jupyter Notebook
- URL, for installing / Installing Jupyter Notebook

## K

- K-means
- documents, clustering / Implementation – clustering/classification of documents with K-means
- documents, classifying / Implementation – clustering/classification of documents with K-means

- Keras
- about / Introduction to Keras

## L

- language grounding
- about / Language grounding

- Large Scale Visual Recognition Challenge (LSVRC) / History of deep learning
- Latent Dirichlet Allocation (LDA)
- about / Topic embedding

- Latent Semantic Analysis (LSA) / GloVe – Global Vectors representation
- learning model
- optimizing / Optimizing the learning model

- lemmas / Tour of WordNet
- Long Short-Term Memory (LSTM) / Document classification with Word2vec
- loss function
- formulating / Formulating a practical loss function
- approximating / Efficiently approximating the loss function

- LSTM-Word2vec
- examples generated with / Examples generated with LSTM-Word2vec and beam search

- LSTM cell
- defining / Defining the LSTM

- LSTM implementation
- about / Implementing an LSTM
- hyperparameters, defining / Defining hyperparameters
- parameters, defining / Defining parameters
- LSTM cell operations, defining / Defining an LSTM cell and its operations
- inputs and labels, defining / Defining inputs and labels
- sequential calculations, defining / Defining sequential calculations required to process sequential data
- optimizer, defining / Defining the optimizer
- predictions, making / Making predictions
- perplexity, calculating / Calculating perplexity (loss)
- states, resetting / Resetting states
- greedy sampling, for breaking unimodality / Greedy sampling to break unimodality
- new text, generating / Generating new text
- example generated text / Example generated text

- LSTMs
- about / Understanding Long Short-Term Memory Networks, What is an LSTM?, LSTMs in more detail
- cell state / What is an LSTM?
- hidden state / What is an LSTM?
- input gate / What is an LSTM?, LSTMs in more detail
- forget gate / What is an LSTM?, LSTMs in more detail
- output gate / What is an LSTM?
- actual mechanism / LSTMs in more detail
- exploring / LSTMs in more detail
- write gate / LSTMs in more detail
- output / LSTMs in more detail
- comparing, with standard RNNs / How LSTMs differ from standard RNNs
- vanishing gradient problem, solving / How LSTMs solve the vanishing gradient problem
- improving / Improving LSTMs
- greedy sampling / Greedy sampling
- beam search / Beam search, Improving LSTMs – beam search
- word vectors, using / Using word vectors
- BiLSTM / Bidirectional LSTMs (BiLSTM)
- variants / Other variants of LSTMs
- comparing / Comparing LSTMs to LSTMs with peephole connections and GRUs
- standard LSTM / Standard LSTM
- Gated Recurrent Units (GRUs) / Gated Recurrent Units (GRUs)
- with peepholes / LSTMs with peepholes
- perplexity over time / Training and validation perplexities over time
- beam search, implementing / Implementing beam search
- text generation, with words / Improving LSTMs – generating text with words instead of n-grams

- LSTMs, with peepholes
- about / LSTMs with peepholes
- review / Review
- code / The code
- example generated text / Example generated text

## M

- machine translation (MT)
- about / Machine translation
- history / A brief historical tour of machine translation

- machine translation systems
- evaluating / The BLEU score – evaluating the machine translation systems

- many-to-many RNNs / Many-to-many RNNs
- many-to-one RNNs / Many-to-one RNNs
- marginal probability / Marginal probability
- mathematical operations / Mathematical operations
- Matplotlib
- URL, for installing / Installing Python and scikit-learn

- matrix
- about / Matrices
- indexing / Indexing of a matrix
- identity matrix / Identity matrix
- diagonal matrix / Diagonal matrix
- tensors / Tensors

- matrix operations
- multiplication / Multiplication
- element-wise multiplication / Element-wise multiplication
- inverse / Inverse
- matrix inverse, finding / Finding the matrix inverse – Singular Value Decomposition (SVD)
- norms / Norms
- determinant / Determinant

- max pooling operation
- about / Max pooling
- stride, using / Max pooling with stride

- meronyms / Tour of WordNet
- Metric for Evaluation of Translation with Explicit Ordering (METEOR)
- about / METEOR

- MNIST dataset
- reference / Implementing our first neural network

- MS-COCO dataset
- URL / Getting to know the data
- about / The MS-COCO dataset

- MultiModel
- about / One Model to Learn Them All
- convolutional block / One Model to Learn Them All
- attention block / One Model to Learn Them All
- mixture of experts block / One Model to Learn Them All
- tasks, performing / One Model to Learn Them All

- MultiWordNet (MWN) / Problems with WordNet

## N

- n-table
- about / Statistical Machine Translation (SMT)

- Natural Language Processing (NLP)
- about / What is Natural Language Processing?, The current state of deep learning and NLP, The roadmap – beyond this chapter
- tasks / Tasks of Natural Language Processing
- Tokenization / Tasks of Natural Language Processing
- Word-sense Disambiguation (WSD) / Tasks of Natural Language Processing
- Named Entity Recognition (NER) / Tasks of Natural Language Processing
- Part-of-Speech (PoS) tagging / Tasks of Natural Language Processing
- Sentence/Synopsis classification / Tasks of Natural Language Processing
- language generation / Tasks of Natural Language Processing
- Question Answering (QA) / Tasks of Natural Language Processing
- Machine Translation (MT) / Tasks of Natural Language Processing
- traditional approach / The traditional approach to Natural Language Processing, Understanding the traditional approach
- example / Example – generating football game summaries
- preprocessing / Example – generating football game summaries
- tokenization / Example – generating football game summaries
- feature engineering / Example – generating football game summaries
- bag-of-words / Example – generating football game summaries
- n-gram / Example – generating football game summaries
- traditional approach, drawbacks / Drawbacks of the traditional approach
- deep learning approach / The deep learning approach to Natural Language Processing

- negative sampling
- unigram distribution, using for / Using the unigram distribution for negative sampling

- Neural Machine Translation (NMT)
- about / Neural Machine Translation (NMT), Understanding Neural Machine Translation, Neural Machine Translation (NMT)
- intuition / Intuition behind NMT
- architecture / NMT architecture
- encoder / NMT architecture
- decoder / NMT architecture
- training / Training the NMT
- inference, performing / Inference with NMT
- attention mechanism, improving / Improving the attention mechanism
- hybrid MT models / Hybrid MT models

- neural network
- implementing / Implementing our first neural network
- data, preparing / Preparing the data
- TensorFlow graph, defining / Defining the TensorFlow graph
- executing / Running the neural network
- word embeddings, learning / Learning the word embeddings with a neural network

- neural network-related operations
- about / Neural network-related operations
- nonlinear activations / Nonlinear activations used by neural networks
- convolution operation / The convolution operation
- pooling operation / The pooling operation
- loss, defining / Defining loss
- neural networks, optimization / Optimization of neural networks
- control flow operations / The control flow operations

- newer machine learning models
- about / Newer machine learning models
- Phased LSTM / Phased LSTM
- Dilated Recurrent Neural Networks (DRNNs) / Dilated Recurrent Neural Networks (DRNNs)

- NLP
- current trends / Current trends in NLP

- NLP, for social media
- about / NLP for social media
- rumors, detecting in social media / Detecting rumors in social media
- emotions, detecting in social media / Detecting emotions in social media
- political framing, analyzing in tweets / Analyzing political framing in tweets

- NLP, with computer vision
- combining / Combining NLP with computer vision
- Visual Question Answering (VQA) / Visual Question Answering (VQA)
- caption generation for images, with attention / Caption generation for images with attention

- NLTK
- reference / Tour of WordNet

- NMT, jointly with word embeddings
- training / Training an NMT jointly with word embeddings
- matchings between dataset vocabulary and pretrained embeddings, maximizing / Maximizing matchings between the dataset vocabulary and the pretrained embeddings
- embeddings layer, defining as TensorFlow variable / Defining the embeddings layer as a TensorFlow variable

- NMT architecture
- about / NMT architecture
- embedding layer / The embedding layer
- encoder / The encoder
- context vector / The context vector
- decoder / The decoder

- NMT implementation
- performing, from scratch / Implementing an NMT from scratch – a German to English translator
- word embeddings / Learning word embeddings
- encoder, defining / Defining the encoder and the decoder
- decoder, defining / Defining the encoder and the decoder
- end-to-end output calculation, defining / Defining the end-to-end output calculation
- translation results / Some translation results

- NMTs, improving
- about / Improving NMTs
- teacher forcing / Teacher forcing
- deep LSTMs / Deep LSTMs

- NMT system
- data, preparing / Preparing data for the NMT system

- node / TensorFlow architecture – what happens when you execute the client?
- Noise-Contrastive Estimation (NCE) / Negative sampling of the softmax layer

## O

- object-pair nodes
- about / Language grounding

- one-hot encoded representation / One-hot encoded representation
- one-hot encoding / Classical approaches to learning word representation
- one-to-many RNNs / One-to-many RNNs
- one-to-one RNNs / One-to-one RNNs
- operations
- about / Inputs, variables, outputs, and operations
- defining / Defining TensorFlow operations
- comparison operations / Comparison operations
- mathematical operations / Mathematical operations
- scatter operations / Scatter and gather operations
- gather operations / Scatter and gather operations
- neural network-related operations / Neural network-related operations

- original skip-gram algorithm
- about / The original skip-gram algorithm
- implementing / Implementing the original skip-gram algorithm
- versus improved skip-gram algorithm / Comparing the original skip-gram with the improved skip-gram

- outliers
- inspecting / Inspecting several outliers

- outputs
- about / Inputs, variables, outputs, and operations
- defining / Defining TensorFlow outputs

## P

- parameters
- defining / Defining parameters

- parameters, TensorFlow RNN API
- cell / Using the TensorFlow RNN API
- input_keep_prob / Using the TensorFlow RNN API
- output_keep_prob / Using the TensorFlow RNN API
- state_keep_prob / Using the TensorFlow RNN API
- variational_recurrent / Using the TensorFlow RNN API

- peephole connections
- about / Peephole connections

- perplexity / Perplexity – measuring the quality of the text result
- perplexity over time
- about / Training and validation perplexities over time

- Phased LSTM
- about / Phased LSTM

- placeholder / Feeding data with Python code
- pooling operation
- about / Pooling operation
- max pooling / Max pooling
- average pooling / Average pooling

- pretrained embeddings, using with TensorFlow RNN API
- about / Using pretrained embeddings with TensorFlow RNN API
- pretrained embedding layer, defining / Defining the pretrained embedding layer and the adaptation layer
- adaptation layer, defining / Defining the pretrained embedding layer and the adaptation layer
- LSTM cell, defining / Defining the LSTM cell and softmax layer
- softmax layer, defining / Defining the LSTM cell and softmax layer
- inputs and outputs, defining / Defining inputs and outputs
- images and text, processing differently / Processing images and text differently
- LSTM output calculation, defining / Defining the LSTM output calculation
- logits and predictions, defining / Defining the logits and predictions
- sequence loss, defining / Defining the sequence loss
- optimizer, defining / Defining the optimizer

- pretrained GloVe word vectors
- TensorFlow RNN API, using with / Using TensorFlow RNN API with pretrained GloVe word vectors

- pretrained models / Extracting image features with CNNs
- Principal Component Analysis (PCA) / Finding the matrix inverse – Singular Value Decomposition (SVD)
- probabilistic word embedding
- about / Probabilistic word embedding

- probability
- about / Probability
- random variables / Random variables
- discrete random variables / Discrete random variables
- continuous random variables / Continuous random variables
- mass/density function / The probability mass/density function
- conditional probability / Conditional probability
- joint probability / Joint probability
- marginal probability / Marginal probability
- Bayes' rule / Bayes' rule

- probability density function (PDF) / The probability mass/density function
- probability mass function (PMF) / The probability mass/density function
- PSDVec
- about / Probabilistic word embedding

## R

- r-table
- about / Statistical Machine Translation (SMT)

- random variables / Random variables
- raw text
- to structured data / From raw text to structured data

- Rectified Linear Units (ReLUs) / History of deep learning
- Recurrent Neural Network (RNN) / The current state of deep learning and NLP
- Recurrent Neural Networks (RNNs)
- about / Understanding Recurrent Neural Networks
- modeling / Modeling with Recurrent Neural Networks
- technical description / Technical description of a Recurrent Neural Network
- applications / Applications of RNNs
- used, for text generation / Generating text with RNNs
- text results output, evaluating / Evaluating text results output from the RNN
- with Context Features / Recurrent Neural Networks with Context Features – RNNs with longer memory

- Recurrent Neural Networks (RNNs), applications
- one-to-one RNNs / One-to-one RNNs
- one-to-many RNNs / One-to-many RNNs
- many-to-one RNNs / Many-to-one RNNs
- many-to-many RNNs / Many-to-many RNNs

- region embedding
- about / Region embedding
- input representation / Input representation
- learning / Learning region embeddings
- implementing / Implementation – region embeddings
- classification accuracy / Classification accuracy

- reinforcement learning (RL)
- about / Reinforcement learning
- unique language for communication, teaching to agents / Teaching agents to communicate using their own language
- dialogue agents / Dialogue agents with reinforcement learning

- reordering phase
- about / Statistical Machine Translation (SMT)

- research fields
- penetration into / Penetration into other research fields
- NLP, combining with computer vision / Combining NLP with computer vision
- reinforcement learning / Reinforcement learning
- Generative Adversarial Networks / Generative Adversarial Networks for NLP

- RNNs with Context Features (RNN-CF)
- about / Evaluating text results output from the RNN
- technical description / Technical description of the RNN-CF
- implementing / Implementing the RNN-CF
- hyperparameters, defining / Defining the RNN-CF hyperparameters
- weights, defining / Defining weights of the RNN-CF, Variables and operations for maintaining hidden and context states
- output, calculating / Calculating output
- validation output, calculating / Calculating validation output
- gradients, optimizing / Computing the gradients and optimizing
- gradients, calculating / Computing the gradients and optimizing
- text generated / Text generated with the RNN-CF

- rule-based translation
- about / Rule-based translation

## S

- sarcasm
- about / Detecting sarcasm
- detecting / Detecting sarcasm

- scalar / Scalar
- scatter operation / Scatter and gather operations
- scoping / Reusing variables with scoping
- sentence classification, with CNN
- about / Using CNNs for sentence classification
- pooling over time / Pooling over time
- implementation / Implementation – sentence classification with CNNs

- Seq2Seq models
- chatbots / Other applications of Seq2Seq models – chatbots

- sequence one-hot-encoded vector / Input representation
- Singular Value Decomposition (SVD) / Finding the matrix inverse – Singular Value Decomposition (SVD)
- skimming text, with LSTMs
- about / Skimming text with LSTMs

- skip-gram, versus CBOW
- about / Comparing skip-gram with CBOW, Which is the winner, skip-gram or CBOW?
- performance comparison / Performance comparison

- skip-gram algorithm / The skip-gram algorithm
- raw text, to structured data / From raw text to structured data
- word embeddings, learning with neural network / Learning the word embeddings with a neural network
- implementing, with TensorFlow / Implementing skip-gram with TensorFlow
- implementing / Implementing the original skip-gram algorithm
- extending / More recent algorithms extending skip-gram and CBOW
- limitation / A limitation of the skip-gram algorithm

- softmax layer
- negative sampling / Negative sampling of the softmax layer

- SOS / Preparing captions for feeding into LSTMs
- standard LSTMs
- about / Standard LSTM
- reviewing / Review
- example generated text / Example generated text

- Statistical Machine Translation (SMT)
- about / Statistical Machine Translation (SMT)

- structured data
- from raw text / From raw text to structured data

- structured skip-gram algorithm / The structured skip-gram algorithm
- subsampling
- about / Subsampling – probabilistically ignoring the common words
- implementing / Implementing subsampling

- synset / Tour of WordNet

## T

- t-Distributed Stochastic Neighbor Embedding (t-SNE) / Performance comparison
- tasks emerging
- about / New tasks emerging
- sarcasm, detecting / Detecting sarcasm
- language grounding / Language grounding
- skimming text, with LSTMs / Skimming text with LSTMs

- teacher forcing
- about / Teacher forcing

- technical tools
- about / Introduction to the technical tools
- describing / Description of the tools
- Python, installing / Installing Python and scikit-learn
- scikit-learn, installing / Installing Python and scikit-learn
- Jupyter Notebook, installing / Installing Jupyter Notebook
- TensorFlow, installing / Installing TensorFlow

- tensor / Getting started with TensorFlow, Tensors
- TensorBoard
- word embeddings, visualizing / Visualizing word embeddings with TensorBoard
- starting with / Starting TensorBoard
- word embeddings, saving / Saving word embeddings and visualizing via TensorBoard
- visualizing / Saving word embeddings and visualizing via TensorBoard

- TensorFlow
- URL, for installing / Installing TensorFlow
- about / What is TensorFlow?
- reference / What is TensorFlow?
- using / Getting started with TensorFlow, Building an input pipeline
- architecture / TensorFlow architecture – what happens when you execute the client?
- architecture, reference / TensorFlow architecture – what happens when you execute the client?
- Cafe Le TensorFlow / Cafe Le TensorFlow – understanding TensorFlow with an analogy
- Continuous Bag-of-Words algorithm, implementing / Implementing CBOW in TensorFlow

- TensorFlow client / TensorFlow client in detail
- TensorFlow implementation
- URL / One Model to Learn Them All

- TensorFlow placeholders
- enc_train_inputs / Defining the encoder and the decoder
- dec_train_inputs / Defining the encoder and the decoder
- dec_train_labels / Defining the encoder and the decoder
- dec_train_masks / Defining the encoder and the decoder

- TensorFlow Research Cloud (TFRC)
- URL / Installing TensorFlow

- TensorFlow RNN API
- using / Using the TensorFlow RNN API
- using, with pretrained GloVe word vectors / Using TensorFlow RNN API with pretrained GloVe word vectors
- pretrained embeddings, using with / Using pretrained embeddings with TensorFlow RNN API

- TensorFlow seq2seq library
- about / Introduction to the TensorFlow seq2seq library
- embeddings, defining for encoder and decoder / Defining embeddings for the encoder and decoder
- encoder, defining / Defining the encoder
- decoder, defining / Defining the decoder

- Term Frequency-Inverse Document Frequency (TF-IDF) / Classical approaches to learning word representation
- text generation, with RNNs
- about / Generating text with RNNs, Defining hyperparameters
- inputs over time, unrolling for Truncated BPTT / Unrolling the inputs over time for Truncated BPTT
- validation dataset, defining / Defining the validation dataset
- weights and biases, defining / Defining weights and biases
- state persisting variables, defining / Defining state persisting variables
- hidden states and outputs, calculating with unrolled inputs / Calculating the hidden states and outputs with unrolled inputs
- loss, calculating / Calculating the loss
- validation output, calculating / Calculating validation output
- gradients, calculating / Calculating gradients and optimizing
- optimizing / Calculating gradients and optimizing
- generated chunk of text, outputting / Outputting a freshly generated chunk of text

- text generation, with words in LSTMs
- about / Improving LSTMs – generating text with words instead of n-grams
- curse of dimensionality / The curse of dimensionality
- Word2vec / Word2vec to the rescue
- text, generating with Word2vec / Generating text with Word2vec
- perplexity over time / Perplexity over time

- text result
- output, evaluating from RNN / Evaluating text results output from the RNN
- quality, measuring / Perplexity – measuring the quality of the text result

- TF-IDF method / The TF-IDF method
- Topical Word Embeddings (TWE)
- about / Topic embedding

- topic embedding
- about / Topic embedding

- translation phase
- about / Statistical Machine Translation (SMT)

- transpose
- about / Transpose
- example / Transpose

- transposed convolution / Transposed convolution
- Truncated Backpropagation Through Time (TBPTT)
- about / Backpropagation Through Time – training RNNs
- RNNs, training / Truncated BPTT – training RNNs efficiently
- limitations / Limitations of BPTT – vanishing and exploding gradients
- exploding gradient / Limitations of BPTT – vanishing and exploding gradients
- vanishing gradient / Limitations of BPTT – vanishing and exploding gradients

- Turing test
- about / Evaluating chatbots – Turing test

- tv-embedding
- about / Region embedding

## U

- unigram-based negative sampling
- implementing / Implementing unigram-based negative sampling

- unigram distribution
- using, for negative sampling / Using the unigram distribution for negative sampling

## V

- vanishing gradients phenomenon / History of deep learning
- variables
- about / Inputs, variables, outputs, and operations
- defining / Defining variables in TensorFlow
- reusing, with scoping / Reusing variables with scoping

- variants, LSTMs
- peephole connections / Peephole connections
- GRUs / Gated Recurrent Units

- variational inference
- about / Probabilistic word embedding

- vectors / Vectors
- velocity term / Limitations of BPTT – vanishing and exploding gradients
- VGG-16
- predicting with / Predicting class probabilities with VGG-16

- VGG-16 inferring
- about / Inferring VGG-16

- VGG CNN
- URL / Extracting image features with CNNs

- Virtual Assistants (VAs)
- about / What is Natural Language Processing?

- Visual Question Answering (VQA)
- about / Visual Question Answering (VQA)

## W

- weights loading, CNN
- implementation / Implementation – loading weights and inferencing with VGG-
- variables, building / Building and updating variables
- variables, updating / Building and updating variables
- inputs, preprocessing / Preprocessing inputs
- vectorized representations of images, extracting / Extracting vectorized representations of images

- whitening / Preparing the data
- Word2vec
- about / Word2vec – a neural network-based approach to learning word representation, Word2vec to the rescue
- exercise / Exercise: is queen = king – he + she?
- loss function, designing for learning word embeddings / Designing a loss function for learning word embeddings
- document classification / Document classification with Word2vec
- text, generating with / Generating text with Word2vec

- word alignment problem / Tasks of Natural Language Processing
- word embeddings
- learning, with neural network / Learning the word embeddings with a neural network
- documents, classifying with / Classifying documents with word embeddings
- learning / Implementation – learning word embeddings
- to document embeddings / Implementation – word embeddings to document embeddings
- about / Learning word embeddings, Word embeddings
- region embedding / Region embedding
- probabilistic word embedding / Probabilistic word embedding
- ensemble embedding / Ensemble embedding
- visualizing, with TensorBoard / Visualizing word embeddings with TensorBoard

- word embeddings algorithms
- about / Extensions to the word embeddings algorithms

- word meaning / What is a word representation or meaning?
- WordNet
- about / WordNet – using an external lexical knowledge base for learning word representations, Tour of WordNet
- reference / Tour of WordNet
- issues / Problems with WordNet

- word representation / What is a word representation or meaning?
- learning, classical approaches / Classical approaches to learning word representation
- one-hot encoded representation / One-hot encoded representation
- TF-IDF method / The TF-IDF method
- co-occurrence matrix / Co-occurrence matrix

- word vectors
- using / Using word vectors

## X

- Xavier initialization / Defining parameters, Defining the encoder and the decoder