Natural Language Processing with TensorFlow

By: Motaz Saad, Thushan Ganegedara

Overview of this book

Natural language processing (NLP) supplies the majority of data available to deep learning applications, while TensorFlow is one of the most important deep learning frameworks currently available. Natural Language Processing with TensorFlow brings TensorFlow and NLP together to give you tools for working with the immense volume of unstructured data in today’s data streams, and shows you how to apply these tools to specific NLP tasks. Thushan Ganegedara starts by giving you a grounding in NLP and TensorFlow basics. You'll then learn how to use Word2vec, including advanced extensions, to create word embeddings that turn sequences of words into vectors accessible to deep learning algorithms. Chapters on classical deep learning algorithms, like convolutional neural networks (CNN) and recurrent neural networks (RNN), demonstrate important NLP tasks such as sentence classification and language generation. You will learn how to apply high-performance RNN models, like long short-term memory (LSTM) cells, to NLP tasks. You will also explore neural machine translation and implement a neural machine translator. After reading this book, you will understand NLP and have the skills to apply TensorFlow in deep learning NLP applications and to perform specific NLP tasks.

The original skip-gram algorithm

The skip-gram algorithm discussed so far in this book is actually an improvement over the original skip-gram algorithm proposed by Mikolov and others in their 2013 paper. In that paper, the algorithm did not use an intermediate hidden layer to learn the representations. Instead, it used two different embedding or projection layers (the input and output embeddings in Figure 4.1) and defined a cost function derived from the embeddings themselves:

Figure 4.1: The original skip-gram algorithm without hidden layers

The original negative sampled loss was defined as follows:
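As a sketch of its general form (not necessarily the exact notation used in the book), the negative sampling loss for one input word w_i and one of its context words w_j, following Mikolov and others (2013), can be written as shown next. The symbols w_j, w_q, k, and P_n(w) are assumptions introduced for this sketch: w_j is a context word of w_i, w_q is a noise word drawn from a noise distribution P_n(w), and k is the number of negative samples per pair.

```latex
J(w_i, w_j) = -\log \sigma\!\left({v'_{w_j}}^{\top} v_{w_i}\right)
              - \sum_{q=1}^{k} \mathbb{E}_{w_q \sim P_n(w)}
                \left[ \log \sigma\!\left(-{v'_{w_q}}^{\top} v_{w_i}\right) \right]
```

Minimizing this term pushes the score of the true (input, context) pair up and the scores of the sampled noise words down. A full training loss would aggregate this term over all input words and their context windows.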

Here, v is the input embeddings layer, v' is the output word embeddings layer, v_{w_i} corresponds to the embedding vector for the word w_i in the input embeddings layer, and v'_{w_i} corresponds to the word vector for the word w_i in the output embeddings layer.
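To make the role of the two embedding layers concrete, here is a minimal NumPy sketch of a negative sampling loss for a single (input word, context word) pair. It is an illustration under stated assumptions, not the book's implementation: the function names, the array names `v_in` and `v_out`, and the toy sizes are all hypothetical, and the negative word IDs are passed in directly rather than drawn from a noise distribution.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x))
    return -np.logaddexp(0.0, -x)

def negative_sampling_loss(v_in, v_out, center, context, negatives):
    """Loss for one (center, context) pair given sampled negative word IDs.

    v_in  : input embeddings, shape (vocab_size, dim)   (hypothetical name)
    v_out : output embeddings, shape (vocab_size, dim)  (hypothetical name)
    """
    h = v_in[center]                   # input embedding of the center word
    pos_score = v_out[context] @ h     # score of the true context word
    neg_scores = v_out[negatives] @ h  # scores of the k noise words, shape (k,)
    # Reward a high score for the true pair and low scores for noise words
    return -(log_sigmoid(pos_score) + log_sigmoid(-neg_scores).sum())

# Toy example with random embeddings (sizes chosen only for illustration)
rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
v_in = rng.normal(scale=0.1, size=(vocab_size, dim))
v_out = rng.normal(scale=0.1, size=(vocab_size, dim))

loss = negative_sampling_loss(
    v_in, v_out, center=2, context=5, negatives=np.array([1, 7, 8])
)
print(loss)
```

Note that no hidden layer appears anywhere: the score of a word pair is just the dot product between a row of the input embeddings and a row of the output embeddings, which is what distinguishes this formulation from the version with an intermediate hidden layer.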