

Chapter 4. Advanced Word2vec

In Chapter 3, Word2vec – Learning Word Embeddings, we introduced you to Word2vec, the basics of learning word embeddings, and the two common Word2vec algorithms: skip-gram and CBOW. In this chapter, we will discuss several topics related to Word2vec, focusing on these two algorithms and their extensions.

First, we will explore how the original skip-gram algorithm was implemented and how it compares to the more modern variant we used in Chapter 3, Word2vec – Learning Word Embeddings. We will examine the differences between skip-gram and CBOW and compare how the loss of each approach behaves over time. We will also discuss which method works better, drawing on both our own observations and the available literature.
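Before diving in, it helps to see the structural difference between the two algorithms side by side. The following is a minimal, illustrative sketch (not the book's implementation; the function names and the toy sentence are our own) showing how skip-gram turns each target-context word pair into its own training example, whereas CBOW uses the whole context window to predict a single target:

corpus = ["the", "dog", "barked", "at", "the", "mailman"]
window_size = 1  # number of words considered on each side of the target

def skipgram_pairs(tokens, window):
    """Skip-gram: each (target, single context word) is its own example."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cbow_examples(tokens, window):
    """CBOW: all the context words jointly predict the target word."""
    examples = []
    for i, target in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        examples.append((context, target))
    return examples

print(skipgram_pairs(corpus, window_size)[:3])
# [('the', 'dog'), ('dog', 'the'), ('dog', 'barked')]
print(cbow_examples(corpus, window_size)[:2])
# [(['dog'], 'the'), (['the', 'barked'], 'dog')]

Because CBOW combines information from the whole context window in a single example, its loss tends to decrease more smoothly; skip-gram treats each context word as a separate (and noisier) example. We will return to this observation when we compare the two losses over time.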

Next, we will discuss several extensions to the existing Word2vec methods that boost performance. These extensions include using a more effective sampling technique to draw the negative examples used in negative sampling, and ignoring uninformative words in the learning process.
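As a concrete preview of both ideas, here is a small NumPy sketch. The toy counts and variable names are hypothetical, while the 3/4 exponent for the negative-sampling distribution and the subsampling rule, which keeps a word w with probability min(1, sqrt(t / f(w))), follow Mikolov and others (2013):

import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical unigram counts for a tiny vocabulary.
vocab = ["the", "dog", "barked", "mailman"]
counts = np.array([1000.0, 50.0, 10.0, 5.0])
freqs = counts / counts.sum()  # f(w): relative frequency of each word

# Extension 1: smarter negative sampling. Draw negatives from the
# unigram distribution raised to the 3/4 power, which lifts the
# probability of rare words relative to raw-frequency sampling.
neg_dist = freqs ** 0.75
neg_dist /= neg_dist.sum()
negatives = rng.choice(vocab, size=5, p=neg_dist)
print("sampled negatives:", list(negatives))

# Extension 2: subsampling frequent words. Keep word w with
# probability min(1, sqrt(t / f(w))), so very frequent (and often
# uninformative) words such as "the" are skipped most of the time.
# The paper suggests t around 1e-5 for large corpora; we use 1e-3
# here so the toy numbers remain readable.
t = 1e-3
keep_prob = np.minimum(1.0, np.sqrt(t / freqs))
print({w: round(p, 3) for w, p in zip(vocab, keep_prob)})

Sampling from freqs ** 0.75 rather than freqs is a small change, but it is one of the tricks that made the original word2vec implementation train well in practice; we will see both extensions in action later in this chapter.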