Hands-On Natural Language Processing with PyTorch 1.x

By: Thomas Dop

Overview of this book

In the internet age, where an increasing volume of text data is generated daily from social media and other platforms, being able to make sense of that data is a crucial skill. With this book, you’ll learn how to extract valuable insights from text by building deep learning models for natural language processing (NLP) tasks. You’ll start by installing PyTorch and using CUDA to accelerate processing, and then explore how NLP architectures work with the help of practical examples. The book will guide you through core concepts such as word embeddings, CBOW, and tokenization in PyTorch. You’ll then learn techniques for processing textual data and see how deep learning can be used for NLP tasks. The book demonstrates how to implement deep learning and neural network architectures to build models that will allow you to classify and translate text and perform sentiment analysis. Finally, you’ll learn how to build advanced NLP models, such as conversational chatbots. By the end of this book, you’ll not only understand the different NLP problems that can be solved using deep learning with PyTorch, but also be able to build models to solve them.

Introducing LSTMs

While RNNs allow us to use sequences of words as input to our models, they are far from perfect. RNNs suffer from two main flaws, which can be partially remedied by using a more sophisticated version of the RNN, known as the long short-term memory (LSTM) network.

The basic structure of RNNs makes it very difficult for them to retain information over the long term. Consider a sentence that's 20 words long. Between the first word, which affects the initial hidden state, and the last word in the sentence, our hidden state is updated 20 times. By the time we reach the final hidden state, it is very difficult for an RNN to have retained information about the words at the beginning of the sentence. This means that RNNs aren't very good at capturing long-term dependencies within sequences. This also ties in with the vanishing gradient problem mentioned earlier: gradients shrink as they are backpropagated through many time steps, so learning from the early parts of long sequences of vectors is very inefficient.
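To make the 20-word example concrete, the following is a minimal sketch rather than code from the book. It assumes a single-layer `nn.RNN` with PyTorch's default initialization and uses random vectors as stand-ins for real word embeddings; the sizes (`embed_dim`, `hidden_dim`) are arbitrary illustrative choices. It backpropagates from the final hidden state and measures how much gradient reaches each word position:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only: a 20-word sentence, 8-dim "embeddings", 16-dim hidden state
seq_len, embed_dim, hidden_dim = 20, 8, 16

# A plain (tanh) RNN with default weight initialization
rnn = nn.RNN(input_size=embed_dim, hidden_size=hidden_dim, batch_first=True)

# Random vectors stand in for the embedded words of one sentence (batch size 1)
sentence = torch.randn(1, seq_len, embed_dim, requires_grad=True)

# h_n is the final hidden state, produced after seq_len sequential updates
output, h_n = rnn(sentence)

# Backpropagate from the final hidden state back to every input word
h_n.sum().backward()

# Gradient magnitude per word position: how strongly each word influences h_n
grad_norms = sentence.grad.norm(dim=2).squeeze(0)  # shape: (seq_len,)

for position, norm in enumerate(grad_norms.tolist(), start=1):
    print(f"word {position:2d}: gradient norm = {norm:.2e}")
```

With this setup, the gradient norms for the earliest words should typically be far smaller than those for the last few words, which is the vanishing gradient effect in miniature: the final hidden state is barely sensitive to the start of the sentence. The exact values depend on the seed and the initialization.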

Consider a long paragraph where...