Hands-On Natural Language Processing with PyTorch 1.x

By: Thomas Dop

Overview of this book

In the internet age, where an increasing volume of text data is generated daily from social media and other platforms, being able to make sense of that data is a crucial skill. With this book, you’ll learn how to extract valuable insights from text by building deep learning models for natural language processing (NLP) tasks. You’ll start by installing PyTorch and using CUDA to accelerate processing, then explore how NLP architectures work with the help of practical examples. This PyTorch NLP book will guide you through core concepts such as word embeddings, CBOW, and tokenization in PyTorch. You’ll then learn techniques for processing textual data and see how deep learning can be used for NLP tasks. The book demonstrates how to implement deep learning and neural network architectures to build models that will allow you to classify and translate text and perform sentiment analysis. Finally, you’ll learn how to build advanced NLP models, such as conversational chatbots. By the end of this book, you’ll not only understand the different NLP problems that can be solved using deep learning with PyTorch, but also be able to build models to solve them.
Table of Contents (14 chapters)
Section 1: Essentials of PyTorch 1.x for NLP
Section 3: Real-World NLP Applications Using PyTorch 1.x

NLP for machine learning

Unlike humans, computers do not understand text – at least not in the same way that we do. In order to create machine learning models that are able to learn from data, we must first learn to represent natural language in a way that computers are able to process.

When we discussed machine learning fundamentals, you may have noticed that loss functions all operate on numerical data, because a loss must be a number in order to be minimized. Because of this, we wish to represent our text in a numerical format that can form the basis of our input into a neural network. Here, we will cover a couple of basic ways of numerically representing our data.

Bag-of-words

The first and simplest way of representing text is a bag-of-words representation. This method simply counts how many times each word appears in a given sentence or document. These counts are then transformed into a vector where each element of the vector is the count of the times each word in the...
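
As a rough illustration of the counting idea, the following minimal sketch builds bag-of-words vectors in plain Python. The function names build_vocabulary and bag_of_words, the lowercasing, the whitespace tokenization, and the two example sentences are all assumptions made for this sketch and are not taken from the book.

```python
from collections import Counter


def build_vocabulary(documents):
    """Assign each distinct token a fixed index, in order of first appearance."""
    vocab = {}
    for doc in documents:
        # Assumption: lowercase and split on whitespace as a stand-in tokenizer.
        for token in doc.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def bag_of_words(document, vocab):
    """Return a count vector with one element per vocabulary word."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocab)
    for token, count in counts.items():
        # Words outside the vocabulary are ignored in this sketch.
        if token in vocab:
            vector[vocab[token]] = count
    return vector


# Hypothetical example corpus.
documents = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocabulary(documents)
vectors = [bag_of_words(doc, vocab) for doc in documents]

print(vocab)    # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4, 'dog': 5}
print(vectors)  # [[2, 1, 1, 1, 1, 0], [1, 0, 1, 0, 0, 1]]
```

Every vector has the same length as the vocabulary, and word order within a sentence is discarded. To feed such a vector into a PyTorch model, one option would be to wrap it with torch.tensor(vector, dtype=torch.float).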