Hands-On Natural Language Processing with Python

By: Rajesh Arumugam, Rajalingappaa Shanmugamani, Auguste Byiringiro, Chaitanya Joshi, Karthik Muthuswamy

Overview of this book

Natural language processing (NLP) is applied in various domains, such as web search, advertising, and customer service, and deep learning can improve its performance in these areas. Hands-On Natural Language Processing with Python teaches you how to leverage deep learning models for various NLP tasks, along with best practices for dealing with today’s NLP challenges. To begin with, you will learn the core concepts of NLP and deep learning, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), semantic embedding, Word2vec, and more. You will learn how to perform NLP tasks using neural networks, and how to train and deploy those networks in your NLP applications. You will get accustomed to using RNNs and CNNs in various application areas, such as text classification and sequence labeling, which are essential to sentiment analysis, customer service chatbots, and anomaly detection. You will gain the practical knowledge needed to implement deep learning in your linguistic applications using TensorFlow, Python's popular deep learning library. By the end of this book, you will be well versed in building deep learning-backed NLP applications and in overcoming NLP challenges with best practices developed by domain experts.

Table of Contents (15 chapters)

  6. Searching and DeDuplicating Using CNNs
  7. Named Entity Recognition Using Character LSTM

To get the most out of this book

The prerequisites for this book are a basic knowledge of ML or deep learning and intermediate Python skills, although neither is strictly mandatory. Chapter 1, Getting Started, gives a brief introduction to deep learning, touching upon topics such as multi-layer perceptrons, convolutional neural networks (CNNs), and RNNs. It helps if the reader knows general ML concepts, such as overfitting and model regularization, and classical models, such as linear regression and random forests. The more advanced chapters include in-depth code walkthroughs that expect at least a basic level of Python programming experience.

All the code examples in the book can be downloaded from the book's code repository, as described in the next section. The examples mainly use open source tools and open data repositories, and were written for Python 3.5 or higher. The two libraries used extensively throughout the book are TensorFlow and NLTK. Detailed installation instructions for these packages can be found in Chapter 1, Getting Started, and Chapter 2, Text Classification and POS Tagging Using NLTK, respectively. A GPU is not required to run the examples, but having one is advisable. We recommend training the models from the second half of the book on a GPU, as the more complicated tasks involve bigger models and larger datasets.
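
Before working through the examples, it can be useful to confirm that the local setup meets the requirements listed above. The following is a minimal sketch and not part of the book's code bundle: the helper name check_environment is hypothetical, and the import names tensorflow and nltk are assumed to match the installed packages. It uses only the standard library, so it runs even when neither package is installed.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the text above.
MIN_PYTHON = (3, 5)

# Import names assumed for the two libraries the text mentions.
REQUIRED_MODULES = ("tensorflow", "nltk")


def check_environment(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return (python_ok, missing) without importing the libraries themselves."""
    # Compare only the (major, minor) part of the running interpreter's version.
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can be installed by following the instructions in the chapters mentioned above. The sketch does not check for a GPU, since that depends on the TensorFlow version in use.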

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packtpub.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Natural-Language-Processing-with-Python. If the code is updated, the changes will be made in the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The pip installer can be used to install NLTK, with an optional installation of numpy."

A block of code is set as follows:

>>> large_words = {k: v for k, v in frequency_dist.items() if len(k) > 3}
>>> frequency_dist = nltk.FreqDist(large_words)
>>> frequency_dist.plot(50, cumulative=False)

Any command-line input or output is written as follows:

import nltk
nltk.download()

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Navigate to stopwords and install it for future use."

Warnings or important notes appear like this.
Tips and tricks appear like this.