Book Image

Hands-On Natural Language Processing with Python

By : Rajesh Arumugam, Rajalingappaa Shanmugamani, Auguste Byiringiro, Chaitanya Joshi, Karthik Muthuswamy
Book Image

Hands-On Natural Language Processing with Python

By: Rajesh Arumugam, Rajalingappaa Shanmugamani, Auguste Byiringiro, Chaitanya Joshi, Karthik Muthuswamy

Overview of this book

Natural language processing (NLP) has found its application in various domains, such as web search, advertisements, and customer services, and with the help of deep learning, we can enhance its performances in these areas. Hands-On Natural Language Processing with Python teaches you how to leverage deep learning models for performing various NLP tasks, along with best practices in dealing with today’s NLP challenges. To begin with, you will understand the core concepts of NLP and deep learning, such as Convolutional Neural Networks (CNNs), recurrent neural networks (RNNs), semantic embedding, Word2vec, and more. You will learn how to perform each and every task of NLP using neural networks, in which you will train and deploy neural networks in your NLP applications. You will get accustomed to using RNNs and CNNs in various application areas, such as text classification and sequence labeling, which are essential in the application of sentiment analysis, customer service chatbots, and anomaly detection. You will be equipped with practical knowledge in order to implement deep learning in your linguistic applications using Python's popular deep learning library, TensorFlow. By the end of this book, you will be well versed in building deep learning-backed NLP applications, along with overcoming NLP challenges with best practices developed by domain experts.
Table of Contents (15 chapters)
6
Searching and DeDuplicating Using CNNs
7
Named Entity Recognition Using Character LSTM

Topic modeling

When we have a collection of documents for which we do not clearly know the categories, topic models help us to roughly find the categorization. The model treats each document as a mixture of topics, probably with one dominating topic.

For example, let's suppose we have the following sentences:

  • Eating fruits as snacks is a healthy habit
  • Exercising regularly is an important part of a healthy lifestyle
  • Grapefruit and oranges are citrus fruits

A topic model of these sentences may output the following:

  • Topic A: 40% healthy, 20% fruits, 10% snacks
  • Topic B: 20% Grapefruit, 20% oranges, 10% citrus
  • Sentence 1 and 2: 80% Topic A, 20% Topic B
  • Sentence 3: 100% Topic B

From the output of the model, we can guess that Topic A is about health and Topic B is about fruits. Though these topics are not known apriori, the model outputs corresponding probabilities for words...