Book Image

Natural Language Processing with Python Quick Start Guide

By : Nirant Kasliwal
Book Image

Natural Language Processing with Python Quick Start Guide

By: Nirant Kasliwal

Overview of this book

NLP in Python is among the most sought after skills among data scientists. With code and relevant case studies, this book will show how you can use industry-grade tools to implement NLP programs capable of learning from relevant data. We will explore many modern methods ranging from spaCy to word vectors that have reinvented NLP. The book takes you from the basics of NLP to building text processing applications. We start with an introduction to the basic vocabulary along with a work?ow for building NLP applications. We use industry-grade NLP tools for cleaning and pre-processing text, automatic question and answer generation using linguistics, text embedding, text classifier, and building a chatbot. With each project, you will learn a new concept of NLP. You will learn about entity recognition, part of speech tagging and dependency parsing for Q and A. We use text embedding for both clustering documents and making chatbots, and then build classifiers using scikit-learn. We conclude by deploying these models as REST APIs with Flask. By the end, you will be confident building NLP applications, and know exactly what to look for when approaching new challenges.
Table of Contents (10 chapters)

What is NLP?

Natural language processing is the use of machines to manipulate natural language. In this book, we will focus on written language, or in simpler words: text.

In effect, this is a practitioner's guide to text processing in English.

Humans are the only known species to have developed written languages. Yet, children don't learn to read and write on their own. This is to highlight the complexity of text processing and NLP.

The study of natural language processing has been around for more than 50 years. The famous Turing test for general artificial intelligence uses this language. This field has grown both in regard to linguistics and its computational techniques.

In the spirit of being able to build things first, we will learn how to build a simple text classification system using Python's scikit-learn and no other dependencies.

We will also address if this book is a good pick for you.

Let's get going!

Why learn about NLP?

The best way to get the most about of this book is by knowing what you want NLP to do for you.

A variety of reasons might draw you to NLP. It might be the higher earning potential. Maybe you've noticed and are excited by the potential of NLP, for example, regarding Uber's customer Service bots. Yes, they mostly use bots to answer your complaints instead of humans.

It is useful to know your motivation and write it down. This will help you select problems and projects that excite you. It will also help you be selective when reading this book. This is not an NLP Made Easy or similar book. Let's be honest: this is a challenging topic. Writing down your motivations is a helpful reminder.

As a legal note, the accompanying code has a permissive MIT License. You can use it at your work without legal hassle. That being said, each dependent library is from a third party, and you should definitely check if they allow commercial use or not.

I don't expect you to be able to use all of the tools and techniques mentioned here. Cherry-pick things that make sense.

You have a problem in mind

You already have a problem in mind, such as an academic project or a problem at your work.

Are you looking for the best tools and techniques that you could use to get off the ground?

First, flip through to the book's index to check if I have covered your problem here. I have shared end-to-end solutions for some of the most common use cases here. If it is not shared, fret not—you are still covered. The underlying techniques for a lot of tasks are common. I have been careful to select methods that are useful to a wider audience.

Technical achievement

Is learning a mark of achievement for you?

NLP and, more generally, data science, are popular terms. You are someone who wants to keep up. You are someone who takes joy from learning new tools, techniques, and technologies. This is your next big challenge. This is your chance to prove your ability to self-teach and meet mastery.

If this sounds like you, you may be interested in using this as a reference book. I have dedicated sections where we give you enough understanding of a method. I show you how to use it without having to dive down into the latest papers. This is an invitation to learning more, and you are not encouraged to stop here. Try these code samples out for yourself!

Do something new

You have some domain expertise. Now, you want to do things in your domain that are not possible without these skills. One way to figure out new possibilities is to combine your domain expertise with what you learn here. There are several very large opportunities that I saw as I wrote this book, including the following:

  • NLP for non-English languages such as Hindi, Tamil, or Telugu.
  • Specialized NLP for your domain, for example, finance and Bollywood have different languages in their own ways. Your models that have been trained on Bollywood news are not expected to work for finance.

If this sounds like you, you want to pay attention to the text pre-processing sections in this book. These sections will help you understand how we make text ready for machine consumption.

Is this book for you?

This book has been written so that it keeps the preceding use cases and mindsets in mind. The methods, technologies, tools, and techniques selected here are a fine balance of industry-grade stability and academia-grade results quality. There are several tools, such as parfit, and Flashtext, and ideas such as LIME, that have never been written about in the context of NLP.

Lastly, I understand the importance and excitement of deep learning methods and have a dedicated chapter on deep learning for NLP methods.