Text Processing Using NLTK in Python [Video]

By: Krishna Bhavsar, V Naresh Kumar, Pratap Dangeti

Overview of this course

Natural Language Processing (NLP) is a branch of Artificial Intelligence concerned with the interactions between computers and human (natural) languages. This course teaches various aspects of performing Natural Language Processing with NLTK, the leading Python platform for the task. You will learn what WordNet is and explore its features and usage. The course shows how to extract raw text from web sources and introduces some critical pre-processing steps. You will also become familiar with pattern matching as a way to do text analysis. By the end of the course, you will have worked through a range of solutions covering natural language understanding, Natural Language Processing, and syntactic analysis.

All the code and supporting files for this course are available on GitHub at https://github.com/PacktPublishing/Text-Processing-using-NLTK-in-Python

Style and Approach

This video course takes a solution-based approach in which every topic is explained with the help of a real-world example.

Corpus and WordNet

The Course Overview

Accessing In-Built Corpora

Downloading an External Corpus

Counting All the wh-words

Frequency Distribution Operations

WordNet

The Concepts of Hyponyms and Hypernyms Using WordNet

Compute the Average Polysemy According to WordNet

Raw Text, Sourcing, and Normalization

The Importance of String Operations

Getting Deeper with String Operations

Reading a PDF File in Python

Reading Word Documents in Python

Creating a User-Defined Corpus

Reading Contents from an RSS Feed

HTML Parsing Using BeautifulSoup

Pre-Processing

Tokenization – Learning to Use the Inbuilt Tokenizers of NLTK

Stemming – Learning to Use the Inbuilt Stemmers of NLTK

Lemmatization – Learning to Use the WordNetLemmatizer of NLTK

Stopwords – Learning to Use the Stopwords Corpus

Edit Distance – Writing Your Own Algorithm to Find Edit Distance Between Two Strings

Processing Two Short Stories and Extracting the Common Vocabulary

Regular Expressions

Regular Expression – Learning to Use *, +, and ?

Regular Expression – Learning to Use Non-Start and Non-End of Word

Searching Multiple Literal Strings and Substrings Occurrences

Creating Date Regex

Making Abbreviations

Learning to Write Your Own Regex Tokenizer

Learning to Write Your Own Regex Stemmer

Chapter 3

Pre-Processing

Section 4

Stopwords – Learning to Use the Stopwords Corpus

This recipe uses the Gutenberg corpus as an example. The Gutenberg corpus is part of the NLTK data module and contains a selection of 18 texts from the roughly 25,000 electronic books in the Project Gutenberg text archive. In this recipe, we will:
- Print the names of all 18 Gutenberg texts
- Apply a small preprocessing step to the list of all words from the corpus
- Access nltk.corpus.stopwords and use it to filter stopwords out of that list
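The steps in this recipe can be sketched roughly as follows. This is a minimal illustration, not the course's own code: the helper names `filter_stopwords` and `run_on_gutenberg`, the choice of `bible-kjv.txt` as the sample text, and the rule of dropping tokens shorter than three characters are all assumptions made for the example. The NLTK calls used (`gutenberg.fileids()`, `gutenberg.words()`, `stopwords.words("english")`) are part of the standard corpus readers.

```python
def filter_stopwords(words, stopword_set, min_length=3):
    """Lowercase the words, drop short or non-alphabetic tokens,
    then drop anything found in stopword_set.

    min_length=3 is an illustrative threshold, not a value from the course.
    """
    cleaned = [w.lower() for w in words if w.isalpha() and len(w) >= min_length]
    return [w for w in cleaned if w not in stopword_set]


def run_on_gutenberg(fileid="bible-kjv.txt"):
    """Apply the filter to one Gutenberg text (fileid is an assumed example)."""
    # Imported here so filter_stopwords stays usable without NLTK installed.
    import nltk
    from nltk.corpus import gutenberg, stopwords

    # Fetch the corpus data if it is not already available locally.
    nltk.download("gutenberg", quiet=True)
    nltk.download("stopwords", quiet=True)

    # Step 1: print the names of the Gutenberg texts.
    print(gutenberg.fileids())

    # Step 2: take all words of one text; preprocessing happens in the helper.
    words = gutenberg.words(fileid)

    # Step 3: load the English stopword list and filter.
    english_stopwords = set(stopwords.words("english"))
    filtered = filter_stopwords(words, english_stopwords)

    print("Words before filtering:", len(words))
    print("Words after filtering:", len(filtered))
    return filtered
```

Calling `run_on_gutenberg()` downloads the two corpora on first use and prints the word counts before and after filtering. Converting the stopword list to a set keeps each membership check fast on a text of this size.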
