Text Processing Using NLTK in Python [Video]

By: Krishna Bhavsar, V Naresh Kumar, Pratap Dangeti

Overview of this book

<p>Natural Language Processing (NLP) is a branch of Artificial Intelligence concerned with the interactions between computers and human (natural) languages. This course uses a series of videos to teach various aspects of performing Natural Language Processing with NLTK, the leading Python platform for the task.</p> <p>In this course, you will learn what WordNet is and explore its features and usage. You will learn how to extract raw text from web sources and be introduced to some critical pre-processing steps. You will also become familiar with pattern matching as a way to do text analysis.</p> <p>By the end of the course, you will be confident working with solutions that cover natural language understanding, Natural Language Processing, and syntactic analysis.</p> <p>All the code and supporting files for this course are available on GitHub at <a style="color: #fa8d11;" href="https://github.com/PacktPublishing/Text-Processing-using-NLTK-in-Python" target="blank">https://github.com/PacktPublishing/Text-Processing-using-NLTK-in-Python</a></p> <h1>Style and Approach</h1> <p>This video course takes a solution-based approach in which every topic is explained with the help of a real-world example.</p>
Table of Contents (4 chapters)
Chapter 3
Pre-Processing
Section 4
Stopwords – Learning to Use the Stopwords Corpus
We will be using the Gutenberg corpus as an example in this recipe. The Gutenberg corpus is part of the NLTK data module. It contains a selection of 18 texts from the roughly 25,000 electronic books in the Project Gutenberg text archive.
- Print the names of all 18 Gutenberg texts
- Apply a small preprocessing step to the list of all words from the corpus
- Access nltk.corpus.stopwords and use it to filter stopwords out of the word list
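The steps of this recipe can be sketched roughly as follows. This is a minimal illustration, not the course's own code: the helper names (`preprocess`, `remove_stopwords`, `run_recipe`), the choice of `austen-emma.txt`, and the three-character minimum length are all assumptions made for the example. `run_recipe` needs NLTK installed and downloads the `gutenberg` and `stopwords` data on first use.

```python
def preprocess(words, min_len=3):
    """Lowercase tokens; keep only alphabetic ones with at least min_len characters.

    The min_len cutoff is an assumed preprocessing choice, not taken from the course.
    """
    return [w.lower() for w in words if w.isalpha() and len(w) >= min_len]


def remove_stopwords(words, stop_words):
    """Return the tokens that do not appear in stop_words."""
    stop_set = set(stop_words)  # set membership keeps the filter fast
    return [w for w in words if w not in stop_set]


def run_recipe(fileid="austen-emma.txt"):
    """Run the three steps against the real NLTK corpora (hypothetical driver)."""
    import nltk
    from nltk.corpus import gutenberg, stopwords

    # Fetch the corpus data if it is not already installed locally.
    nltk.download("gutenberg", quiet=True)
    nltk.download("stopwords", quiet=True)

    # Step 1: print the names of all Gutenberg texts shipped with NLTK.
    print(gutenberg.fileids())

    # Step 2: a small preprocessing pass over all words of one text.
    words = preprocess(gutenberg.words(fileid))

    # Step 3: filter with the English list from nltk.corpus.stopwords.
    filtered = remove_stopwords(words, stopwords.words("english"))

    print(len(words), "words before stopword removal")
    print(len(filtered), "words after stopword removal")
    return filtered
```

Calling `run_recipe()` prints the file names and the word counts before and after filtering. The two helper functions take plain lists, so they can also be tried on a hand-written token list without downloading any data.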