Text Processing Using NLTK in Python [Video]

By: Krishna Bhavsar, V Naresh Kumar, Pratap Dangeti

Overview of this book

<p>Natural Language Processing (NLP) is a branch of Artificial Intelligence concerned with the interactions between computers and human (natural) languages. This course includes videos that teach various aspects of performing Natural Language Processing with NLTK, the leading Python platform for the task.</p> <p>In this course, you will learn what WordNet is and explore its features and usage. You will learn how to extract raw text from web sources and be introduced to some critical pre-processing steps. You will also become familiar with pattern matching as a way to do text analysis.</p> <p>By the end of the course, you will be confident working with a range of solutions covering natural language understanding, Natural Language Processing, and syntactic analysis.</p> <p>All the code and supporting files for this course are available on GitHub at <a style="color: #fa8d11;" href="https://github.com/PacktPublishing/Text-Processing-using-NLTK-in-Python" target="blank">https://github.com/PacktPublishing/Text-Processing-using-NLTK-in-Python</a>.</p> <h1>Style and Approach</h1> <p>This video course takes a solution-based approach in which every topic is explained with the help of a real-world example.</p>
Table of Contents (4 chapters)
Chapter 2
Raw Text, Sourcing, and Normalization
Section 4
Reading Word Documents in Python
In this video, we will see how to load and read Word/DOCX documents. The libraries available for reading DOCX documents are more comprehensive, in that they also expose paragraph boundaries, text styles, and what are called runs (stretches of text that share the same formatting).
- Create a new Python file named word.py
- Define the function getTextWord
- Read a DOCX document and print the full contents
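
The steps above could be sketched roughly as follows. This is a minimal illustration and not the course's actual code: it assumes the third-party python-docx package (the description does not name a library), and the helper names joinParagraphs and listRuns, the parameter name wordFileName, and the file name sample.docx are hypothetical.

```python
# word.py: a sketch, not the course's actual code.
# Assumption: the third-party python-docx package (pip install python-docx).


def joinParagraphs(paragraphs):
    """Join the .text of each paragraph-like object, one per line."""
    return "\n".join(p.text for p in paragraphs)


def getTextWord(wordFileName):
    """Return the full text of a DOCX file, one paragraph per line."""
    # Imported here so the module still loads if python-docx is missing.
    from docx import Document

    doc = Document(wordFileName)
    return joinParagraphs(doc.paragraphs)


def listRuns(wordFileName):
    """Return (paragraph style name, run text, bold, italic) for every run."""
    from docx import Document

    rows = []
    for para in Document(wordFileName).paragraphs:
        for run in para.runs:
            # run.bold / run.italic are True, False, or None (inherited).
            rows.append((para.style.name, run.text, run.bold, run.italic))
    return rows


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python word.py sample.docx
    if len(sys.argv) == 2 and sys.argv[1].lower().endswith(".docx"):
        print(getTextWord(sys.argv[1]))
```

Running `python word.py sample.docx` would print the document's text with one paragraph per line. Joining paragraphs with newlines keeps the paragraph boundaries visible in the output, while listRuns shows where the style and run information mentioned above can be read.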