Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins

Chapter 9. Parsing Specific Data Types

In this chapter, we will cover the following recipes:

  • Parsing dates and times with dateutil

  • Timezone lookup and conversion

  • Extracting URLs from HTML with lxml

  • Cleaning and stripping HTML

  • Converting HTML entities with BeautifulSoup

  • Detecting and converting character encodings