Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins

Expanding and removing chunks with regular expressions


There are three RegexpChunkRule subclasses that are not supported by RegexpChunkRule.fromstring() or RegexpParser, and therefore must be created manually if you want to use them. These rules are as follows:

  • ExpandLeftRule: Add unchunked (chink) words to the left of a chunk

  • ExpandRightRule: Add unchunked (chink) words to the right of a chunk

  • UnChunkRule: Unchunk any matching chunk
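Because these rules cannot be written in a grammar string, they have to be instantiated directly and handed to a RegexpChunkParser. The following is a minimal sketch of UnChunkRule, assuming NLTK 3 is installed; the sentence, its tags, and the rule descriptions are made up for illustration and are not taken from the recipe.

```python
from nltk.tree import Tree
from nltk.chunk.regexp import ChunkRule, UnChunkRule, RegexpChunkParser

# Hypothetical hand-tagged sentence (Penn Treebank tags), for illustration only.
sent = [('the', 'DT'), ('fish', 'NN'), ('ate', 'VBD'), ('rice', 'NN')]

rules = [
    # First chunk every noun, with an optional determiner in front of it.
    ChunkRule('<DT>?<NN>', 'chunk optional determiner + noun'),
    # Then undo any chunk that consists of a bare noun only.
    UnChunkRule('<NN>', 'unchunk bare nouns'),
]

# Rules are applied in order; the default chunk label is NP.
chunker = RegexpChunkParser(rules)
unchunked_tree = chunker.parse(Tree('S', sent))
print(unchunked_tree)
```

With these assumed rules, the determiner-plus-noun chunk should survive, while the chunk around the lone noun is dissolved back into a chink, since UnChunkRule only matches a chunk whose entire contents fit the pattern.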

How to do it...

ExpandLeftRule and ExpandRightRule each take two patterns and a description as arguments. For ExpandLeftRule, the left pattern is the chink we want to add to the beginning of the chunk, while the right pattern matches the beginning of the chunk we want to expand. For ExpandRightRule, the left pattern matches the end of the chunk we want to expand, and the right pattern matches the chink we want to add to the end of the chunk. The idea is similar to the MergeRule class, but in this case, we're merging chink words instead of other...
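
The argument order of the two expansion rules can be sketched as follows. This is an illustrative example assuming NLTK 3 is installed, not necessarily the recipe's own code; the sentence, tags, and descriptions are hypothetical.

```python
from nltk.tree import Tree
from nltk.chunk.regexp import (
    ChunkRule,
    ExpandLeftRule,
    ExpandRightRule,
    RegexpChunkParser,
)

# Hypothetical hand-tagged sentence (Penn Treebank tags), for illustration only.
sent = [('the', 'DT'), ('sushi', 'NN'), ('rolls', 'NNS'),
        ('were', 'VBD'), ('fresh', 'JJ')]

rules = [
    # Seed chunk: a single noun.
    ChunkRule('<NN>', 'chunk single nouns'),
    # Left pattern = chink to absorb, right pattern = start of the chunk.
    ExpandLeftRule('<DT>', '<NN>', 'pull a determiner in from the left'),
    # Left pattern = end of the chunk, right pattern = chink to absorb.
    ExpandRightRule('<NN>', '<NNS>', 'pull a plural noun in from the right'),
]

# Rules are applied in order; the default chunk label is NP.
chunker = RegexpChunkParser(rules)
expanded_tree = chunker.parse(Tree('S', sent))
print(expanded_tree)
```

Under these assumptions, the initial one-word chunk around the noun should grow in both directions, so the resulting NP covers the determiner, the noun, and the plural noun, while the verb and adjective stay outside the chunk.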