Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins


Distributed chunking with execnet


In this recipe, we'll do tagging and chunking over an execnet gateway. This is very similar to the tagging in the previous recipe, but we'll send two objects instead of one and receive a Tree instead of a list. Unlike a plain list, the Tree must be pickled before it is sent and unpickled once it is received.
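To make the serialization step concrete, here is a minimal sketch of a pickle round trip that uses only the standard library. The nested structure is a hypothetical stand-in for a chunk tree, not an actual NLTK Tree, and the sample words and tags are illustrative.

```python
import pickle

# Hypothetical stand-in for a chunk tree: (label, children) pairs with
# (word, tag) leaves. A real NLTK Tree is a custom class; this is not one.
chunk_tree = ("S", [("NP", [("Pierre", "NNP"), ("Vinken", "NNP")]), (",", ",")])

# Serialize to bytes, which is what would travel over a channel.
payload = pickle.dumps(chunk_tree)

# Rebuild an equal object on the receiving side.
restored = pickle.loads(payload)

print(type(payload).__name__)
print(restored == chunk_tree)
```

The same `pickle.dumps()` and `pickle.loads()` pair applies to any picklable object, which is why it suits objects that cannot be sent over a channel as they are.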

Getting ready

As in the previous recipe, you must have execnet installed.

How to do it...

The setup code is very similar to the last recipe, and we'll use the same pickled tagger as well. First, we'll pickle the default chunker used by nltk.chunk.ne_chunk(), though any chunker would do. Next, we make a gateway for the remote_chunk module, get a channel, and send the pickled tagger and chunker over. Then, we receive a pickled Tree, which we can unpickle and inspect to see the result. Finally, we exit the gateway:

>>> import execnet, remote_chunk
>>> import nltk.data, nltk.tag, nltk.chunk
>>> import pickle
>>> from nltk.corpus import treebank_chunk...
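The message flow described above can be sketched without execnet or NLTK. In this simulation, two `queue.Queue` objects stand in for the two directions of a channel, a dict and a set stand in for the pickled tagger and chunker, and `remote_chunk_worker` is a hypothetical name, not the recipe's `remote_chunk` module. Sending a sentence as a third message is also an assumption, following the pattern of the previous recipe.

```python
import pickle
import queue


def remote_chunk_worker(inbox, outbox):
    """Hypothetical remote side: unpickle two objects, then chunk one sentence."""
    tagger = pickle.loads(inbox.get())    # first message: pickled tagger stand-in
    chunker = pickle.loads(inbox.get())   # second message: pickled chunker stand-in
    words = inbox.get()                   # third message: a plain list of words

    # Tag each word, defaulting to "NN" for words the lookup does not know.
    tagged = [(word, tagger.get(word, "NN")) for word in words]

    # Group runs of consecutive chunkable tags into ["NP", [leaves]] subtrees.
    children = []
    for leaf in tagged:
        if leaf[1] in chunker:
            if children and isinstance(children[-1], list):
                children[-1][1].append(leaf)
            else:
                children.append(["NP", [leaf]])
        else:
            children.append(leaf)

    # The result is pickled before being sent back, as the recipe does for its Tree.
    outbox.put(pickle.dumps(["S", children]))


# Local side: send the pickled tagger and chunker, then a sentence.
to_remote, from_remote = queue.Queue(), queue.Queue()
tagger = {"Pierre": "NNP", "Vinken": "NNP", "will": "MD", "join": "VB"}
chunker = {"NNP"}  # tags that this toy chunker groups into NP chunks

to_remote.put(pickle.dumps(tagger))
to_remote.put(pickle.dumps(chunker))
to_remote.put(["Pierre", "Vinken", "will", "join"])

remote_chunk_worker(to_remote, from_remote)

# Receive the pickled result and unpickle it for inspection.
chunk_tree = pickle.loads(from_remote.get())
print(chunk_tree)
```

The order of the messages matters: the receiving side reads the tagger first and the chunker second, so the sender must put them on the channel in that same order.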