Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins

Storing a conditional frequency distribution in Redis


The nltk.probability.ConditionalFreqDist class is a container for FreqDist instances, with one FreqDist per condition. It is used to count frequencies that depend on some condition, such as another word or a class label. We used this class in the Calculating high information words recipe in Chapter 7, Text Classification. Here, we'll create an API-compatible class on top of Redis using the RedisHashFreqDist from the previous recipe.
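
To make the idea of one frequency distribution per condition concrete, here is a minimal sketch that uses only the standard library. It is a stand-in for illustration, not NLTK's implementation: a defaultdict of Counter objects plays the role of ConditionalFreqDist, and the (label, word) pairs are invented sample data.

```python
from collections import Counter, defaultdict

# Stand-in for ConditionalFreqDist: each condition maps to its own counter,
# which is created the first time that condition is accessed.
cfd = defaultdict(Counter)

# Hypothetical (condition, sample) pairs: a class label and a word.
pairs = [
    ("pos", "great"),
    ("pos", "great"),
    ("pos", "fun"),
    ("neg", "dull"),
    ("neg", "great"),
]

for condition, sample in pairs:
    cfd[condition][sample] += 1

print(cfd["pos"]["great"])  # count of "great" under the "pos" condition
print(sorted(cfd))          # the conditions seen so far
```

The same word can have a different count under each condition, which is what the Redis-backed class in this recipe needs to preserve.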

Getting ready

As in the previous recipe, you'll need to have Redis and redis-py installed with an instance of redis-server running.
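
Before running the recipe, it can help to confirm that redis-py is importable and that a server is reachable. The following is a small sketch and not part of the book's code; the helper name redis_available and the default host and port (localhost, 6379) are assumptions.

```python
def redis_available(host="localhost", port=6379):
    """Return True if redis-py is installed and a server answers PING.

    Hypothetical helper; host and port defaults are assumptions.
    """
    try:
        import redis
    except ImportError:
        # redis-py is not installed.
        return False

    try:
        client = redis.Redis(
            host=host,
            port=port,
            socket_connect_timeout=1,  # fail fast if nothing is listening
            socket_timeout=1,
        )
        return bool(client.ping())
    except redis.exceptions.RedisError:
        # Covers connection and timeout errors raised by redis-py.
        return False


print(redis_available())
```

If this prints False, check that redis-server is running and that redis-py is installed in the active environment.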

How to do it...

We define a RedisConditionalHashFreqDist class in redisprob.py that extends nltk.probability.ConditionalFreqDist and overrides the __getitem__() method, so that each condition gets an instance of RedisHashFreqDist instead of a FreqDist:

from nltk.probability import ConditionalFreqDist
from rediscollections import encode_key

class RedisConditionalHashFreqDist...
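
As a rough illustration of the __getitem__() override described in this recipe, here is a self-contained sketch of the general pattern: create a custom per-condition container the first time a condition is accessed. Everything in it is hypothetical and it is not the code from redisprob.py: PrefixedCounts stands in for RedisHashFreqDist, a plain dict stands in for the Redis connection, and LazyConditionalCounts subclasses dict rather than ConditionalFreqDist so that it runs without NLTK or Redis.

```python
class PrefixedCounts:
    """Hypothetical stand-in for RedisHashFreqDist: counts live in a shared
    store under a per-condition name."""

    def __init__(self, store, name):
        self._store = store  # a plain dict standing in for a Redis client
        self._name = name    # plays the role of a Redis hash name

    def __getitem__(self, sample):
        # Missing samples count as zero, like a frequency distribution.
        return self._store.get(self._name, {}).get(sample, 0)

    def __setitem__(self, sample, count):
        self._store.setdefault(self._name, {})[sample] = count


class LazyConditionalCounts(dict):
    """Hypothetical stand-in for the conditional class: builds one
    PrefixedCounts per condition on first access."""

    def __init__(self, store, name):
        super().__init__()
        self._store = store
        self._name = name

    def __getitem__(self, condition):
        if condition not in self:
            # Derive a per-condition name so conditions do not collide.
            key = "%s:%s" % (self._name, condition)
            super().__setitem__(condition, PrefixedCounts(self._store, key))
        return super().__getitem__(condition)


store = {}
counts = LazyConditionalCounts(store, "demo")
counts["pos"]["great"] += 1
counts["pos"]["great"] += 1
counts["neg"]["great"] += 1
print(store)
```

The shared store ends up with one entry per condition, which mirrors the idea of one Redis hash per condition. The key scheme used here ("name:condition") is an assumption made for the sketch.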