Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins


Storing a frequency distribution in Redis


The nltk.probability.FreqDist class is used throughout NLTK to store and manage frequency distributions. It is quite useful, but it lives entirely in memory and offers no way to persist its data. A single FreqDist is also not accessible to multiple processes. We can change all that by building a FreqDist on top of Redis.

Redis is a data structure server and one of the more popular NoSQL databases. Among other things, it provides network-accessible storage for dictionaries (also known as hash maps). Building a FreqDist interface to a Redis hash map will allow us to create a persistent FreqDist that is accessible to multiple local and remote processes at the same time.
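To make the idea concrete, here is a minimal sketch of a FreqDist-like wrapper over a single hash. It is not the recipe's actual implementation: the class names HashFreqDist and FakeHashClient and the add method are hypothetical, and the in-memory FakeHashClient only stands in for a real client so the example runs without a server. The method names it exposes (hincrby, hget, hkeys, hvals) mirror the hash commands that redis-py provides, so a real redis-py client could presumably be passed in instead.

```python
class FakeHashClient:
    """In-memory stand-in (hypothetical) for the hash methods used below."""

    def __init__(self):
        self._hashes = {}

    def hincrby(self, name, key, amount=1):
        fields = self._hashes.setdefault(name, {})
        fields[key] = fields.get(key, 0) + amount
        return fields[key]

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def hkeys(self, name):
        return list(self._hashes.get(name, {}))

    def hvals(self, name):
        return list(self._hashes.get(name, {}).values())


class HashFreqDist:
    """Hypothetical FreqDist-like view of one hash held by `client`."""

    def __init__(self, client, name):
        self._client = client
        self._name = name

    def add(self, sample, count=1):
        # Delegate the increment to the store; a missing field starts at 0.
        return self._client.hincrby(self._name, sample, count)

    def __getitem__(self, sample):
        value = self._client.hget(self._name, sample)
        # A server may return bytes or str; int() accepts both.
        return 0 if value is None else int(value)

    def N(self):
        # Total number of recorded outcomes, like FreqDist.N().
        return sum(int(v) for v in self._client.hvals(self._name))

    def samples(self):
        return self._client.hkeys(self._name)


fd = HashFreqDist(FakeHashClient(), "words")
for word in ["the", "cat", "the"]:
    fd.add(word)
```

Because every count lives in the store rather than in the Python object, any process holding a client for the same hash name would see the same counts. With a real server, keys typically come back as bytes unless the client is configured to decode responses.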

Note

Most Redis operations are atomic, so it's even possible to have multiple processes write to the FreqDist concurrently.
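The following sketch illustrates why atomic increments matter for concurrent writers. It is a single-process simulation using a plain dict, not Redis itself: the interleaving of the two writers is written out by hand, and atomic_incr is a hypothetical helper standing in for a server-side increment such as Redis's HINCRBY command.

```python
store = {"the": 0}

# Non-atomic pattern: each writer reads, adds locally, then writes back.
seen_by_a = store["the"]      # writer A reads 0
seen_by_b = store["the"]      # writer B reads 0 before A has written
store["the"] = seen_by_a + 1  # A writes 1
store["the"] = seen_by_b + 1  # B also writes 1, so A's update is lost
lost_update_total = store["the"]


def atomic_incr(counts, key, amount=1):
    """Hypothetical stand-in for an increment applied as one indivisible step."""
    counts[key] = counts.get(key, 0) + amount
    return counts[key]


# Atomic pattern: each writer sends one increment; none can be overwritten.
store["the"] = 0
atomic_incr(store, "the")  # writer A
atomic_incr(store, "the")  # writer B
atomic_total = store["the"]
```

With read-then-write, two writers can both start from the same stale value and one count disappears. When the increment is a single operation performed by the store, both updates are applied regardless of how the writers interleave.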

Getting ready

For this and the subsequent recipes, we need to install both Redis and redis-py. The Redis website is at http://redis...