Python 3 Text Processing with NLTK 3 Cookbook

By: Jacob Perkins

Spelling correction with Enchant


Replacing repeating characters is actually an extreme form of spelling correction. In this recipe, we will take on the less extreme case of correcting minor spelling issues using Enchant, a spelling correction API.
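To make the idea concrete, here is a minimal sketch of correcting a single word. It relies only on the `check()` and `suggest()` methods that PyEnchant's `enchant.Dict` objects provide. The helper name `correct_word` is hypothetical, and taking the first suggestion is a simplifying assumption, not necessarily how this recipe implements it.

```python
def correct_word(word, spell_dict):
    """Return word unchanged if spell_dict accepts it, else the first suggestion.

    spell_dict is any object with check(word) -> bool and
    suggest(word) -> list of str, which is the interface of enchant.Dict.
    """
    if spell_dict.check(word):
        return word
    suggestions = spell_dict.suggest(word)
    # Fall back to the original word when the dictionary offers nothing.
    return suggestions[0] if suggestions else word


if __name__ == "__main__":
    try:
        import enchant  # PyEnchant; needs the Enchant C library
    except ImportError:
        print("PyEnchant is not installed")
    else:
        # Only try the correction if an en_US dictionary is available.
        if enchant.dict_exists("en_US"):
            print(correct_word("cookbok", enchant.Dict("en_US")))
```

Because the helper only depends on `check()` and `suggest()`, it can be exercised with a small stand-in object before Enchant itself is installed.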

Getting ready

You will need to install Enchant and a dictionary for it to use. Enchant is an offshoot of the AbiWord open source word processor, and more information on it can be found at http://www.abisource.com/projects/enchant/.

For dictionaries, Aspell is a good open source spellchecker and dictionary that can be found at http://aspell.net/.

Finally, you will need the PyEnchant library, which can be found at http://pythonhosted.org/pyenchant/.

You should be able to install it with the easy_install command that comes with Python setuptools, for example by running sudo easy_install pyenchant on Linux or Unix. On a Mac, PyEnchant may be difficult to install. If you have difficulties, consult http://pythonhosted.org/pyenchant/download...
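Once the installation finishes, a quick check along the following lines can confirm that PyEnchant imports and that a dictionary is available. This is only a sketch: the function name `enchant_status` and the default `en_US` language tag are assumptions made for illustration, while `enchant.dict_exists()` and `enchant.list_languages()` are PyEnchant functions.

```python
def enchant_status(lang="en_US"):
    """Report whether PyEnchant imports and whether a dictionary for lang exists."""
    try:
        import enchant
    except ImportError as exc:
        # Raised when PyEnchant or the underlying Enchant C library is missing.
        return {"installed": False, "has_dict": False, "detail": str(exc)}
    return {
        "installed": True,
        "has_dict": enchant.dict_exists(lang),
        # Languages for which Enchant found a dictionary on this machine.
        "detail": ", ".join(enchant.list_languages()),
    }


if __name__ == "__main__":
    print(enchant_status())
```

If `installed` is true but `has_dict` is false, the library is present but no dictionary for that language was found, which usually points to a missing dictionary package such as Aspell's.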