Python 2.6 Text Processing: Beginner's Guide

By: Jeff McNeil

Overview of this book

For programmers, working with text is not about reading their newspaper on a break; it's about taking textual data in one form and doing something to it. Extract, decrypt, parse, restructure: these are just some of the text tasks that can occupy much of a programmer's life. If this is your life, this book will make it better. It is a practical guide on how to do what you want with textual data in Python.

Python 2.6 Text Processing Beginner's Guide is the easiest way to learn how to manipulate text with Python. Packed with examples, it will teach you text processing techniques and give you the skills to work with the most popular Python libraries for transforming text from one form to another.

The book gets you going with a quick look at some data formats, and installing the supporting libraries and components so that you're ready to get started. You move on to extracting text from a collection of sources and handling it using Python's built-in string functions and regular expressions. You look into processing structured text documents such as XML and HTML, JSON, and CSV. Then you progress to generating documents and creating templates. Finally, you look at ways to enhance text output via a collection of third-party packages such as Nucular, PyParsing, NLTK, and Mako.

Time for action – installing SetupTools


Egg files have largely become the de facto standard in Python packaging. To install, develop, and build egg files, we need to install a third-party toolkit. The most popular is SetupTools, and this is what we'll be working with throughout this book. The installation process is straightforward and self-contained. Installing SetupTools gives us access to the easy_install command, which automates the download and installation of packages that have been registered with PyPI.

  1. Download the installation script, which is available at http://peak.telecommunity.com/dist/ez_setup.py. This same script will be used for all versions of Python.

  2. As an administrative user, run the ez_setup.py script from the command line. If you've executed the script with the proper rights, the SetupTools installation will complete and you should see output similar to the following:

    # python ez_setup.py 
    Downloading http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c11-py2.6.egg
    Processing setuptools-0.6c11-py2.6.egg
    creating /usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg
    Extracting setuptools-0.6c11-py2.6.egg to /usr/lib/python2.6/site-packages
    Adding setuptools 0.6c11 to easy-install.pth file
    Installing easy_install script to /usr/bin
    Installing easy_install-2.6 script to /usr/bin
    
    Installed /usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg
    Processing dependencies for setuptools==0.6c11
    Finished processing dependencies for setuptools==0.6c11
    #
    
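The egg file named in the output above packs several pieces of information into its filename. As a rough illustration, the following sketch pulls them apart with a regular expression. The helper name parse_egg_filename and the pattern are hypothetical, and the assumed layout (name, version, Python version, and an optional platform tag, separated by hyphens) is a simplification rather than SetupTools' own parsing logic.

```python
import re

# Assumed layout (a simplification): <name>-<version>-py<X.Y>[-<platform>].egg
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)"            # project name, assumed to contain no hyphens
    r"-(?P<version>[^-]+)"         # version string, such as 0.6c11
    r"-py(?P<pyver>\d+\.\d+)"      # target Python version, such as 2.6
    r"(?:-(?P<platform>.+))?"      # optional platform tag
    r"\.egg$"
)


def parse_egg_filename(filename):
    """Return a dict of filename parts, or None if the name does not match."""
    match = EGG_NAME.match(filename)
    if match is None:
        return None
    return match.groupdict()


info = parse_egg_filename("setuptools-0.6c11-py2.6.egg")
print("%s %s (Python %s)" % (info["name"], info["version"], info["pyver"]))
```

For an egg without a platform tag, such as the one installed here, the platform entry in the returned dictionary is None.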

What just happened?

We downloaded the SetupTools installation script and executed it as an administrative user. By doing so, our system Python environment was configured so that we can install egg files in the future via the SetupTools easy_install system.
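
One way to confirm that the installation took effect is to check whether the setuptools module can be imported by the same interpreter. The following is a minimal sketch of such a check; the helper name is_installed is our own and is not part of SetupTools.

```python
def is_installed(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if is_installed("setuptools"):
    import setuptools
    # Fall back to "unknown" in case the module does not expose a version.
    version = getattr(setuptools, "__version__", "unknown")
    print("SetupTools %s is available" % version)
else:
    print("SetupTools is not importable; try running ez_setup.py again")
```

The check only tells us about the interpreter that runs it, so it should be run with the same python executable that was used to run ez_setup.py.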

Note

SetupTools does not currently work with Python 3.0. There is, however, an alternative available via the Distribute project. Distribute is intended to be a drop-in replacement for SetupTools and will work with either major Python version. For more information, or to download the installer, visit http://pypi.python.org/pypi/distribute.