Python 2.6 Text Processing: Beginner's Guide

By: Jeff McNeil

Overview of this book

For programmers, working with text is not about reading their newspaper on a break; it's about taking textual data in one form and doing something to it. Extract, decrypt, parse, restructure – these are just some of the text tasks that can occupy much of a programmer's life. If this is your life, this book will make it better – a practical guide on how to do what you want with textual data in Python.

Python 2.6 Text Processing Beginner's Guide is the easiest way to learn how to manipulate text with Python. Packed with examples, it will teach you text processing techniques and give you the skills to work with the most popular Python libraries for transforming text from one form to another.

The book gets you going with a quick look at some data formats, and installing the supporting libraries and components so that you're ready to get started. You move on to extracting text from a collection of sources and handling it using Python's built-in string functions and regular expressions. You look into processing structured text documents such as XML and HTML, JSON, and CSV. Then you progress to generating documents and creating templates. Finally you look at ways to enhance text output via a collection of third-party packages such as Nucular, PyParsing, NLTK, and Mako.

Chapter 3: Python String Services


String literals

  1. Yes, it's possible, and the use cases are largely the same. There's one caveat, though: you must write the prefix as ur'string', not ru'string'. Remember, the raw prefix only affects how the literal is interpreted, whereas the Unicode prefix produces an entirely different data type.

  2. Strings promote to the "widest" type. For example, Unicode + Unicode is Unicode, and Unicode + String is also Unicode.

  3. The exception would have been handled in the default fashion: our application would terminate and Python would print a traceback.
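
The first two answers above can be checked with a short sketch. The sample strings are invented for illustration and are not taken from the chapter. The code is written to run unchanged on Python 2.6 and on Python 3; note that on Python 3 every str is already Unicode, so the promotion check passes trivially there, and the ur'...' prefix discussed in answer 1 is a Python 2 feature that is left out for that reason.

```python
# The r prefix changes only how the literal is read: the backslash stays literal.
raw = r'a\nb'      # four characters: a, backslash, n, b
cooked = 'a\nb'    # three characters: a, newline, b
raw_lengths = (len(raw), len(cooked))

# Mixing a Unicode string with a plain string yields the Unicode type.
# On Python 2 the plain string must be ASCII for the implicit decode to work.
mixed = u'caf\xe9' + ' au lait'
promoted = type(mixed) is type(u'')

print(raw_lengths)
print(promoted)
```

The two literals differ only in how the backslash is read, while the concatenation result takes the type of the Unicode operand.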

String formatting

  1. Essentially, anywhere you print the same string repeatedly with different values in fixed positions. It's also useful when creating longer strings such as e-mail message content; you can save your template in an external file and read it using Python's file I/O mechanisms.

  2. If you have an existing dictionary, you can pass it to a string's format method by prefixing it with two asterisks, which unpacks it into keyword arguments.

    Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits", or "license" for more information.
    >>> d = {'a': 1, 'b': 2}
    >>> '{a}/{b} = Half'.format(**d)
    '1/2 = Half'
    >>>

  3. The answer in this case is the string '12'. When applied to strings, the + operator performs concatenation.
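
The formatting answers above can be combined into one minimal sketch: a template is saved to an external file, read back, and filled in from dictionaries with the two-asterisk form. The template text, the field names (name, order_id), and the render helper are all hypothetical, chosen only for illustration. Named fields are used so the code also runs on Python 2.6.

```python
import os
import tempfile

# Hypothetical e-mail template; the field names are invented for this sketch.
TEMPLATE = "Dear {name},\nYour order {order_id} has shipped.\n"


def render(template_path, values):
    """Read a template file and fill its named fields from a dictionary."""
    with open(template_path) as handle:
        template = handle.read()
    # ** unpacks the dictionary into keyword arguments for format().
    return template.format(**values)


# Store the template in a temporary file to stand in for an external file.
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    with os.fdopen(fd, "w") as handle:
        handle.write(TEMPLATE)
    rows = [
        {"name": "Ann", "order_id": 17},
        {"name": "Bob", "order_id": 42},
    ]
    # Same template, different values in the same positions.
    messages = [render(path, row) for row in rows]
finally:
    os.remove(path)

print(messages[0])
```

Keeping the template in a file means the message wording can change without touching the code that supplies the values.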