Book Image

Modern Python Cookbook

Book Image

Modern Python Cookbook

Overview of this book

Python is the preferred choice of developers, engineers, data scientists, and hobbyists everywhere. It is a great scripting language that can power your applications and provide great speed, safety, and scalability. By exposing Python as a series of simple recipes, you can gain insight into specific language features in a particular context. Having a tangible context helps make the language or standard library feature easier to understand. This book comes with over 100 recipes on the latest version of Python. The recipes will benefit everyone ranging from beginner to an expert. The book is broken down into 13 chapters that build from simple language concepts to more complex applications of the language. The recipes will touch upon all the necessary Python concepts related to data structures, OOP, functional programming, as well as statistical programming. You will get acquainted with the nuances of Python syntax and how to effectively use the advantages that it offers. You will end the book equipped with the knowledge of testing, web services, and configuration and application integration tips and tricks. The recipes take a problem-solution approach to resolve issues commonly faced by Python programmers across the globe. You will be armed with the knowledge of creating applications with flexible logging, powerful configuration, and command-line options, automated unit tests, and good documentation.
Table of Contents (18 chapters)
Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Building complex strings from lists of characters


How can we make very complex changes to an immutable string? Can we assemble a string from individual characters?

In most cases, the recipes we've already seen give us a number of tools for creating and modifying strings. There are yet more ways in which we can tackle the string manipulation problem. We'll look at using a list object. This will dovetail with some of the recipes in Chapter 4, Built-in Data Structures – list, set, dict.

Getting ready

Here's a string that we'd like to rearrange:

>>> title = "Recipe 5: Rewriting an Immutable String"

We'd like to do two transformations:

  • Remove the part before the :
  • Replace the punctuation with _, and make all the characters lowercase

We'll make use of the string module:

>>> from string import whitespace, punctuation

This has two important constants:

  • string.whitespace lists all of the common whitespace characters, including space and tab
  • string.punctuation lists the common ASCII punctuation marks. Unicode has a larger list of punctuation marks; that's also available based on your locale settings

How to do it...

We can work with a string exploded into a list. We'll look at lists in more depth in Chapter 4, Built-in Data Structures – list, set, dict.

  1. Explode the string into a list object:
>>> title_list = list(title)
  1. Find the partition character. The index() method for a list has the same semantics as the index() method for a list. It locates the position with the given value:
>>> colon_position = title_list.index(':')
  1. Delete the characters no longer needed. The del statement can remove items from a list. Lists are a mutable data structures:
>>> del title_list[:colon_position+1]

We don't need to carefully work with the useful piece of the original string. We can remove items from a list.

  1. Replace punctuation by stepping through each position. In this case, we'll use a for statement to visit every index in the string:
>>> for position in range(len(title_list)):...    if title_list[position] in whitespace+punctuation:...        title_list[position]= '_'
  1. The expression range(len(title_list)) generates all of the values between 0 and len(title_list)-1. This assures us that the value of position will be each value index in the list. Join the list of characters to create a new string. It seems a little odd to use zero-length string, '', as a separator when concatenating strings together. However, it works perfectly:
>>> title = ''.join(title_list)>>> title'_Rewriting_an_Immutable_String'

We assigned the resulting string back to the original variable. The original string object, which had been referred to by that variable, is no longer needed: it's removed from memory. The new string object replaces the value of the variable.

How it works...

This is a change in representation trick. Since a string is immutable, we can't update it. We can, however, convert it into a mutable form; in this case, a list. We can do whatever changes are required to the mutable list object. When we're done, we can change the representation from a list back to a string.

Strings provide a number of features that lists don't. Conversely, strings provide a number of features a list doesn't have. We can't convert a list to lowercase the way we can convert a string.

There's an important trade-off here:

  • Strings are immutable, that makes them very fast. Strings are focused on Unicode characters. When we look at mappings and sets, we can use strings as keys for mappings and items in sets because the value is immutable.
  • Lists are mutable. Operations are slower. Lists can hold any kind of item. We can't use a list as a key for a mapping or an item in a set because the value could change.

Strings and lists are both specialized kinds of sequences. Consequently, they have a number of common features. The basic item indexing and slicing features are shared. Similarly a list uses the same kind of negative index values that a string does: list[-1] is the last item in a list object.

We'll return to mutable data structures in Chapter 4, Built-in Data Structures – list, set, dict.

There's more

Once we've started working with a list of characters instead of a string, we no longer have the string processing methods. We do have a number of list-processing techniques available to us. In addition to being able to delete items from a list, we can append an item, extend a list with another list, and insert a character into the list.

We can also change our viewpoint slightly, and look at a list of strings instead of a list of characters. The technique of doing ''.join(list) will work when we have a list of strings as well as a list of characters. For example, we might do this:

>>> title_list.insert(0, 'prefix')>>> ''.join(title_list)'prefix_Rewriting_an_Immutable_String'

Our title_list object will be mutated into a list that contains a six-character string, prefix, plus 30 individual characters.

See also

  • We can also work with strings using the internal methods of a string. See the Rewriting an immutable string recipe for more techniques.
  • Sometimes, we need to build a string, and then convert it into bytes. See the Encoding strings – creating ASCII and UTF-8 bytes recipe for how we can do this.
  • Other times, we'll need to convert bytes into a string. See the Decoding Bytes - How to get proper characters from some bytes recipe.