Creating complex strings is, in many ways, the polar opposite of parsing a complex string. We generally find that we'll use a template with substitution rules to put data into a more complex format.
Let's say we have pieces of data that we need to turn into a nicely formatted message. We might have data including the following:
>>> id = "IAD"
>>> location = "Dulles Intl Airport"
>>> max_temp = 32
>>> min_temp = 13
>>> precipitation = 0.4
And we'd like a line that looks like this:
IAD : Dulles Intl Airport :  32 /  13 /  0.40
- Create a template string from the result, replacing all of the data items with {} placeholders. Inside each placeholder, put the name of the data item.
'{id} : {location} : {max_temp} / {min_temp} / {precipitation}'
- For each data item, append :data type information to the placeholders in the template string. The basic data type codes are s for string, d for decimal number, and f for floating-point number.
It would look like this:
'{id:s} : {location:s} : {max_temp:d} / {min_temp:d} / {precipitation:f}'
- Add length information where required. Length is not always required, and in some cases, it's not even desirable. In this example, though, the length information ensures that each message has a consistent format. For strings and decimal numbers, prefix the format with the length like this: 19s or 3d. For floating-point numbers, use a two-part prefix like this: 5.2f to specify a total length of five characters with two to the right of the decimal point. Here's the whole format:
'{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'
- Use the format() method of this string to create the final string:
>>> '{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format(
...     id=id, location=location, max_temp=max_temp,
...     min_temp=min_temp, precipitation=precipitation
... )
'IAD : Dulles Intl Airport :  32 /  13 /  0.40'
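The effect of each step is easier to see when the three versions of the template are applied to the same data. The following is a minimal sketch using the values from this example; the variable names names_only, with_types, and with_widths are ours, chosen only for illustration.

```python
# The data items from the example above, collected as keyword values.
data = dict(
    id="IAD", location="Dulles Intl Airport",
    max_temp=32, min_temp=13, precipitation=0.4,
)

# Step 1: placeholders with names only.
names_only = "{id} : {location} : {max_temp} / {min_temp} / {precipitation}"
# Step 2: names plus data type codes.
with_types = "{id:s} : {location:s} : {max_temp:d} / {min_temp:d} / {precipitation:f}"
# Step 3: names, lengths, and data type codes.
with_widths = "{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}"

results = [t.format(**data) for t in (names_only, with_types, with_widths)]
for line in results:
    print(line)
# IAD : Dulles Intl Airport : 32 / 13 / 0.4
# IAD : Dulles Intl Airport : 32 / 13 / 0.400000
# IAD : Dulles Intl Airport :  32 /  13 /  0.40
```

Without a precision, the f code falls back to six digits after the decimal point, which is why the second line is longer than we want.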
We've provided all of the variables by name in the format() method of the template string. This can get tedious. In some cases, we might want to build a dictionary object with the variables. In that case, we can use the format_map() method:
>>> data = dict(
...     id=id, location=location, max_temp=max_temp,
...     min_temp=min_temp, precipitation=precipitation
... )
>>> '{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(data)
'IAD : Dulles Intl Airport :  32 /  13 /  0.40'
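A dictionary-based template pays off when the same format is applied to many records. The following sketch assumes the records arrive as a list of dictionaries; the second row (XYZ, Example Field) is made-up data added only to show the column alignment.

```python
template = (
    "{id:3s} : {location:19s} : "
    "{max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}"
)

rows = [
    {"id": "IAD", "location": "Dulles Intl Airport",
     "max_temp": 32, "min_temp": 13, "precipitation": 0.4},
    # Hypothetical second record, not part of the original example.
    {"id": "XYZ", "location": "Example Field",
     "max_temp": 7, "min_temp": -2, "precipitation": 1.25},
]

# format_map() takes the mapping directly; no ** unpacking is needed.
lines = [template.format_map(row) for row in rows]
for line in lines:
    print(line)
```

Because every field has a length, both lines come out the same width and the columns line up.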
We'll return to dictionaries in Chapter 4, Built-in Data Structures – list, set, dict.
The built-in vars() function builds a dictionary of all of the local variables for us:
>>> '{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(
...     vars()
... )
'IAD : Dulles Intl Airport :  32 /  13 /  0.40'
The vars() function is very handy for building a dictionary automatically.
The string format() and format_map() methods can do a lot of relatively sophisticated string assembly for us.
The basic feature is to interpolate data into a string based on names of keyword arguments or keys in a dictionary. Variables can also be interpolated by position: we can provide position numbers instead of names. We can use a format specification like {0:3s} to use the first positional argument to format().
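As a small sketch of positional interpolation, the same message can be built with position numbers in place of names. The airport code ORD in the second template is an illustrative value that does not come from the example data.

```python
# Position numbers refer to the arguments of format(), counting from zero.
template = "{0:3s} : {1:19s} : {2:3d} / {3:3d} / {4:5.2f}"
line = template.format("IAD", "Dulles Intl Airport", 32, 13, 0.4)
print(line)      # IAD : Dulles Intl Airport :  32 /  13 /  0.40

# A position can be used more than once in a template.
repeated = "{0} to {1} and back to {0}".format("IAD", "ORD")
print(repeated)  # IAD to ORD and back to IAD

# If the numbers are omitted, placeholders are filled in order.
auto = "{:3s} / {:3d}".format("IAD", 32)
print(auto)      # IAD /  32
```

Names tend to be easier to read in long templates; positions are convenient for short ones.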
We've seen three of the formatting conversions (s, d, and f); there are many others. Details are in Section 6.1.3 of the Python Standard Library. Here are some of the format conversions we might use:
- b is for binary, base 2.
- c is for Unicode character. The value must be a number, which is converted to a character. Often, we use hexadecimal numbers for this, so you might want to try values such as 0x2661 through 0x2666 for fun.
- d is for decimal numbers.
- E and e are for scientific notation. The output looks like 6.626E-34 or 6.626e-34, depending on which E or e character is used.
- F and f are for floating-point. For not a number, the f format shows lowercase nan; the F format shows uppercase NAN.
- G and g are for general. This switches automatically between E and F (or e and f) to keep the output compact. The switch depends on the precision, not the width: for a format of 20.5G, numbers with up to five digits before the decimal point will be displayed using F formatting within the 20-character field. Larger numbers (and very small ones) will use E formatting.
- n is for locale-specific decimal numbers. This will insert , or . characters depending on the current locale settings. The default locale may not have a thousands separator defined. For more information, see the locale module.
- o is for octal, base 8.
- s is for string.
- X and x are for hexadecimal, base 16. The digits include uppercase A-F or lowercase a-f, depending on which X or x format character is used.
- % is for percentage. The number is multiplied by 100 and the output includes the %.
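A few of these conversions are easiest to understand by trying them. The following sketch applies several codes to sample values of our own choosing; the expected results are shown as comments.

```python
binary = "{0:b}".format(10)                 # '1010'
octal = "{0:o}".format(8)                   # '10'
hexes = "{0:x} {0:X}".format(255)           # 'ff FF'
heart = "{0:c}".format(0x2665)              # the character with code point U+2665
sci_lower = "{0:e}".format(6.626e-34)       # '6.626000e-34'
sci_upper = "{0:.3E}".format(6.626e-34)     # '6.626E-34'
percent = "{0:.1%}".format(0.25)            # '25.0%'
nans = "{0:f} {0:F}".format(float("nan"))   # 'nan NAN'

# The G conversion changes style based on the precision (5 here).
general_small = "{0:.5G}".format(12345.678)   # '12346'
general_large = "{0:.5G}".format(123456.789)  # '1.2346E+05'

for text in (binary, octal, hexes, heart, sci_lower, sci_upper,
             percent, nans, general_small, general_large):
    print(text)
```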
We have a number of prefixes we can use for these different types. The most common one is the length. We might use {name:5d} to put a number into a field that is five characters wide. There are several prefixes for the preceding types:
- Fill and alignment: We can specify a filler character (space is the default) and an alignment. Numbers are generally aligned to the right and strings to the left. We can change that using <, >, or ^. These force left alignment, right alignment, or centering. There's a peculiar = alignment that's used to put padding after a leading sign.
- Sign: The default rule is a leading negative sign where needed. We can use + to put a sign on all numbers, - to put a sign only on negative numbers, and a space to use a space instead of a plus for positive numbers. In scientific output, we often use {value: 5.3f}. The space makes sure that room is left for the sign, ensuring that all the decimal points line up nicely.
- Alternate form: We can use the # to get an alternate form. We might have something like {0:#x}, {0:#o}, or {0:#b} to get a prefix on hexadecimal, octal, or binary values. With a prefix, the numbers will look like 0xnnn, 0onnn, or 0bnnn. The default is to omit the two-character prefix.
- Leading zero: We can include 0 to get leading zeros to fill in the front of a number. Something like {code:08x} will produce a hexadecimal value with leading zeros to pad it out to eight characters.
- Width and precision: For integer values and strings, we only provide the width. For floating-point values, we often provide width.precision.
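The prefixes above can be sketched with a handful of one-line templates. The sample values (255, 0xbeef, 3.14159, and so on) are illustrative choices of ours, and the * fill character is used only to make the padding visible.

```python
# Fill and alignment: '*' is the fill character, the width is 9.
left = "{0:*<9s}".format("IAD")         # 'IAD******'
right = "{0:*>9s}".format("IAD")        # '******IAD'
centered = "{0:*^9s}".format("IAD")     # '***IAD***'
after_sign = "{0:=6d}".format(-42)      # '-   42' (padding follows the sign)

# Sign: '+' always shows a sign; a space reserves room for one.
plus = "{0:+d} {1:+d}".format(32, -13)  # '+32 -13'
space = "{0: d} {1: d}".format(32, -13) # ' 32 -13'
pos = "{0: 5.3f}".format(0.4)           # ' 0.400'
neg = "{0: 5.3f}".format(-0.4)          # '-0.400'

# Alternate form: adds the 0x, 0o, or 0b prefix.
alternate = "{0:#x} {0:#o} {0:#b}".format(255)  # '0xff 0o377 0b11111111'

# Leading zero, and width.precision for a floating-point value.
zero_filled = "{0:08x}".format(0xbeef)  # '0000beef'
rounded = "{0:8.3f}".format(3.14159)    # '   3.142'

print(left, right, centered, after_sign, sep="\n")
print(plus, space, pos, neg, sep="\n")
print(alternate, zero_filled, rounded, sep="\n")
```

Note that pos and neg have the same length, so a column of such values keeps its decimal points aligned.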
There are times when we won't use a {name:format} specification. Sometimes we'll need to use a {name!conversion} specification. There are only three conversions available:
- {name!r} shows the representation that would be produced by repr(name)
- {name!s} shows the string value that would be produced by str(name)
- {name!a} shows the ASCII value that would be produced by ascii(name)
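The difference between the three conversions shows up clearly with a string that contains a non-ASCII character. This is a small sketch; the value café is our own example, not part of the recipe's data.

```python
name = "caf\u00e9"  # 'café', written with an escape to keep the source ASCII

as_str = "{0!s}".format(name)    # same text as str(name)
as_repr = "{0!r}".format(name)   # same text as repr(name), quotes included
as_ascii = "{0!a}".format(name)  # same text as ascii(name), non-ASCII escaped

# A conversion can be followed by a format specification.
padded = "{0!r:>10}".format(name)  # repr text, right-aligned in 10 characters

print(as_str)
print(as_repr)
print(as_ascii)
print(padded)
```

The conversion is applied first, and the format specification then applies to the resulting string.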
In Chapter 6, Basics of Classes and Objects, we'll leverage the idea of the {name!r} format specification to simplify displaying information about related objects.
A handy debugging hack is this:
print("some_variable={some_variable!r}".format_map(vars()))
The vars() function, with no arguments, collects all of the local variables into a mapping. We provide that mapping for format_map(). The format template can use lots of {variable_name!r} to display details about various objects we have in local variables.
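Here is a minimal sketch of the debugging hack inside a function, where vars() sees only that function's local variables. The function debug_snapshot() and its variables airport and max_temp are hypothetical names used for illustration.

```python
def debug_snapshot():
    """Return a debugging line built from this function's local variables."""
    airport = "IAD"
    max_temp = 32
    # vars() with no arguments returns the current local namespace,
    # so the template can refer to any local variable by name.
    return "airport={airport!r} max_temp={max_temp!r}".format_map(vars())

snapshot = debug_snapshot()
print(snapshot)  # airport='IAD' max_temp=32
```

The !r conversion puts quotes around the string value, which makes it easy to tell the string '32' from the integer 32 in the output.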
Inside a class definition, we can use techniques such as vars(self). This looks forward to Chapter 6, Basics of Classes and Objects:
>>> class Summary:
...     def __init__(self, id, location, min_temp, max_temp, precipitation):
...         self.id = id
...         self.location = location
...         self.min_temp = min_temp
...         self.max_temp = max_temp
...         self.precipitation = precipitation
...     def __str__(self):
...         return '{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(
...             vars(self)
...         )
...
>>> s = Summary('IAD', 'Dulles Intl Airport', 13, 32, 0.4)
>>> print(s)
IAD : Dulles Intl Airport :  32 /  13 /  0.40
Our class definition includes a __str__() method. This method relies on vars(self) to create a useful dictionary of just the attributes of the object.
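To see what vars(self) actually provides, we can look at the dictionary directly. The sketch below uses a smaller, hypothetical Reading class (not part of the recipe); it also shows that a class-level attribute such as units is not included, because vars() on an instance reports only the attributes stored on that instance.

```python
class Reading:
    units = "C"  # class attribute: not part of vars(self)

    def __init__(self, station, max_temp):
        self.station = station
        self.max_temp = max_temp

    def __str__(self):
        # Only instance attributes are available to the template.
        return "{station:3s} : {max_temp:3d}".format_map(vars(self))

r = Reading("IAD", 32)
attributes = vars(r)
text = str(r)

print(attributes)  # {'station': 'IAD', 'max_temp': 32}
print(text)        # IAD :  32
```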