Building complex strings with f-strings
Creating complex strings is, in many ways, the polar opposite of parsing a complex string. We generally find that we use a template with substitution rules to put data into a more complex format.
Getting ready
Let's say we have pieces of data that we need to turn into a nicely formatted message. We might have data that includes the following:
>>> id = "IAD"
>>> location = "Dulles Intl Airport"
>>> max_temp = 32
>>> min_temp = 13
>>> precipitation = 0.4
And we'd like a line that looks like this:
IAD : Dulles Intl Airport :  32 /  13 /  0.40
How to do it...
- Create an f-string from the desired result, replacing all of the data items with {} placeholders. Inside each placeholder, put a variable name (or an expression). Note that the string uses the prefix f'. The f prefix marks a formatted string literal: the expression in each placeholder is evaluated and interpolated into the template when the literal is evaluated, and the result is an ordinary string:
f'{id} : {location} : {max_temp} / {min_temp} / {precipitation}'
- For each name or expression, an optional :data type can be appended to the names in the template string. The basic data type codes are:
  - s for string
  - d for decimal integer
  - f for floating-point number
It would look like this:
f'{id:s} : {location:s} : {max_temp:d} / {min_temp:d} / {precipitation:f}'
- Add length information where required. Length is not always required, and in some cases, it's not even desirable. In this example, though, the length information ensures that each message has a consistent format. For strings and decimal integers, prefix the format with the length like this: 19s or 3d. For floating-point numbers, use a two-part prefix like 5.2f to specify a minimum total length of five characters, with two to the right of the decimal point. Because id is a string, it takes the s code. Here's the whole format:
>>> f'{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'
'IAD : Dulles Intl Airport :  32 /  13 /  0.40'
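The three steps can be replayed side by side to see what each refinement changes. This is a minimal sketch using the sample values from this recipe; the name station_id is an assumption introduced here so that the built-in id() function is not shadowed.

```python
# Sample values from the "Getting ready" section.
# station_id is a stand-in for id, chosen to avoid shadowing the built-in id().
station_id = "IAD"
location = "Dulles Intl Airport"
max_temp = 32
min_temp = 13
precipitation = 0.4

# Step 1: bare placeholders, default formatting for each value.
plain = f"{station_id} : {location} : {max_temp} / {min_temp} / {precipitation}"

# Step 2: explicit type codes. The f code defaults to six decimal places.
typed = f"{station_id:s} : {location:s} : {max_temp:d} / {min_temp:d} / {precipitation:f}"

# Step 3: widths and precision give every message the same layout.
sized = f"{station_id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}"

print(plain)  # IAD : Dulles Intl Airport : 32 / 13 / 0.4
print(typed)  # IAD : Dulles Intl Airport : 32 / 13 / 0.400000
print(sized)  # IAD : Dulles Intl Airport :  32 /  13 /  0.40
```

The type code has to match the value: applying d to a string such as station_id raises a ValueError, which is why the identifier uses s.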
How it works...
f-strings can do a lot of relatively sophisticated string assembly by interpolating data into a template. There are a number of conversions available.
We've seen three of the formatting conversions (s, d, and f), but there are many others. Details can be found in the Formatted string literals section of the Python Language Reference: https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals.
Here are some of the format conversions we might use:
- b is for binary, base 2.
- c is for Unicode character. The value must be a number, which is converted into a character. Often, we use hexadecimal numbers for these characters, so you might want to try values such as 0x2661 through 0x2666 to see interesting Unicode glyphs.
- d is for decimal integers.
- E and e are for scientific notation. The output looks like 6.626E-34 or 6.626e-34, depending on which E or e character is used.
- F and f are for floating-point. For not a number, the f format shows lowercase nan; the F format shows uppercase NAN.
- G and g are for general use. This switches automatically between E and F (or e and f) based on the precision and the magnitude of the value. For a format of 20.5G, the precision is 5, so values with a decimal exponent from -4 through 4 are displayed using F-style formatting in a 20-character field. Larger or smaller magnitudes will use E formatting.
- n is for locale-specific decimal numbers. This will insert , or . characters, depending on the current locale settings. The default locale may not have thousands separators defined. For more information, see the locale module.
- o is for octal, base 8.
- s is for string.
- X and x are for hexadecimal, base 16. The digits include uppercase A-F and lowercase a-f, depending on which X or x format character is used.
- % is for percentage. The number is multiplied by 100, shown in f format, and followed by the %.
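A few of these codes can be tried directly. The following is a small sketch; the sample values (255, the heart-suit code point, and a Planck-constant-sized float) are chosen here only for illustration.

```python
n = 255
planck = 6.626e-34
not_a_number = float("nan")

samples = {
    "binary": f"{n:b}",                 # 11111111
    "octal": f"{n:o}",                  # 377
    "hex lower": f"{n:x}",              # ff
    "hex upper": f"{n:X}",              # FF
    "character": f"{0x2665:c}",         # the glyph at code point U+2665
    "sci lower": f"{planck:e}",         # 6.626000e-34
    "sci upper": f"{planck:.3E}",       # 6.626E-34
    "nan lower": f"{not_a_number:f}",   # nan
    "nan upper": f"{not_a_number:F}",   # NAN
    "general small": f"{12345.0:.5G}",  # 12345 (exponent 4 is below precision 5)
    "general large": f"{123456.0:.5G}", # 1.2346E+05 (exponent 5 reaches precision 5)
    "percent": f"{0.256:.1%}",          # 25.6%
}

for label, text in samples.items():
    print(f"{label:>14s}: {text}")
```

The two G rows show that the switch to scientific notation is governed by the precision, not by the field width.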
We have a number of prefixes we can use for these different types. The most common one is the length. We might use {name:5d} to put a number in a field at least five characters wide. There are several prefixes for the preceding types:
- Fill and alignment: We can specify a specific filler character (space is the default) and an alignment. Numbers are generally aligned to the right and strings to the left. We can change that using <, >, or ^. This forces left alignment, right alignment, or centering, respectively. There's a peculiar = alignment that's used to put padding after a leading sign.
- Sign: The default rule is a leading negative sign where needed. We can use + to put a sign on all numbers, - to put a sign only on negative numbers, and a space to use a space instead of a plus for positive numbers. In scientific output, we often use {value: 5.3f}. The space makes sure that room is left for the sign, ensuring that all the decimal points line up nicely.
- Alternate form: We can use the # to get an alternate form. We might have something like {value:#x}, {value:#o}, or {value:#b} to get a prefix on hexadecimal, octal, or binary values. With a prefix, the numbers will look like 0xnnn, 0onnn, or 0bnnn. The default is to omit the two-character prefix.
- Leading zero: We can include 0 to get leading zeros to fill in the front of a number. Something like {code:08x} will produce a hexadecimal value with leading zeroes to pad it out to eight characters.
- Width and precision: For integer values and strings, we only provide the width. For floating-point values, we often provide width.precision.
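The prefixes described above can be combined with any type code. This sketch exercises each one in turn; the values (a station code, 42, an approximation of pi, and 0xBEEF) are illustrative choices, not data from the recipe.

```python
station = "IAD"
reading = 3.14159
code = 0xBEEF

right = f"{station:>7s}"        # '    IAD'  right-aligned string
centered = f"{station:*^7s}"    # '**IAD**'  centered with * as the fill character
left = f"{42:<6d}"              # '42    '   left-aligned number
after_sign = f"{-42:=6d}"       # '-   42'   padding placed after the sign

plus = f"{reading:+.2f}"        # '+3.14'    sign shown on all numbers
pos = f"{reading: 5.3f}"        # ' 3.142'   space reserved for the sign
neg = f"{-reading: 5.3f}"       # '-3.142'   lines up with the positive value

with_prefix = (f"{255:#x}", f"{255:#o}", f"{255:#b}")  # ('0xff', '0o377', '0b11111111')
zero_filled = f"{code:08x}"     # '0000beef'
width_precision = f"{0.4:7.2f}" # '   0.40'

for text in (right, centered, left, after_sign, plus, pos, neg,
             *with_prefix, zero_filled, width_precision):
    print(repr(text))
```

Note that the width is a minimum: ' 3.142' is six characters even though the format asks for five, because the value is never truncated to fit.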
There are some times when we won't use a {name:format} specification. Sometimes, we'll need to use a {name!conversion} specification. There are only three conversions available:
- {name!r} shows the representation that would be produced by repr(name).
- {name!s} shows the string value that would be produced by str(name); for most objects, this matches the default behavior when you don't specify any conversion. Using !s explicitly lets you add string-type format specifiers.
- {name!a} shows the ASCII-only representation that would be produced by ascii(name).
Additionally, there's a handy debugging format specifier available in Python 3.8 and later. We can include a trailing equals sign, =, to get a handy dump of a variable or expression. The following example shows both a variable and an expression:
>>> value = 2**12-1
>>> f'{value=} {2**7+1=}'
'value=4095 2**7+1=129'
The f-string showed the value of the variable named value and the result of an expression, 2**7+1.
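The trailing = can also be combined with whitespace, a format specification, or a conversion. The following sketch assumes Python 3.8 or later; the name variable is introduced here only for illustration.

```python
value = 2**12 - 1
name = "IAD"

basic = f"{value=}"      # 'value=4095'
spaced = f"{value = }"   # 'value = 4095'  whitespace around = is preserved
as_hex = f"{value=:#x}"  # 'value=0xfff'   a format specification still applies
quoted = f"{name=}"      # "name='IAD'"    repr() is used by default with =
unquoted = f"{name=!s}"  # 'name=IAD'      an explicit conversion overrides it

for text in (basic, spaced, as_hex, quoted, unquoted):
    print(text)
```

Because = shows the repr() of the value by default, strings appear with their quotes, which is usually what we want when debugging.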
In Chapter 7, Basics of Classes and Objects, we'll leverage the idea of the {name!r} conversion to simplify displaying information about related objects.
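As a small preview of how the conversions differ for objects, the sketch below applies them to a string with a non-ASCII character and to a Fraction from the standard library. Both sample values are assumptions chosen for illustration.

```python
from fractions import Fraction

city = "Zürich"
ratio = Fraction(3, 4)

as_str = f"{city!s}"         # 'Zürich'
as_repr = f"{city!r}"        # "'Zürich'", with quotes
as_ascii = f"{city!a}"       # "'Z\\xfcrich'", non-ASCII characters escaped

obj_str = f"{ratio!s}"       # '3/4'
obj_repr = f"{ratio!r}"      # 'Fraction(3, 4)'
padded = f"{ratio!r:>16}"    # '  Fraction(3, 4)', the conversion runs before the width

# None does not accept a width on its own, so !s is needed to pad it.
none_padded = f"{None!s:>6}" # '  None'

for text in (as_str, as_repr, as_ascii, obj_str, obj_repr, padded, none_padded):
    print(text)
```

A conversion turns the value into a string first, so any format specification that follows is a string specification. This is why f"{None:>6}" raises a TypeError while f"{None!s:>6}" works.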
There's more...
f-string processing shares its format specification rules with the string format() method. We can leverage this method and the related format_map() method for cases where we have more complex data structures.
Looking forward to Chapter 4, Built-In Data Structures Part 1: Lists and Sets, we might have a dictionary where the keys are simple strings that fit with the format_map() rules:
>>> data = dict(
...     id=id, location=location, max_temp=max_temp,
...     min_temp=min_temp, precipitation=precipitation
... )
>>> '{id:3s} : {location:19s} : {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(data)
'IAD : Dulles Intl Airport :  32 /  13 /  0.40'
We've created a dictionary object, data, that contains a number of values with keys that are valid Python identifiers: id, location, max_temp, min_temp, and precipitation. We can then use this dictionary with format_map() to extract values from the dictionary using the keys.
Note that the formatting template here is not an f-string. It doesn't have the f prefix. Instead of using the automatic formatting features of an f-string, we've done the interpolation "the hard way" using the format_map() method.
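One practical reason to keep the template as an ordinary string is that it can be stored once and applied later to any mapping with the right keys. The sketch below uses the values from this recipe and shows that format_map() and format() with keyword arguments give the same result; the name template is introduced here for illustration.

```python
# An ordinary string, not an f-string: nothing is interpolated yet.
template = (
    "{id:3s} : {location:19s} : "
    "{max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}"
)

data = {
    "id": "IAD",
    "location": "Dulles Intl Airport",
    "max_temp": 32,
    "min_temp": 13,
    "precipitation": 0.4,
}

# format_map() looks up each placeholder name directly in the mapping.
via_map = template.format_map(data)

# format() receives the same values as keyword arguments.
via_kwargs = template.format(**data)

print(via_map)  # IAD : Dulles Intl Airport :  32 /  13 /  0.40
```

Unlike an f-string, the template does not need the variables to exist where it is written; a missing key only shows up as a KeyError when the template is applied.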
See also
- More details can be found in the Formatted string literals section of the Python Language Reference: https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals