When working with configuration values, it's common to look them up in multiple places: we might load them from a configuration file, override them with an environment variable or a command-line option, and fall back to a default value when the option is not provided anywhere.
This can easily lead to long chains of if statements like these:
value = command_line_options.get('optname')
if value is None:
    value = os.environ.get('optname')
if value is None:
    value = config_file_options.get('optname')
if value is None:
    value = 'default-value'
For a single value this is merely annoying, but it tends to grow into a huge, confusing list of conditions as more options get added.
Command-line options are a very frequent use case, but the underlying problem is the resolution of names through chained scopes. Python resolves variables in the same way: it looks at locals() first; if the name is not found there, the interpreter looks at globals(), and if it is still not found, it looks in the built-ins.
For this recipe, you need to go through the following steps:
- Chaining default values with dict.get, instead of using multiple if statements, probably wouldn't improve the code much, and if we wanted to add one more scope, we would have to add it in every single place where we look up the values. collections.ChainMap is a very convenient solution to this problem: we provide a list of mapping containers and it looks for a key through them all, in order.
- Our previous example involving multiple if statements can be converted to something like this:
import os
from collections import ChainMap

options = ChainMap(command_line_options, os.environ, config_file_options)
value = options.get('optname', 'default-value')
- We can also get rid of the last .get call by combining ChainMap with defaultdict. In this case, we can use defaultdict to provide a default value for every key:
import os
from collections import ChainMap, defaultdict

options = ChainMap(command_line_options, os.environ, config_file_options,
                   defaultdict(lambda: 'default-value'))
value = options['optname']
value2 = options['other-option']
- Printing value and value2 will result in the following:
optvalue
default-value
optname will be retrieved from command_line_options, which contains it, while other-option will end up being resolved by defaultdict.
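To try the recipe end to end, the following sketch fills in the pieces that the snippets above leave undefined. The contents of command_line_options and config_file_options are made up for illustration, and the environment is replaced by a plain dictionary (fake_environ, a hypothetical stand-in for os.environ) so that the result does not depend on the machine it runs on.

```python
from collections import ChainMap, defaultdict

# Hypothetical sample data: in a real program these would come from the
# command-line parser, os.environ, and a parsed configuration file.
command_line_options = {'optname': 'optvalue'}
fake_environ = {'optname': 'from-environment'}
config_file_options = {'optname': 'from-config-file', 'color': 'blue'}

options = ChainMap(
    command_line_options,
    fake_environ,
    config_file_options,
    defaultdict(lambda: 'default-value'),
)

value = options['optname']        # found in the first mapping
color = options['color']          # only the configuration file defines it
value2 = options['other-option']  # falls through to the defaultdict

print(value)
print(color)
print(value2)
```

Adding one more scope here only means adding one more argument to ChainMap; none of the lookups have to change.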
The ChainMap class receives multiple dictionaries as arguments; whenever a key is requested from ChainMap, it goes through the provided dictionaries one by one, in order, to check whether the key is available in any of them. As soon as the key is found, its value is returned, as if the key were owned by ChainMap itself.
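The lookup just described can be pictured as a plain loop over the mappings. The chained_lookup function below is a hypothetical helper written only to illustrate the idea; it is a simplified sketch, not the code that collections actually uses.

```python
from collections import ChainMap


def chained_lookup(key, *mappings):
    """Return the value for key from the first mapping that contains it."""
    for mapping in mappings:
        try:
            return mapping[key]
        except KeyError:
            continue  # not here, try the next mapping
    raise KeyError(key)


first = {'a': 1}
second = {'a': 10, 'b': 20}

# 'a' exists in both mappings: the first one wins.
shadowed_by_hand = chained_lookup('a', first, second)
shadowed_by_chainmap = ChainMap(first, second)['a']

# 'b' exists only in the second mapping.
fallback_by_hand = chained_lookup('b', first, second)
fallback_by_chainmap = ChainMap(first, second)['b']
```

In both cases, the hand-written loop and ChainMap agree on the value that is returned.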
The default value for options that are not provided is implemented by having defaultdict as the last dictionary provided to ChainMap. Whenever a key is not found in any of the previous dictionaries, it gets looked up in defaultdict, which uses the provided factory function to return a default value for any key.
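One detail worth keeping in mind with this combination is that the factory is only triggered by subscript access. The sketch below, which uses made-up option names, suggests that membership tests and .get on the ChainMap do not see the default, and that defaultdict stores the key once it has been looked up with square brackets.

```python
from collections import ChainMap, defaultdict

defaults = defaultdict(lambda: 'default-value')
options = ChainMap({'optname': 'optvalue'}, defaults)

# Membership tests and .get() do not invoke the factory function...
present_before = 'other-option' in options
via_get = options.get('other-option')

# ...while subscript access does, and defaultdict then remembers the key.
via_subscript = options['other-option']
present_after = 'other-option' in options

print(present_before, via_get, via_subscript, present_after)
```

So when the default has to be applied consistently, options[key] is the form to use with this setup.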
Another great feature of ChainMap is that it allows updating too, but instead of updating the dictionary where it found the key, it always updates the first dictionary. The visible result is the same: on the next lookup of that key, the first dictionary overrides any other value for it, as it's the first place where the key is checked. The advantage is that if we provide an empty dictionary as the first mapping of ChainMap, we can change those values without touching the original container:
>>> population = dict(italy=60, japan=127, uk=65)
>>> changes = dict()
>>> editablepop = ChainMap(changes, population)
>>> print(editablepop['japan'])
127
>>> editablepop['japan'] += 1
>>> print(editablepop['japan'])
128
But even though we changed the population of Japan to 128 million, the original population didn't change:
>>> print(population['japan'])
127
And we can even use changes to find out which values were changed and which values were not:
>>> print(changes.keys())
dict_keys(['japan'])
>>> print(population.keys() - changes.keys())
{'italy', 'uk'}
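Because writes only ever reach the first mapping, a change can also be undone by deleting the key through ChainMap: the override disappears from changes and the original value shows through again. The following sketch repeats the population example to show this, along with the fact that a key that exists only in a later mapping cannot be deleted through the chain.

```python
from collections import ChainMap

population = dict(italy=60, japan=127, uk=65)
changes = dict()
editablepop = ChainMap(changes, population)

editablepop['japan'] += 1        # the new value is stored in changes
del editablepop['japan']         # removes the override from changes only
reverted = editablepop['japan']  # read from population again

# 'italy' is not in the first mapping, so deleting it raises KeyError
# and leaves population untouched.
try:
    del editablepop['italy']
    delete_failed = False
except KeyError:
    delete_failed = True

print(reverted, changes, delete_failed)
```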
It's important to know, by the way, that if the object contained in the dictionary is mutable and we directly mutate it, there is little ChainMap can do to avoid mutating the original object. So if, instead of numbers, we store lists in the dictionaries, we will be mutating the list held by the original dictionary whenever we append values to it:
>>> citizens = dict(torino=['Alessandro'], amsterdam=['Bert'], raleigh=['Joseph'])
>>> changes = dict()
>>> editablecits = ChainMap(changes, citizens)
>>> editablecits['torino'].append('Simone')
>>> print(editablecits['torino'])
['Alessandro', 'Simone']
>>> print(changes)
{}
>>> print(citizens)
{'amsterdam': ['Bert'], 'torino': ['Alessandro', 'Simone'], 'raleigh': ['Joseph']}
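One possible way around this, sketched below as a suggestion rather than a feature of ChainMap itself, is to build a new list and assign it instead of mutating the existing one in place. The assignment goes through ChainMap, so it lands in changes and the list stored in citizens is left alone.

```python
from collections import ChainMap

citizens = dict(torino=['Alessandro'], amsterdam=['Bert'], raleigh=['Joseph'])
changes = dict()
editablecits = ChainMap(changes, citizens)

# Concatenation creates a new list; assigning it stores it in changes.
editablecits['torino'] = editablecits['torino'] + ['Simone']

print(editablecits['torino'])
print(changes)
print(citizens['torino'])
```

The cost is one copy of the list per modification, which is usually acceptable for small configuration-like data.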