Variable precedence
As you learned in the previous section, there are several major types of variables that can be defined in a myriad of locations. This leads to a very important question: what happens when the same variable name is used in multiple locations? Ansible loads variable data in a defined order of precedence, and that order decides which definition wins. Variable value overriding is an advanced usage of Ansible, so it is important to fully understand the semantics before attempting such a scenario.
Precedence order
Ansible defines the precedence order as follows, with those closest to the top of the list winning. Note that this can change from release to release, and has changed quite significantly since Ansible 2.4 was released, so it is worth reviewing, especially when upgrading your Ansible environment:
- Extra vars (from the command line) always win
- include parameters
- Role (and include_role) parameters
- Variables defined with set_fact, and those created with the register task directive
- include_vars
- Task vars (only for the specific task)
- Block vars (only for the tasks within the block)
- Role vars (defined in main.yml in the vars subdirectory of the role)
- Play vars_files
- Play vars_prompt
- Play vars
- Host facts (and also cached set_fact values)
- host_vars playbook
- host_vars inventory
- Inventory file (or script) defined host vars
- group_vars playbook
- group_vars inventory
- group_vars/all playbook
- group_vars/all inventory
- Inventory file (or script) defined group vars
- Role defaults
- Command-line values (for example, -u REMOTE_USER)
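The effect of the list above can be pictured as layers applied from lowest to highest precedence, with each layer overwriting coinciding names from the layers beneath it. The following Python sketch is only a simplified model of that idea and not Ansible's implementation; it uses a small subset of the sources listed above, and the variable names and values are hypothetical.

```python
# Simplified model of variable precedence: NOT Ansible's implementation.
# Sources run from lowest to highest precedence and are only a small
# subset of the full list; all variable values below are hypothetical.
SOURCES_LOW_TO_HIGH = ["role defaults", "play vars", "task vars", "extra vars"]


def resolve(definitions):
    """Return {name: (value, source)} holding the winning definition per variable."""
    resolved = {}
    for source in SOURCES_LOW_TO_HIGH:
        for name, value in definitions.get(source, {}).items():
            # A higher-precedence source overwrites any earlier definition.
            resolved[name] = (value, source)
    return resolved


definitions = {
    "role defaults": {"http_port": 80, "log_level": "info"},
    "play vars": {"http_port": 8080},
    "extra vars": {"http_port": 9090},
}

for name, (value, source) in sorted(resolve(definitions).items()):
    print(f"{name} = {value!r} (from {source})")
```

In this model, http_port is taken from the extra vars layer because nothing sits above it, while log_level survives from the role defaults because no higher layer redefines it.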
Variable group priority ordering
The previous list of priority ordering is obviously helpful when writing Ansible playbooks, and, in most cases, it is apparent that variables should not clash. For example, a task var clearly wins over a play var, and all tasks and indeed plays are unique. Similarly, all hosts in the inventory will be unique, so again, there should be no clash of variables with the inventory either.
There is, however, one exception to this – inventory groups. A one-to-many relationship exists between hosts and groups, and, as such, any given host can be a member of one or more groups. Let's suppose that the following code is our inventory file by way of example:
[frontend]
host1.example.com
host2.example.com
[web:children]
frontend
[web:vars]
http_port=80
secure=true
[proxy]
host1.example.com
[proxy:vars]
http_port=8080
thread_count=10
Here, we have two hypothetical frontend servers, host1.example.com and host2.example.com, in the frontend group. Both hosts are children of the web group, which means they are assigned the inventory group_vars http_port=80. host1.example.com is also a member of the proxy group, which has an identically named variable but with a different assignment: http_port=8080.
Both of these variable assignments are at the inventory group_vars level, and so the order of precedence does not define a winner. So what happens in this case?
The answer is, in fact, predictable and deterministic. The group_vars assignments are done in alphabetical order of the group names (refer to the tip box in the Inventory ordering section), with the last loaded group overriding all preceding variable values that coincide.
For example, three groups named mastery1, mastery11, and mastery2 are processed in exactly that order, because the names are sorted as text rather than as numbers. This means any competing variables from mastery2 will win over the other two groups. Those from mastery11 then take precedence over those from the mastery1 group, so please be mindful of this when creating group names!
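The ordering of the mastery groups can be checked with a few lines of Python. This is a minimal sketch that assumes a plain string sort of the group names followed by last-loaded-wins merging, as described above; the variable names and values are hypothetical.

```python
# Simplified model: groups are loaded in alphabetical order of their names,
# and each group's variables overwrite coinciding ones loaded earlier.
# The variable names and values are hypothetical.
group_vars = {
    "mastery2": {"owner": "from mastery2"},
    "mastery1": {"owner": "from mastery1", "tier": "from mastery1"},
    "mastery11": {"owner": "from mastery11", "tier": "from mastery11"},
}

# Plain string comparison: "mastery11" sorts before "mastery2".
load_order = sorted(group_vars)

merged = {}
for group in load_order:
    merged.update(group_vars[group])  # later groups override earlier ones

print(load_order)
print(merged)
```

Because the names are compared character by character, mastery11 is loaded before mastery2, so owner ends up with the mastery2 value while tier keeps the mastery11 value.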
In our example, when the groups are processed in alphabetical order, web comes after proxy, and so the group_vars assignments from web that coincide with those from any previously processed groups will win. Let's run the previous inventory file through this example playbook to take a look at the behavior:
---
- name: group variable priority ordering example play
  hosts: all
  gather_facts: false
  tasks:
    - name: show assigned group variables
      vars:
        msg: |
          http_port:{{ hostvars[inventory_hostname]['http_port'] }}
          thread_count:{{ hostvars[inventory_hostname]['thread_count'] | default("undefined") }}
          secure:{{ hostvars[inventory_hostname]['secure'] }}
      debug:
        msg: "{{ msg.split('\n') }}"
When run, we get the following output:
As expected, the value assigned to the http_port variable for both hosts in the inventory is 80. However, what if this behavior is not desired? Suppose we want the value of http_port from the proxy group to take priority. It would be painful to rename the group and every reference to it just to change the alphabetical sorting of the groups (though this would work!). The good news is that Ansible 2.4 introduced the ansible_group_priority group variable, which can be used for just this eventuality. Groups with a higher number are merged later and therefore win. If not explicitly set, this variable defaults to 1, so we only need to raise it on the proxy group and can leave the rest of the inventory file unchanged.
Let's set this as follows:
[proxy:vars]
http_port=8080
thread_count=10
ansible_group_priority=10
Now, when we run the same playbook, note how the value assigned to http_port has changed, while all variables that did not collide behave exactly as before:
As your inventory grows with your infrastructure, be sure to make use of this feature to gracefully handle any variable assignment collisions between your groups.
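To make the interaction between group names and ansible_group_priority concrete, here is a simplified Python model of the merge for host1.example.com. It is a sketch rather than Ansible's actual code: it assumes groups are ordered by priority first and name second, it ignores parent and child group depth, and merged_group_vars is a hypothetical helper.

```python
# Simplified model of group variable merging: NOT Ansible's implementation.
# Groups are ordered by (priority, name); parent/child depth is ignored.
DEFAULT_PRIORITY = 1


def merged_group_vars(host_groups, group_vars):
    """Merge the variables of host_groups, letting later groups override."""

    def sort_key(group):
        priority = int(
            group_vars[group].get("ansible_group_priority", DEFAULT_PRIORITY)
        )
        return (priority, group)

    merged = {}
    for group in sorted(host_groups, key=sort_key):
        merged.update(group_vars[group])
    return merged


# Values taken from the example inventory; host1 is in both web and proxy.
web = {"http_port": 80, "secure": True}
proxy = {"http_port": 8080, "thread_count": 10}
host1_groups = ["web", "proxy"]

before = merged_group_vars(host1_groups, {"web": web, "proxy": proxy})
after = merged_group_vars(
    host1_groups,
    {"web": web, "proxy": {**proxy, "ansible_group_priority": 10}},
)

print(before["http_port"], after["http_port"])
```

With equal priorities, the name decides and web is merged last; once proxy carries a priority of 10, it is merged last instead, while thread_count and secure are unaffected because they never collided.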
Merging hashes
In the previous section, we focused on the precedence in which variables will override each other. The default behavior of Ansible is that any overriding definition for a variable name will completely mask the previous definition of that variable. However, that behavior can be altered for one type of variable: the hash. A hash variable (a dictionary, in Python terms) is a dataset of keys and values. Values can be of different types for each key, and can even be hashes themselves for complex data structures.
In some advanced scenarios, it is preferable to replace just one part of a hash or add to an existing hash rather than replacing the hash altogether. To unlock this ability, a configuration change is necessary in the Ansible config file. The configuration entry is hash_behaviour (note the British spelling), which takes the value replace or merge. A setting of merge instructs Ansible to merge or blend the values of two hashes when presented with an override scenario, rather than use the default of replace, which completely replaces the old variable data with the new data.
Let's walk through an example of the two behaviors. We will start with a hash loaded with data and simulate a scenario where a different value for the hash is provided as a higher-priority variable.
This is the starting data:
hash_var:
  fred:
    home: Seattle
    transport: Bicycle
This is the new data loaded via include_vars:
hash_var:
  fred:
    transport: Bus
With the default behavior, the new value for hash_var will be as follows:
hash_var:
  fred:
    transport: Bus
However, if we enable the merge behavior, we will get the following result:
hash_var:
  fred:
    home: Seattle
    transport: Bus
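The difference between the two behaviors can be modeled in plain Python. The sketch below is an illustration under simplifying assumptions rather than Ansible's own code: replace_behavior and merge_behavior are hypothetical helpers, and only nested dictionaries are merged recursively.

```python
import copy


def replace_behavior(old, new):
    """Model of replace: the new hash completely masks the old one."""
    return copy.deepcopy(new)


def merge_behavior(old, new):
    """Model of merge: recursively blend new into old, key by key."""
    result = copy.deepcopy(old)
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are hashes, so descend and merge their keys.
            result[key] = merge_behavior(result[key], value)
        else:
            # Otherwise, the new value simply wins for this key.
            result[key] = copy.deepcopy(value)
    return result


starting = {"fred": {"home": "Seattle", "transport": "Bicycle"}}
override = {"fred": {"transport": "Bus"}}

print(replace_behavior(starting, override))
print(merge_behavior(starting, override))
```

Under the replace model, the home key is lost because the whole fred hash is swapped out; under the merge model, only transport changes and home is preserved.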
There are even more nuances and undefined behaviors when using merge and, as such, it is strongly recommended to only use this setting if absolutely necessary.