In its default mode of operation, Ansible continues to execute a play on a batch of servers (the batch size is determined by the serial directive we discussed in the preceding section) as long as there are hosts in the inventory and a failure isn't recorded. Obviously, in a highly available or load-balanced environment (such as the one we discussed previously), this is not ideal. If there is a bug in your play, or perhaps a problem with the code being rolled out, the last thing that you want is for Ansible to faithfully roll it out to all servers in the cluster, causing a service outage because all the nodes suffered a failed upgrade. It would be far better, in this kind of environment, to fail early on and leave at least some hosts in the cluster untouched until someone can intervene and resolve the issue.
For our practical example...