Introducing automated chaos
Running manual Gameday exercises is a great way to introduce the practice of failure injection. Forcing failures in production builds confidence in the resilience of systems and identifies opportunities for improvement. Gameday exercises also give teams a better overall understanding of how their systems behave when confronted with a range of failure scenarios. As a team conducts more exercises, it will start to accumulate tools for performing common tasks, such as introducing network latency or spiking CPU usage. This tooling automates mundane tasks and makes Gameday exercises more efficient. There are also a variety of open-source and commercial tools designed to automate chaos engineering that teams can take advantage of right away.
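As an illustration of the kind of small helper a team might accumulate, the following is a minimal sketch of latency injection at the application level, not a reproduction of any particular chaos tool. It delays a configurable fraction of calls to a wrapped function. The names inject_latency and fetch_profile are hypothetical, and real network-level latency would normally be introduced lower in the stack than this.

```python
import random
import time
from functools import wraps


def inject_latency(delay_s, probability=1.0, rng=None, sleep=time.sleep):
    """Wrap a callable so that a fraction of its calls are delayed by delay_s seconds.

    Hypothetical helper for illustration. The rng and sleep parameters can be
    replaced so that an experiment is repeatable and testable.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    rng = rng or random.Random()

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Delay only the chosen fraction of calls, then run the real call.
            if rng.random() < probability:
                sleep(delay_s)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@inject_latency(delay_s=0.2, probability=0.5)
def fetch_profile(user_id):
    # Hypothetical stand-in for a call to a downstream service.
    return {"id": user_id}
```

During an exercise, roughly half of the calls to fetch_profile would take an extra 200 milliseconds, which lets the team observe whether callers time out, retry, or degrade gracefully. Passing a seeded random.Random as rng makes a run reproducible.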
Gameday exercises are planned and scheduled. Some organizations go one step further and introduce continuous failure injection as a way of ensuring that systems are handling common failure scenarios smoothly. In early 2011, Netflix...