What makes a good runbook – the basics
Okay, I’ll admit it – I’m personally not a fan of runbooks. I just haven’t found a better tool for making the knowledge I need instantly available during an outage. For me, it’s a struggle between wanting to be the type of engineer who can walk into an outage and resolve the issue in minutes, and knowing the truth – I’m not always as good as I think I am or want to be. That is the most honest reason why we need runbooks.
One of the strongest characteristics most rockstar engineers share is the ability to solve problems and find a path to resolution quickly – whether facing an obscure compiler error or the highest-priority outage. We leverage this ability to quickly determine causes and remediations, but imagine if we had the article with the answer bookmarked – how much time would that save? That is the strength of the runbook – it’s a singular answer to...