Real-World SRE

By: Nat Welch

Overview of this book

Real-World SRE is the go-to survival guide for the software developer in the middle of a catastrophic website failure. Site Reliability Engineering (SRE) has emerged on the frontline as businesses strive to maximize uptime. This book is a step-by-step framework to follow when your website is down and the countdown is on to fix it. Nat Welch has battle-hardened experience in reliability engineering at some of the biggest outage-sensitive companies on the internet. Arm yourself with his tried-and-tested methods for monitoring modern web services, setting up alerts, and evaluating your incident response. Real-World SRE goes beyond just reacting to disaster: uncover the tools and strategies needed to safely test and release software, plan for long-term growth, and foresee future bottlenecks. Real-World SRE gives you the capability to set up your own robust plan of action to see you through a company-wide website crisis. The final chapter of Real-World SRE is dedicated to acing SRE interviews, whether you are seeking a first job or a valued promotion.
Table of Contents (13 chapters)

Documenting and maintaining projects

A final but very important detail is the documentation and maintenance of the code that you produce. Hopefully, your code will last as long as it is needed. Setting good documentation standards early on encourages others to write documentation in the future. One way to promote this is to tie documentation changes to code reviews, so that your reviewer can see not only the internals changing but also the messaging to users changing.
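
One way to nudge documentation into code reviews is a small check that flags changes touching code but no documentation. The following is only a minimal sketch: the `src` and `docs` directory names, the file suffixes, and the function names are all hypothetical, and a real project would adapt them to its own layout and review tooling.

```python
from pathlib import PurePosixPath

# Hypothetical layout (an assumption, not from the book): code lives under
# src/, documentation lives under docs/ or in Markdown/reST files.
CODE_DIRS = ("src",)
DOC_DIRS = ("docs",)
DOC_SUFFIXES = (".md", ".rst")


def is_doc(path):
    """Return True if the path looks like documentation."""
    p = PurePosixPath(path)
    in_doc_dir = len(p.parts) > 0 and p.parts[0] in DOC_DIRS
    return in_doc_dir or p.suffix.lower() in DOC_SUFFIXES


def is_code(path):
    """Return True if the path is under a code directory and is not a doc."""
    p = PurePosixPath(path)
    in_code_dir = len(p.parts) > 0 and p.parts[0] in CODE_DIRS
    return in_code_dir and not is_doc(path)


def needs_doc_reminder(changed_files):
    """True when a change touches code but no documentation files."""
    touches_code = any(is_code(f) for f in changed_files)
    touches_docs = any(is_doc(f) for f in changed_files)
    return touches_code and not touches_docs
```

A check like this is probably best surfaced as a reminder to the reviewer rather than a hard failure, since some code changes genuinely need no documentation update.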

Another thing that you can do is to publish a change log. A change log is a document showing what has changed since the previous release. If you release often, you will probably want to automate this in some way. If it is automated, this document will more often than not be aimed at other developers or product people. Many organizations I have worked for keep an internal change log for each release, and the product team then translates that into a simpler document for external consumption by executives and customers.
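
As an illustration of automating an internal change log, the sketch below groups commit subject lines into sections. It assumes a hypothetical convention in which each commit subject starts with a type prefix such as `feat:` or `fix:`; the section titles and the `build_changelog` name are likewise made up for this example and are not a method described in the book.

```python
# Hypothetical convention (an assumption): commit subjects look like
# "<type>: <summary>". Unrecognized subjects are listed under "Other".
SECTION_TITLES = {"feat": "Added", "fix": "Fixed", "docs": "Documentation"}


def build_changelog(version, commit_subjects):
    """Render a plain-text change log for one release."""
    sections = {}
    for subject in commit_subjects:
        prefix, sep, summary = subject.partition(":")
        title = SECTION_TITLES.get(prefix.strip().lower()) if sep else None
        if title is None:
            # No known prefix: keep the whole subject line as-is.
            title, summary = "Other", subject
        sections.setdefault(title, []).append(summary.strip())

    lines = [f"Release {version}"]
    # Emit sections in a fixed order so output is stable between runs.
    for title in [*SECTION_TITLES.values(), "Other"]:
        items = sections.get(title)
        if items:
            lines.append("")
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_changelog("1.4.0", [
        "feat: add retry budget",
        "fix: handle empty alert payload",
        "Bump dependencies",
    ]))
```

In practice the subject lines could come from something like `git log --pretty=format:%s` over the range since the previous release tag. The output is developer-facing, which matches the point above: a product team would still need to translate it for executives and customers.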

Figuring out the right...