Real-World SRE

By: Nat Welch

Overview of this book

Real-World SRE is the go-to survival guide for the software developer in the middle of catastrophic website failure. Site Reliability Engineering (SRE) has emerged on the frontline as businesses strive to maximize uptime. This book is a step-by-step framework to follow when your website is down and the countdown is on to fix it. Nat Welch has battle-hardened experience in reliability engineering at some of the biggest outage-sensitive companies on the internet. Arm yourself with his tried-and-tested methods for monitoring modern web services, setting up alerts, and evaluating your incident response. Real-World SRE goes beyond just reacting to disaster—uncover the tools and strategies needed to safely test and release software, plan for long-term growth, and foresee future bottlenecks. Real-World SRE gives you the capability to set up your own robust plan of action to see you through a company-wide website crisis. The final chapter of Real-World SRE is dedicated to acing SRE interviews, either in getting a first job or a valued promotion.

Security is essential to gaining the trust of users. Data breaches are now common, and users are starting to select products based on their security and privacy. Security is a huge field on its own: many companies employ their own security engineers, and entire companies work as consultants that examine other companies' security practices. Still, security is the responsibility of everyone, just like monitoring, incident response, or UX. You should have experts on your staff to direct you, but you must not remain ignorant of the topic or defer to your experts for everything.

When building software, or auditing technology for security matters, try thinking about the following three aspects of a piece of technology:

  • Authentication: How do you give a user a token saying they can interact with things? For example, a user logging in with a username and password.
  • Authorization: How do you determine whether...