Becoming a Rockstar SRE

By: Jeremy Proffitt, Rod Anami

Overview of this book

Site reliability engineering is all about continuous improvement: finding the balance between business and product demands while working within technological limitations to drive higher revenue. But quantifying and understanding reliability, handling resources, and meeting developer requirements can sometimes be overwhelming. With a focus on reliability from an infrastructure and coding perspective, Becoming a Rockstar SRE brings forth the site reliability engineer (SRE) persona using real-world examples. This book will acquaint you with the role of an SRE, followed by the why and how of site reliability engineering. It walks you through the jobs of an SRE, from automating CI/CD pipelines and reducing toil to applying reliability best practices. You’ll learn what creates bad code and how to circumvent it with reliable design and patterns. The book also guides you through interacting and negotiating with businesses and vendors on various technical matters, and it explores observability, outages, and why and how to craft an excellent runbook. Finally, you’ll learn how to elevate your site reliability engineering career, with coverage of certifications and interview tips and questions. By the end of this book, you’ll be able to identify and measure reliability, reduce downtime, troubleshoot outages, and enhance productivity to become a true rockstar SRE!
Table of Contents (27 chapters)
Part 1 - Understanding the Basics of Who, What, and Why
Part 2 - Implementing Observability for Site Reliability Engineering
Part 3 - Applying Architecture for Reliability
Part 4 - Mastering the Outage Moments
Part 5 - Looking into Future Trends and Preparing for SRE Interviews

People that inspire

We want to close this chapter by pointing out other SREs who have inspired us and encouraged the wider community. We couldn’t even have thought about starting this book without the work of the parents of site reliability engineering at Google. We are immensely grateful to them. Site reliability engineering would probably not exist outside Google if they had chosen not to share their thoughts, principles, techniques, and practices through the foundational site reliability engineering books. They are mandatory reading for anyone following this career path. If you haven’t read them yet, please check out Google’s site reliability engineering books at this site: https://sre.google/books/.

We also want to recognize a few other rockstar SREs who have made a real difference in our professional lives. They are trailblazers of site reliability engineering outside Google.

Jeremy’s recognition – Paul Tyma, former CTO, LendingTree

In technology, finding your way can be difficult. The constant struggle of being an SRE leads us into discussions of what went wrong; often, we have to say what some don’t want to hear – that a negative thing happened due to what a person or team did or didn’t do. We are, in fact, often the bearers of bad news. Paul opened the door for me to become an SRE, and we drove a great reliability revolution together. Most importantly, he taught me that there is a balance to all things, and we have a choice in that balance. And what we often consider a responsibility or duty can have its limits.

Rod’s recognition – Ingo Averdunk, Distinguished Engineer, IBM, and Gene Brown, Distinguished Engineer, Kyndryl

Ingo and Gene triggered a small revolution inside IBM by designing and deploying site reliability engineering principles, practices, professions, and methodologies to its organizations across the globe. They first transformed many internal teams to adopt these tenets, and later they helped external customers do the same. Of course, they didn’t accomplish this alone, but they were (and are) paramount examples of technical executive leadership. They shaped the site reliability engineering profession from within IBM, and that work later spread to Kyndryl after its spin-off.