Google Cloud for DevOps Engineers

By: Sandeep Madamanchi

Overview of this book

DevOps is a set of practices that help remove barriers between developers and system administrators, and is implemented by Google through site reliability engineering (SRE). With the help of this book, you'll explore the evolution of DevOps and SRE, before delving into SRE technical practices such as SLAs, SLOs, SLIs, and error budgets that are critical to building reliable software faster and balancing new feature deployment with system reliability. You'll then explore SRE cultural practices such as incident management and being on-call, and learn the building blocks to form SRE teams. The second part of the book focuses on Google Cloud services to implement DevOps via continuous integration and continuous delivery (CI/CD). You'll learn how to add source code via Cloud Source Repositories, build code to create deployment artifacts via Cloud Build, and push them to Container Registry. Moving on, you'll understand the need for container orchestration via Kubernetes, comprehend Kubernetes essentials, apply them via Google Kubernetes Engine (GKE), and secure the GKE cluster. Finally, you'll explore Cloud Operations to monitor, alert, debug, trace, and profile deployed applications. By the end of this SRE book, you'll be well versed in the key concepts necessary for gaining Professional Cloud DevOps Engineer certification with the help of mock tests.

Alerting

SLIs are quantitative measurements at a given point in time, and SLOs use SLIs to reflect the reliability of the system. SLIs are captured or represented in the form of metrics. Monitoring systems evaluate these metrics against a specific set of policies. These policies represent the target SLOs over a period and are referred to as alerting rules.

Alerting is the process of evaluating the alerting rules, which track the SLOs, and notifying or performing certain actions when the rules are violated. In other words, alerting converts SLOs into actionable alerts on significant events. Alerts can then be sent to an external application, a ticketing system, or a person.
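The relationship between an SLI, an SLO target, and an alerting rule can be illustrated with a minimal Python sketch. It is not tied to any particular monitoring product: the `AlertingRule` class, the `availability_sli` and `evaluate` functions, the service name, and the request counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AlertingRule:
    """Hypothetical alerting rule: violated when the SLI is below the SLO target."""
    name: str
    slo_target: float  # e.g. 0.999 means 99.9% of requests must succeed

    def is_violated(self, sli: float) -> bool:
        return sli < self.slo_target


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI expressed as the fraction of successful requests in a window."""
    if total_requests == 0:
        return 1.0  # assumption: a window with no traffic meets the target
    return good_requests / total_requests


def evaluate(rule: AlertingRule, sli: float, notify: Callable[[str], None]) -> bool:
    """Check the rule against the measured SLI and notify on violation."""
    if rule.is_violated(sli):
        notify(
            f"ALERT {rule.name}: SLI {sli:.4f} is below "
            f"SLO target {rule.slo_target:.4f}"
        )
        return True
    return False


# Illustrative values only: 9,980 good requests out of 10,000 gives an SLI of 0.998.
sent: List[str] = []  # stands in for a ticketing system, application, or pager
rule = AlertingRule(name="checkout-availability", slo_target=0.999)
sli = availability_sli(good_requests=9_980, total_requests=10_000)
fired = evaluate(rule, sli, sent.append)
```

Here the `notify` callable plays the role of the alert destination. In a real setup it would be replaced by whatever channel the monitoring system supports, and the rule would typically be evaluated repeatedly over a rolling window rather than once.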

Common scenarios for triggering alerts include (and are not limited to) the following:

The service or system is down.
SLOs or SLAs are not met.
Immediate human intervention is required to change something.

As discussed previously, SLOs represent an achievable target, and error...