Book Image

Practical Site Reliability Engineering

By : Pethuru Raj Chelliah, Shreyash Naithani, Shailender Singh
Book Image

Practical Site Reliability Engineering

By: Pethuru Raj Chelliah, Shreyash Naithani, Shailender Singh

Overview of this book

Site reliability engineering (SRE) is being touted as the most competent paradigm in establishing and ensuring next-generation high-quality software solutions. This book starts by introducing you to the SRE paradigm and covers the need for highly reliable IT platforms and infrastructures. As you make your way through the next set of chapters, you will learn to develop microservices using Spring Boot and make use of RESTful frameworks. You will also learn about GitHub for deployment, containerization, and Docker containers. Practical Site Reliability Engineering teaches you to set up and sustain containerized cloud environments, and also covers architectural and design patterns and reliability implementation techniques such as reactive programming, and languages such as Ballerina and Rust. In the concluding chapters, you will get well-versed with service mesh solutions such as Istio and Linkerd, and understand service resilience test practices, API gateways, and edge/fog computing. By the end of this book, you will have gained experience on working with SRE concepts and be able to deliver highly reliable apps and services.
Table of Contents (19 chapters)
Title Page
Dedication
About Packt
Contributors
Preface
10
Containers, Kubernetes, and Istio Monitoring
Index

Plunging into the SRE discipline


We have understood the requirements and the challenges. The following sections describe how the SRE field is used to bridge the gap between supply and demand. As explained previously, building software applications through configuration, customization, and composition (orchestration and choreography) is progressing quickly. Speedier programming of software applications using agile programming methods is another incredible aspect of software building. The various DevOps tools from product and tool vendors quietly ensures continuous software integration, delivery, and deployment. 

The business landscape is continuously evolving, and consequently the IT domain has to respond precisely and perfectly to the changing equations and expectations of the business houses. Businesses have to be extremely agile, adaptive, and reliable in their operations, offerings, and outputs. Business automation, acceleration, and augmentation are being solely provided by the various noteworthy improvements and improvisations in the IT domain.

IT agility and reliability directly guarantees the business agility and reliability. As seen previously, the goal of IT agility (software design, development, and deployment) is getting fulfilled through newer techniques. Nowadays, IT experts are looking out for ways and means for significantly enhancing IT reliability goals. Typically, IT reliability equals IT elasticity and resiliency. Let's us refer to the following bullets:

  • IT elasticity: When an IT system is suddenly under a heavy load, how does the IT system provision and use additional IT resources to take care of extra loads without affecting users? IT systems are supposed to be highly elastic to be right and relevant for the future of businesses. Furthermore, not only IT systems but also the business applications and the IT platforms (development, deployment, integration, orchestration, brokerage, and so on) have to be scalable. Thus, the combination of applications, platforms, and infrastructures have to contribute innately to be scalable (vertically, as well as horizontally). 
  • IT resiliency: When an IT system is under attack from internal as well as external sources, the system has to have the wherewithal to wriggle out of that situation to continuously deliver its obligations to its subscribers without any slowdown and breakdown. IT systems have to be highly fault-tolerant to be useful for mission-critical businesses. IT systems have to come back to their original situation automatically, even if they are made to deviate from their prescribed path. Thus, error prediction, identification, isolation, and other capabilities have to be embedded into IT systems. Security and safety issues also have to be dexterously detected and contained to come out unscathed.  

Thus, when IT systems are resilient and elastic, they are termed reliable systems. When IT is reliable, then the IT-enabled businesses can be reliable in their deals, deeds, and decisions that, in turn, enthuse and enlighten their customers, employees, partners, and end users. 

The challenges ahead

The following are some challenges you may come across:

  • Bringing forth a bevy of complexity mitigation techniques. The formula is heterogeneity + multiplicity = complexity. IT (software and infrastructure) complexity is constantly improving.
  • Producing software packages that are fully compliant to various NFRs and QoS /QoE attributes, such as scalability, availability, stability, reliability, extensibility, accessibility, simplicity, performance/throughput, and so on.
  • Performing automated IT infrastructure provisioning, scheduling, configuration, monitoring, measurement, management, and governance.
  • Providing VM and container placement, serverless computing/Function as a Service (FaaS), workload consolidation, energy efficiency, task and job scheduling, resource allocation and usage optimization, service composition for multi-container applications, horizontal scalability, and Resource as a Service (RaaS).
  • Establishing IT automation, Integration, and orchestration for self-service, autonomous, and cognitive IT.
  • Accomplishing log, operational, and performance/scalability analytics using AI (machine and deep learning). Algorithms for producing real-time, predictive, and prescriptive insights.
  • Building technology sponsored solutions for enabling NoOps, ChatOps, and AIOps. The challenge is to bring forth viable and versatile solutions in the form of automated tools for fulfilling their unique requirements.
  • Container clustering, orchestration, and management platform solutions for producing, deploying, and sustaining microservices-centric software applications. 
  • Bringing forth versatile software solutions such as standards-compliant service mesh solutions, API gateways, and management suites for ensuring service resiliency. With more microservices and their instances across containers (service run time), the operational complexity is on the rise. 
  • Building resilient and reliable software through pioneering programming techniques such as reactive programming and architectural styles such as event-driven architecture (EDA). 

The idea is to clearly illustrate the serious differences between agile programming, DevOps, and the SRE movement. There are several crucial challenges ahead, as we have mentioned. And the role and responsibility of the SRE technologies, tools, and tips are going to be strategic and significant toward making IT reliable, robust, and rewarding.