Software Architecture Patterns for Serverless Systems - Second Edition

By: John Gilbert

Overview of this book

Organizations undergoing digital transformation rely on IT professionals to design systems that keep up with the rate of change while maintaining stability. With this edition, enriched with more real-world examples, you’ll be equipped to architect the future for unparalleled innovation. This book guides you through the architectural patterns that power enterprise-grade software systems, explores key architectural elements (such as event-driven microservices and micro frontends), and shows you how to implement anti-fragile systems. First, you'll divide up a system and define boundaries so that your teams can work autonomously and accelerate innovation. You'll cover the low-level event and data patterns that support the entire architecture while getting up and running with the different autonomous service design patterns. This edition adds several new topics on security, observability, and multi-regional deployment. It focuses on best practices for security, reliability, testability, observability, and performance. You'll explore the methodologies of continuous experimentation, deployment, and delivery before delving into some final thoughts on how to start making progress. By the end of this book, you'll be able to architect your own event-driven, serverless systems that are ready to adapt and change.

Governing without impeding

As architects, once we have defined the architectural boundaries of the system, we need to let go and get out of the way, unless we want to become an impediment to innovation. But letting go is difficult. It goes against our nature; we like to be hands-on. And it flies in the face of traditional governance techniques. But we must let go for the sake of the business, whether the business realizes this or not.

Governance has an understandable reputation for getting in the way of progress and innovation. Although it has good intentions, the traditional manual approach to governance actually increases risk, instead of reducing it, because it increases lead time, which diminishes an organization’s ability to react to challenges in a modern dynamic environment. But it doesn’t have to be this way.

We have already taken major strides to mitigate the risks of continuous innovation. We define architectural boundaries that limit the scope of any given change, and we fortify these boundaries to control the blast radius when honest human errors happen. We do this because we know that to err is human. We know that mistakes are inevitable, no matter how rigorous a governance process we follow.

Instead of impeding innovation, we must empower teams with a culture and a platform that embrace continuous governance. This is a safety net that gives teams and management confidence to move forward, knowing that we can catch mistakes and make corrections in real time. Automation and observability are the key elements of continuous governance. Let’s see how we can put this safety net in place and foster a culture of robustness.

Providing automation and cross-cutting concerns

A major objective of governance is to ensure that a system is compliant with regulations and best practices. These include the typical -ilities, such as scalability and reliability, and of course security, along with regulations and standards such as NIST, PCI, GDPR, and HIPAA. The traditional approach includes manual audits of the architecture. These gates are the reason governance has a reputation for impeding progress. They are labor intensive and, worse yet, error prone.

Fortunately, we now have a better option. Our deployments are fully automated by our CI/CD pipelines. This is already a significant improvement in quality because Infrastructure as Code reduces human error and enables us to quickly fail forward. We still have some manual gates for each deployment.

The first gate is code review and approval of a pull request. We perform this gate quickly because each task branch has a small batch size. The second gate is the certification of a regional canary deployment. We deploy to one region for continuous smoke testing before deploying to other regions. We will cover CI/CD pipelines in detail in Chapter 11, Choreographing Deployment and Delivery.
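The second gate can be pictured with a small sketch. This is only an illustration under stated assumptions: the `deploy` and `certify` callbacks are hypothetical stand-ins for a CI/CD pipeline step and for the certification decision that follows smoke testing, and the function name is invented for this example.

```python
from typing import Callable, Sequence


def deploy_with_canary(
    regions: Sequence[str],
    deploy: Callable[[str], None],
    certify: Callable[[str], bool],
) -> list[str]:
    """Deploy to the first region, then to the rest only if it is certified.

    Returns the regions that received the deployment.
    """
    if not regions:
        return []
    canary, *others = regions
    deploy(canary)  # the canary region gets the change first
    if not certify(canary):
        # Stop here; the other regions keep running the previous version.
        return [canary]
    for region in others:
        deploy(region)
    return list(regions)
```

If certification fails, only the canary region has the change, which keeps the blast radius of a bad deployment to a single region.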

We also have observability, which provides timely, actionable information so that we know when to jump into action and we can recover quickly. We will cover this in Chapter 12, Optimizing Observability. We will take automation further and harden our build processes by adding continuous auditing and securing the perimeter of our subsystems and our cloud accounts. We will cover these topics in Chapter 10, Securing Autonomous Subsystems in Depth.

However, these are all cross-cutting concerns, and we don’t want teams to reinvent these capabilities for each autonomous subsystem. We need a dedicated team with the knowledge and specialized skills to manage an integrated suite of SaaS tools, stamp out accounts with a standard set of capabilities, and maintain these cross-cutting concerns for use across the accounts. Yet, the owners of each autonomous subsystem must have control over when to apply changes to their accounts and have the flexibility to override and/or enhance features as their circumstances dictate.

Even with these cross-cutting concerns in place, the reality is that many aspects of the approach and architecture are new and unfamiliar, so the next part of the governance equation is promoting a culture of robustness.

Promoting a culture of robustness

Our goal of increasing the pace of innovation leads us to a rapid feedback loop with small batch sizes and short lead times. We are deploying code much more frequently and these deployments must result in zero downtime. To eliminate downtime, we must uphold the contracts we have defined within the system. However, traditional versioning techniques fall apart in a dynamic environment with a high rate of change. Instead, we will apply the Robustness principle.

The Robustness principle states: be conservative in what you send, be liberal in what you receive. This principle is well suited for continuous deployment, where we can perform a successive set of deployments to make a conforming change on one side of a contract, followed by an upgrade on the other side and then another on the first side to remove the old code. The trick is to develop a culture of robustness where this three-step dance is committed to team muscle memory and becomes second nature.
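One way to picture the three-step dance is a field rename in an event contract. The following is a minimal sketch, assuming a hypothetical `customer-updated` event whose `name` field is being replaced by `fullName`; the event shape and function names are invented for illustration and are not taken from the book.

```python
from typing import Any


def send_step1(full_name: str) -> dict[str, Any]:
    """Step 1 (sender): conforming change. Add the new field, keep the old one."""
    return {"type": "customer-updated", "name": full_name, "fullName": full_name}


def receive_legacy(event: dict[str, Any]) -> str:
    """Receiver before its upgrade: reads the old field and ignores unknown fields."""
    return event["name"]


def receive_step2(event: dict[str, Any]) -> str:
    """Step 2 (receiver): upgrade to the new field, still tolerating the old one."""
    return event["fullName"] if "fullName" in event else event["name"]


def send_step3(full_name: str) -> dict[str, Any]:
    """Step 3 (sender): once receivers have upgraded, remove the old field."""
    return {"type": "customer-updated", "fullName": full_name}
```

Each step is a separate deployment, and at every point in the sequence the deployed sender and receiver remain compatible, so no deployment requires downtime or a coordinated release.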

In Chapter 11, Choreographing Deployment and Delivery, we will cover a lightweight continuous delivery process that is geared for robustness. It includes three levels of planning, GitOps, CI/CD pipelines, regional canary deployment, and more. It forms a simple automated bureaucracy that governs each deployment but leaves the order of deployments completely flexible.

In my experience, autonomous teams are eager to adopt a culture of robustness, especially once they get a feel for how much more productive and effective they can become. But this is a paradigm shift, and it is unfamiliar from a traditional governance perspective. Everyone must have the confidence to move at this pace. As architects, we need to be evangelists and promote this cultural change, both upstream and downstream. We need to educate everyone on how everything we are doing comes together to provide a safety net for continuous discovery.

Finally, let’s see how metrics can guide governance.

Harnessing the four key team metrics

Observability metrics are an indispensable tool in modern software development. We cover this topic in detail in Chapter 12, Optimizing Observability. Autonomous teams are responsible for leveraging the observability metrics of their apps and services as a tool for self-governance and self-improvement. In my experience, teams truly value these insights and thrive on the continuous feedback.

From a systemwide governance perspective, we should focus our energy on helping teams that are struggling. In their article Measure Software Delivery Performance with Four Key Metrics (https://itrevolution.com/articles/measure-software-delivery-performance-four-key-metrics), Nicole Forsgren, Gene Kim, and Jez Humble put forth four metrics that we can harness to help us identify which teams may need more assistance and mentoring:

Lead time: How long does it take a team to complete a task and push the change to production?
Deployment rate: How many times a day is a team deploying changes to production?
Failure rate: How often does a deployment result in a failure that impacts a generally available feature?
Mean Time to Recovery (MTTR): When a failure does occur, how long does it take the team to fail forward with a fix?

The answers to these questions clearly indicate the maturity of a specific team. We certainly prefer lead time, failure rate, and MTTR to be low and deployment rate to be high. Teams that are having trouble with these metrics are usually going through their own digital transformation and are eager to receive mentoring and coaching. We can collect metrics from our issue-tracking software and CI/CD tool and track them alongside all the others in our observability tool.
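As a rough sketch of how the four metrics could be derived from collected data, the following assumes a hypothetical `Deployment` record that joins issue-tracker and CI/CD timestamps. The record shape, field names, and the choice to measure recovery from the time of the failing deployment are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    task_started: datetime                # from the issue tracker
    deployed: datetime                    # from the CI/CD tool
    failed: bool = False                  # impacted a generally available feature
    recovered: Optional[datetime] = None  # when the fix reached production


def four_key_metrics(deployments: list[Deployment], days: int) -> dict[str, float]:
    """Summarize one team's deployments over a reporting window of `days` days."""
    if not deployments or days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    fixed = [d for d in failures if d.recovered is not None]
    lead_seconds = sum((d.deployed - d.task_started).total_seconds() for d in deployments)
    fix_seconds = sum((d.recovered - d.deployed).total_seconds() for d in fixed)
    return {
        "lead_time_hours": lead_seconds / len(deployments) / 3600,
        "deployments_per_day": len(deployments) / days,
        "failure_rate": len(failures) / len(deployments),
        # Reported as 0.0 when there were no recovered failures in the window.
        "mttr_hours": fix_seconds / len(fixed) / 3600 if fixed else 0.0,
    }
```

Tracking these values per team over time, rather than as one-off snapshots, makes it easier to spot the teams whose trends suggest they would benefit from mentoring.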
