Software Architecture Patterns for Serverless Systems - Second Edition

By: John Gilbert

Overview of this book

Organizations undergoing digital transformation rely on IT professionals to design systems that keep up with the rate of change while maintaining stability. This edition, enriched with more real-world examples, equips you to architect systems that support continuous innovation. The book guides you through the architectural patterns that power enterprise-grade software systems, explores key architectural elements (such as event-driven microservices and micro frontends), and shows how to implement anti-fragile systems. First, you'll divide up a system and define boundaries so that your teams can work autonomously and accelerate innovation. You'll cover the low-level event and data patterns that support the entire architecture while getting up and running with the different autonomous service design patterns. This edition adds several new topics on security, observability, and multi-regional deployment. It focuses on best practices for security, reliability, testability, observability, and performance. You'll explore the methodologies of continuous experimentation, deployment, and delivery before delving into some final thoughts on how to start making progress. By the end of this book, you'll be able to architect your own event-driven, serverless systems that are ready to adapt and change.

Preface

Who this book is for

What this book covers

To get the most out of this book

Architecting for Innovation

Continuously delivering business value

Dissecting lead time

Dissecting integration styles

Enabling autonomous teams with autonomous services

Summary

Defining Boundaries and Letting Go

Learning the hard way

Building on proven concepts

Thinking about events first

Dividing a system into autonomous subsystems

Creating subsystem bulkheads

Dissecting an autonomous subsystem

Dissecting an autonomous service

Governing without impeding

Summary

Taming the Presentation Tier

Presentation tier innovation – zigzagging through time

Breaking up the frontend monolith

Dissecting micro frontends

Designing for offline-first

Summary

Trusting Facts and Eventual Consistency

Living in an eventually consistent world

Publishing to an event hub

Dissecting the Event Sourcing pattern

Event streams

Processing event streams

Designing for failure

Optimizing throughput

Summary

Turning the Cloud into the Database

Fighting data gravity

Embracing the data life cycle

Turning the database inside out

Dissecting the CQRS pattern

Keeping data lean

Implementing idempotence and order tolerance

Modeling data for operational performance

Leveraging change data capture

Summary

A Best Friend for the Frontend

Focusing on user activities

Dissecting the Backend for Frontend pattern

Dissecting function-level nano architecture

Choosing between REST and GraphQL

Implementing different kinds of BFF services

Summary

Bridging Intersystem Gaps

Creating an anti-corruption layer

Dissecting the External Service Gateway pattern

Integrating with third-party systems

Integrating with other subsystems

Integrating across cloud providers

Integrating with legacy systems

Providing an open API and SPI

Tackling common data challenges

Summary

Reacting to Events with More Events

Promoting inter-service collaboration

Dissecting the Control Service pattern

Orchestrating business processes

Employing the Saga pattern

Calculating event-sourcing snapshots

Implementing complex event processing (CEP) logic

Leveraging machine learning (ML) for control flow

Summary

Running in Multiple Regions

Justifying multi-regional deployment

Choosing a regional topology

Preparing for regional failover

Checking regional health

Configuring regional routing

Replicating across regions

Dissecting regional failover

Addressing intersystem differences

Implementing multi-regional cron jobs

Summary

Securing Autonomous Subsystems in Depth

Shared responsibility model

Securing cloud accounts

Securing CI/CD pipelines

Securing the perimeter

Securing the frontend

Securing BFF services

Redacting sensitive data

Securing ESG services

Auditing continuously

Summary

Choreographing Deployment and Delivery

Optimizing testing for continuous deployment

Focusing on risk mitigation

Achieving zero-downtime deployments

Planning at multiple levels

Turning the crank

Dissecting CI/CD pipelines

Summary

Optimizing Observability

Failing forward fast

Turning observability inside out

Leveraging FinOps

Collecting resource metrics

Tracking system events

Alerting on work metrics

Observing real user activity

Tuning continuously

Summary

Don’t Delay, Start Experimenting

Gaining trust and changing culture

Funding products, not projects

Dissecting the Strangler pattern

Addressing event-first concerns

Poly everything

Summary

Other Books You May Enjoy

Index

Dividing a system into autonomous subsystems

The goal of software architecture is to define boundaries that enable the components of the system to change independently. We could dive straight down into defining the individual services of the system, but as the number of services grows the system will become unwieldy. Doing so contributes to the creation of microliths and microservice death stars. As architects, our job is to facilitate change, which includes architecting a manageable system.

We need to step back and look at the bigger picture. We must break a complex problem down into ever-smaller problems that we can solve individually and then combine into the ultimate solution. We need to divide the system into a manageable set of high-level subsystems that each have a single reason to change. These subsystems will constitute the major bounded contexts of the system. We will apply the single responsibility principle (SRP) along different dimensions to help us arrive at boundaries that enable change. This will facilitate organizational scale, with separate groups managing the individual subsystems.

We also need our subsystems to be autonomous, in much the same way that we create autonomous services. This will give autonomous organizations the confidence to continuously innovate within their subsystems. We will accomplish this by creating bulkheads between the subsystems. A system resembling the event-first topology depicted in Figure 2.3 will begin to emerge. The purpose of each subsystem will be clear, and the subsystem architecture will allow the system to evolve in a dynamic business environment.

Let’s look at some ways we can divide up a system.

By actor

A logical place to start carving up a system into subsystems is along the external boundaries with the external actors. These actors are the users and the external systems that directly interact with the system. Following the SRP, each subsystem might be responsible to one and only one actor.

In our event storming example earlier, we identified a set of domain events for a food delivery system. During the event storming workshop, we would also identify the users (yellow) and external systems (pink) that produce or consume those events, such as in Figure 2.4. In this example, we might have a separate subsystem for each category of user: Customer, Driver, and Restaurant. We may also want a subsystem for each category of external system, such as relaying orders to the restaurant’s ordering systems, processing payments, and pushing notifications to customers:

Figure 2.4: System context diagram

Of course, this is a simple example. Enterprise systems may have many kinds of users and lots of external systems, including legacy and third-party systems. In this case, we will need to look for good ways to organize actors into cohesive groups and these groups may align with the business units.
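As a rough illustration of carving by actor, the following minimal sketch groups event storming output by the actor that produces each event. The event names and the `Payment Processor` actor are hypothetical placeholders, not taken from the book's workshop example; only the Customer, Driver, and Restaurant categories come from the text.

```python
from collections import defaultdict

# Hypothetical event storming output: (domain event, producing actor).
# Event names are illustrative assumptions, not the book's actual events.
EVENTS = [
    ("order-submitted", "Customer"),
    ("order-accepted", "Restaurant"),
    ("order-ready", "Restaurant"),
    ("delivery-picked-up", "Driver"),
    ("delivery-completed", "Driver"),
    ("payment-processed", "Payment Processor"),
]


def candidate_subsystems(events):
    """Group domain events by the actor that produces them."""
    groups = defaultdict(list)
    for event_type, actor in events:
        groups[actor].append(event_type)
    return dict(groups)


subsystems = candidate_subsystems(EVENTS)
for actor, event_types in subsystems.items():
    print(f"{actor} subsystem: {event_types}")
```

Each resulting group is only a candidate boundary. With many actors, the groups would still need to be merged into cohesive sets.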

By business unit

Another good place to look for architectural boundaries is between business units. A typical organizational chart can provide useful insights. Each unit will ultimately be the business owner of its subsystems and thus they will have a significant impact on when and how the system changes.

Keep in mind that we are interested in the organization of the company, not the IT department. Conway’s Law states that organizations are constrained to produce designs which are copies of their communication structure. We have seen that the communication structure leads to dependencies, which increase lead time and reduce the pace of innovation. So, we want to align each autonomous subsystem with a single business unit. We often refer to this approach as the Inverse Conway Maneuver.

However, the organizational structure of a company can be unstable. A company may reorganize its business units for a variety of reasons. So, we should look deeper into the work the business units actually perform.

By business capability

Ultimately, we want to draw our architectural boundaries around the actual business capabilities that the company provides. Each autonomous subsystem should encapsulate a single business capability or at most a set of highly cohesive capabilities.

Going back to our event-storming approach and our event-first thinking, we are looking for logical groupings of related events (that is, verbs). There will be high temporal cohesion within these sets of domain events. They will be initiated by a group of related actors that are working together to complete an activity. For instance, in our food delivery example, a driver may interact with a dispatch coordinator to ensure that a delivery is successful.

The key here is that the temporal cohesion of the activities within a capability helps to ensure that the components of a subsystem will tend to change together. This cohesion allows us to scale the SRP to the subsystem level when there are many different actors. The individual services within a subsystem will be responsible to the individual actors, whereas a subsystem is responsible to a single set of actors that work together in a business process (purple) to deliver a business capability:

Figure 2.5: Capabilities subsystems

Figure 2.5 depicts our food delivery system from the capabilities perspective. It is similar to the system context diagram in Figure 2.4, but the functionality is starting to take shape. However, we may find more subsystems when we look at the system from the perspective of the data life cycle.
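The relationship between services, actors, and capabilities described above can be sketched as a small data model. This is an assumption-laden illustration: the `Service` and `Subsystem` classes, the service names, and the `Delivery` capability are hypothetical. Only the driver and dispatch coordinator actors come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    name: str
    actor: str  # the one actor this service is responsible to


@dataclass
class Subsystem:
    capability: str
    services: list = field(default_factory=list)

    def actors(self):
        """The set of cooperating actors this subsystem is responsible to."""
        return {service.actor for service in self.services}


# Hypothetical subsystem: two services, each responsible to a single actor,
# grouped because their actors work together in one business process.
delivery = Subsystem(
    capability="Delivery",
    services=[
        Service(name="driver-service", actor="Driver"),
        Service(name="dispatch-service", actor="Dispatch Coordinator"),
    ],
)

print(delivery.capability, sorted(delivery.actors()))
```

Modeling `actor` as a single field on `Service` keeps the SRP at the service level by construction, while `actors()` exposes the wider set that the subsystem as a whole answers to.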

By data life cycle

Another place to look for architectural boundaries is along the data life cycle. Over the course of the life of a piece of data, the actors that use and interact with the data will change and so will their requirements. Bringing the data life cycle into the equation will help uncover some overlooked subsystems. We will usually find these subsystems near the beginning and the end of the data life cycle. In essence, we are applying the SRP all the way down to the database level. We want to discover all the actors that interact with the data so that we can isolate these sources of change into their own bounded contexts (that is, autonomous subsystems).

Going back to event-first thinking, we are stepping back and taking a moment to focus on the nouns (that is, domain aggregates) so that we can discover more verbs (that is, domain events) and the actors that produce those events. This will help find what I refer to as slow data (green). We typically zero in on the fast data (tan). This is the transactional data in the system that actors are continuously creating. However, the transactional data often relies on reference data, and government regulation may impose records management requirements on how long we must retain the transactional data. We want to decouple these sources of change so that they do not impact the flexibility and performance of the transactional and analytics data. We will cover this topic in detail in Chapter 5, Turning the Cloud into the Database:

Figure 2.6: Data life cycle subsystems

Figure 2.6 depicts subsystems from the data life cycle perspective. We will likely need a subsystem upstream that owns the master data model that all downstream subsystems use as reference data. And all the way downstream we will usually have an analytics subsystem and a records management subsystem. In the middle lie the transactional subsystems that provide the capabilities of the system, like we saw in Figure 2.5. We will also want to carve out subsystems for any legacy systems.
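The upstream-to-downstream ordering described above can be sketched as a simple lookup that answers which subsystems sit downstream of a given one. The stage labels and the transactional subsystem names are hypothetical. The master data, analytics, and records management subsystems follow the text.

```python
# Life cycle stages in upstream-to-downstream order.
# The midstream subsystem names are illustrative assumptions.
LIFE_CYCLE = [
    ("upstream", ["master-data"]),
    ("midstream", ["customer", "restaurant", "delivery"]),
    ("downstream", ["analytics", "records-management"]),
]


def downstream_of(subsystem):
    """Return the subsystems in later life cycle stages than the given one."""
    for index, (_, members) in enumerate(LIFE_CYCLE):
        if subsystem in members:
            later_stages = LIFE_CYCLE[index + 1:]
            return [name for _, names in later_stages for name in names]
    raise KeyError(f"unknown subsystem: {subsystem}")


print(downstream_of("master-data"))
print(downstream_of("delivery"))
```

A lookup like this makes it easy to see which subsystems consume data from a given subsystem, for instance when the master data model changes.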

By legacy system

Our legacy systems are a special case. In Chapter 7, Bridging Intersystem Gaps, and Chapter 13, Don’t Delay, Start Experimenting, we will cover an event-first migration pattern for integrating legacy systems known as the Strangler pattern. Without going into detail here, we are creating an anti-corruption layer around the legacy systems that enables them to interact with the new system by producing and consuming domain events. This creates an evolutionary migration path that minimizes risk by keeping the legacy systems active and synchronized until we are ready to decommission them. This extends the substitution principle to the subsystem level because we can simply remove the legacy subsystem once the migration is complete.

We are essentially treating the legacy systems as an autonomous subsystem with bulkheads that we design to eliminate coupling between the old and the new and to protect the legacy infrastructure by controlling the attack surface and providing backpressure. We will use the same bulkhead techniques we are using for all subsystems: separate cloud accounts and external domain events.
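To make the anti-corruption layer idea concrete, here is a minimal sketch of translating between a legacy record and a domain event in both directions. The legacy column names (`ORD_NO`, `CUST_NO`, `TOT_AMT_CENTS`), the `order-submitted` event type, and the event envelope fields are all hypothetical. The patterns the book actually uses are covered in Chapter 7 and Chapter 13.

```python
import uuid
from datetime import datetime, timezone


def legacy_row_to_event(row):
    """Translate a hypothetical legacy ORDERS row into a domain event."""
    return {
        "id": str(uuid.uuid4()),
        "type": "order-submitted",  # assumed event type
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "order": {
            "id": str(row["ORD_NO"]),
            "customerId": str(row["CUST_NO"]),
            "totalCents": row["TOT_AMT_CENTS"],
        },
    }


def event_to_legacy_row(event):
    """Translate a domain event back into the hypothetical legacy row shape."""
    order = event["order"]
    return {
        "ORD_NO": int(order["id"]),
        "CUST_NO": int(order["customerId"]),
        "TOT_AMT_CENTS": order["totalCents"],
    }


legacy_row = {"ORD_NO": 1001, "CUST_NO": 42, "TOT_AMT_CENTS": 2599}
event = legacy_row_to_event(legacy_row)
round_trip = event_to_legacy_row(event)
print(event["type"], event["order"], round_trip == legacy_row)
```

Keeping the translation in a pair of small functions means the legacy naming never leaks into the domain events that the new subsystems consume, so the legacy side can later be removed without changing those consumers.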

Let’s look at how we can create these subsystem bulkheads next.
