The Platform Engineer's Handbook
Welcome to The Platform Engineer's Handbook. Platform engineering is the discipline of designing, building, and operating internal engineering platforms that turn the messy reality of modern infrastructure into self-service capabilities developers actually want to use. It sits at the crossroads of software engineering, infrastructure, and product thinking, and it is quickly becoming the operating model that mature engineering organizations adopt to ship faster, more safely, and at scale.
This is a hands-on, practical book. Rather than describing platform engineering at a strategic level, we are going to build a platform. Together, we will design, deploy, and progressively harden a working engineering platform that runs on your local machine, exercising every capability through a real demo application as the platform evolves. The journey is organized around the three things every successful platform team has to get right:
Part 1 of the book, Designing, Building, and Deploying the Core Engineering Platform, lays down the foundations: the design principles that separate a real engineering platform from a pile of automation, a Kubernetes-based runtime with a service mesh, secure platform access through OAuth, and end-to-end observability with Prometheus, Loki, Tempo, and Grafana. We close Part 1 by deploying a demo application onto the platform and evaluating the developer experience as a first-class outcome.
Part 2 turns that core into something developers actually want to use. We deploy and curate a Backstage developer portal, build self-service onboarding so new teams can spin up everything they need with a single command, treat CI/CD as a composable platform service, layer in self-service infrastructure with Crossplane, and publish opinionated starter kits to the portal so a brand-new project can ship to production from day one.
Part 3 takes the platform from useful to enterprise-ready. We validate compliance with policy as code using Open Policy Agent, optimize cost and performance with KubeCost and autoscaling, automate resiliency through SLOs, backup and restore, and chaos engineering, and finally explore how AI agents and generative tooling can augment the platform itself, from copilots for pipelines to agentic incident response.
Throughout the book, you will work with a curated, vendor-agnostic toolchain — Kubernetes (via Kind), Istio, Auth0, Prometheus, Loki, Tempo, Grafana, Backstage, GitHub Actions, Crossplane, OPA Gatekeeper, KubeCost, Sloth, Velero, and others — chosen because they let you build a credible MVP platform on a single developer machine. The patterns and decisions we work through, however, transfer cleanly to whatever stack your organization has standardized on.
The complete code accompanying every chapter is published on the book's GitHub repository, organized chapter by chapter in the code/ folders, so you can follow along, fork what works for you, and skip what does not. Do not be discouraged by the 50,000 lines of code you see in the repo. Once you understand the repository's structure and premise, it becomes easy to follow along.
The book is ambitious and is built around a single, evolving worked example: an MVP engineering platform that you assemble piece by piece, deploying a demo microservices application onto it again and again as the platform matures. Each chapter ends with a tangible artifact you can keep and a clear before-and-after view of how the change affected the developer experience.
In writing this book, my goal has been to share the patterns, trade-offs, and hard-won lessons I have collected while planning, sponsoring, and building engineering platforms across very different organizations, work that now includes supervising AI agents for those platforms. The aim is to give you a book you can read end to end, but also return to as a working reference whenever you are wrestling with a specific decision on your own platform.
This book is for practicing intermediate-to-senior platform engineers, software engineers, SREs, and DevOps practitioners with prior experience in cloud-native software development. Whether you are a developer with solid software engineering fundamentals or a DevOps practitioner looking to enable teams at scale, this book equips you with the practical knowledge and patterns needed to design and build engineering platforms that are relevant, usable, and impactful for their end users. It is geared toward hands-on practitioners who are in the code, shaping the infrastructure, processes, and techniques that enable product delivery, rather than high-level decision-makers.
Chapter 1, Platform Engineering: Laying the Groundwork, defines what platform engineering means in the context of this book, introduces the high-level architecture of a modern engineering platform, and walks through the source-control foundation — domain-bounded repositories, commit conventions, branching and tagging strategy, and quality gates — that everything else in the book builds on.
Chapter 2, Scalable Platform Runtime with Kubernetes and Service Mesh, deploys the core Kubernetes runtime for the platform, separates platform environments from user environments, and configures a service mesh for ingress and cross-tenant communication, giving you a solid and scalable foundation for everything that follows.
Chapter 3, Securing Platform Access, secures access to the platform itself, covering identity and access management with OAuth via Auth0, secrets and certificate management, zero-trust networking, and audit and compliance logging, with least-privilege access and automated TLS-secured ingress as the defaults.
Chapter 4, Embedding Observability, embeds observability into the platform from day one. You will deploy Prometheus, Loki, Tempo, and Grafana, configure SSO across the observability tooling, gather automatic telemetry, and build dashboards for multiple personas, including developers, security, and operations.
Chapter 5, Evaluate the User Experience, establishes the developer-experience baseline by deploying a microservices demo application to the platform as a user would, instrumenting it for observability, and exposing it on a public URL — the first of many checkpoints where you will experience your own platform from the outside in.
Chapter 6, Accelerating DevEx: Deploying and Curating Your First Developer Portal, deploys a Backstage developer portal, configures it incrementally with role-based navigation, SSO, and a service catalog, and publishes the demo application to the portal automatically when it deploys, turning the platform into a single pane of glass for your engineering organization.
Chapter 7, Self-Service Platform Onboarding, builds the platform's first custom service: an API-driven onboarding workflow that lets users grant platform access, scaffold source repositories, and provision deployment namespaces with a single operation, dramatically reducing time-to-first-deploy for new teams.
Chapter 8, CI/CD as a Platform Service, turns CI/CD into a true platform capability through reusable, versioned, composable pipeline tasks and templates, enabling progressive delivery and rollbacks, embedding compliance gates by default, and adding observability to the pipelines themselves.
Chapter 9, Self-Service Infrastructure Management, enables teams to deploy databases, caches, AI resources, and other infrastructure on demand using Crossplane and configuration-driven blueprints, with sensible per-environment defaults and guardrails that protect governance without blocking innovation.
Chapter 10, Publishing Starter Kits, closes the inner loop by creating versioned starter kits for common application types and publishing them to the developer portal so a brand-new project arrives with a codebase, configuration, CI/CD pipeline, and production-ready defaults already wired up.
Chapter 11, Validating Compliance with Policy-as-Code, deploys OPA Gatekeeper as an admission controller, teaches you to author policies in Rego, shifts policy testing left so non-compliant applications never reach the platform, and publishes compliance dashboards alongside the rest of the observability stack.
Chapter 12, Optimize Cost, Performance, and Scalability, adds FinOps observability with OpenCost, configures horizontal and vertical pod autoscaling, walks through right-sizing and the use of spot and large-capacity instances, and sets up cost dashboards and alerts so cost becomes a first-class engineering concern instead of a finance afterthought.
Chapter 13, Resilience Automation, defines and publishes SLOs with Sloth, automates backup and restore with Velero, runs chaos engineering experiments against the demo application, and works through disaster-recovery patterns that let you restore the entire platform to a known-good state.
Chapter 14, Agentic and AI-Augmented Platforms, looks at how AI agents and generative tooling are reshaping the platform itself, covering AI-assisted CI/CD pipeline generation, agent-based incident triage and root-cause analysis, embedded code generation, governance and observability for AI-augmented capabilities, and the metrics that tell you whether any of it is actually working.
The code bundle for the book is hosted on GitHub at https://github.com/achankra/peh. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781806380138
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "The test suite in the GitHub repo for this chapter, test_templates.py, validates starter kit templates at two levels."
A block of code is set as follows:
port:
  title: Port
  type: number
  description: Service port
  default: 8080
Any command-line input or output is written as follows:
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/v3.14.0/deploy/gatekeeper.yaml
Bold: Indicates a new term, an important word, or words that you see on the screen, such as words in menus or dialog boxes. For example: "Use the Backstage scaffolder when you're operating at organizational scale."
Warnings or important notes appear like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book or have any general feedback, please email us at [email protected] and mention the book's title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packt.com/submit-errata, click Submit Errata, and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packt.com/.
Once you've read The Platform Engineer's Handbook, we'd love to hear your thoughts! Scan the QR code below to go straight to the Amazon review page for this book and share your feedback.

https://packt.link/r/1806380137
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.