Book Image

PostgreSQL 12 High Availability Cookbook - Third Edition

By : Shaun Thomas
Book Image

PostgreSQL 12 High Availability Cookbook - Third Edition

By: Shaun Thomas

Overview of this book

Databases are nothing without the data they store. In the event of an outage or technical catastrophe, immediate recovery is essential. This updated edition ensures that you will learn the important concepts related to node architecture design, as well as techniques such as using repmgr for failover automation. From cluster layout and hardware selection to software stacks and horizontal scalability, this PostgreSQL cookbook will help you build a PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. You’ll start by understanding how to plan a PostgreSQL database architecture that is resistant to outages and scalable, as it is the scaffolding on which everything rests. With the bedrock established, you'll cover the topics that PostgreSQL database administrators need to know to manage a highly available cluster. This includes configuration, troubleshooting, monitoring and alerting, backups through proxies, failover automation, and other considerations that are essential for a healthy PostgreSQL cluster. Later, you’ll learn to use multi-master replication to maximize server availability. Later chapters will guide you through managing major version upgrades without downtime. By the end of this book, you’ll have learned how to build an efficient and adaptive PostgreSQL 12 database cluster.
Table of Contents (17 chapters)

Architectural Considerations

In many ways, database server architecture is treated as a mere afterthought. It's often much easier to simply create a single node, install some software, and consider the whole affair resolved. If a company is particularly paranoid, they may even spare some thought for a replica server, or perhaps some kind of backup.

The true importance of database cluster architecture is easily overlooked as a result. But what is server architecture? Why does it matter?

Look down the street. Any street is fine. What do you see? Homes, offices, and buildings of various descriptions. With very rare exceptions, each one of these was meticulously planned, from the foundation to the walls to the electrical wires, pipes, up to the roof and drainage systems. A failure in any of these components could lead to the ultimate demise of the entire structure, given enough time.

The same also applies to a PostgreSQL cluster! Database architecture defines what goes into a database server cluster, and the reason for each element. How does it communicate? How many nodes are required? Where do we put those nodes, and why? What common problems are inherent in those decisions? How will our decisions influence the underlying cost? What trade-offs can we make, given some important constraints? How does all of this affect data availability? We need those answers before we even consider hardware or virtualization. There are many important considerations we must entertain when designing a highly available PostgreSQL cluster.

Why then is it so common for critical application and user data that drives the entire application stack behind the company itself to be treated so callously? We direct so much attention and focus on the application, with its various layers of indirection, queues, caches, container automation, and microarchitecture, that the data layer is overlooked or considered a nuisance.

This is actually highly understandable. In most cases, a PostgreSQL database layer demands an entirely different approach that development, system administration, and other information technology fields may not be entirely familiar with managing. Even experienced database administrators may not comprehend the scale and necessary theoretical concepts that drive the high availability of databases.

While we can't reduce the subtle art of database server architecture to a few memorable quips sure to entertain at parties, we can make the subject far more approachable. It shouldn't be necessary to have a Ph.D. in abstract theoretical frameworks to prevent a costly database outage.

In this chapter, we will learn how the layout of the nodes in our PostgreSQL cluster can drastically influence its availability. We will cover the following recipes:

  • Setting expectations with RPO
  • Defining timetables through RTO
  • Picking redundant copies
  • Selecting locations
  • Having enough backups
  • Considering quorum
  • Introducing indirection
  • Preventing split brain
  • Incorporating multi-master
  • Leveraging multi-master