
Having enough backups

Database backups are a crucial component of any architecture and should be considered a required part of the central design. The only real question in most cases is: how many backups? Every highly available cluster must account for the relevant backup copies, lest the cluster itself be lost.

In this recipe, we'll cover one simple set of rules to provide an answer.

Getting ready

This is very important, so write it down if necessary. Put it in company documentation if possible.

How to do it...

When considering how many backups to allocate, follow the 3-2-1 backup rule, which consists of these elements:

  1. Keep at least three copies of your data.
  2. Store two copies on different devices.
  3. Keep at least one copy offsite.

Take note that many things may qualify as a copy, including PostgreSQL replicas and the original data itself.
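
As a concrete illustration, the following sketch produces one such copy with a compressed base backup of the local instance; the backup directory and the postgres user are assumptions for this example, not a prescription.

    # Take a compressed base backup of the local instance, including the WAL
    # needed to make it consistent. Paths and the backup user are placeholders.
    pg_basebackup -U postgres -D /var/backups/pg/base_$(date +%Y%m%d) -Ft -z -X stream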

How it works...

Notice how we don't really recommend a specific backup method, how it should be stored, or how filesystem-based features might contribute. All of those things are implementation details and don't matter as much as the rules themselves.

Consider the first rule: keep at least three copies of your data. Since our PostgreSQL instance is the first copy, we need two more. What might these two be? Could we use one replica and a backup? Maybe we could use two replicas? Perhaps, but let's examine the remaining rules first.

Imagine we've produced a PostgreSQL backup and stored it on our primary server, on the same filesystem as the database instance. What happens if that storage device is damaged or destroyed? Now we've lost both the database and the backup.

That naturally leads to the second rule: store two copies on different devices. It's fine to retain a copy on the local PostgreSQL server, even on the same physical storage device, provided another copy of the backup lives on a device that won't be lost at the same time. That second copy can go on a separate set of storage drives, a SAN, a shared NFS filesystem, or anything else, so long as it's separate from the database itself.
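
As a hedged sketch of that approach, the commands below keep one copy of the backup locally and push a second copy to a separate backup host; the host name and paths are assumptions for illustration.

    # First copy: a local base backup on the database server itself.
    pg_basebackup -U postgres -D /var/backups/pg/base_today -Ft -z -X stream

    # Second copy: mirror the backup to a physically separate machine so the
    # loss of the primary's storage doesn't take both copies with it.
    rsync -a /var/backups/pg/base_today/ backup-host:/srv/pg_backups/base_today/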

Be wary of relying on shared infrastructure for following the second rule. If we have two separate LUNs from the same SAN mounted on our database server for PostgreSQL and the backup, this means nothing if the SAN itself is lost. If possible, try to ensure that the backup is actually on a physically distinct device.

So, to fulfill the second rule, we merely need to ensure that the second copy of our PostgreSQL instance is on another device. This is most easily done by creating a replica on another server, or a VM hosted on a different hypervisor. That's two copies of the data.
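
One common way to create that second copy is a streaming replica cloned with pg_basebackup. The sketch below assumes hypothetical host names, a role with replication privileges named replication, and a Debian-style data directory.

    # Run on the standby server. -R writes standby.signal and the
    # primary_conninfo setting so the clone starts as a streaming replica.
    pg_basebackup -h primary-db -U replication -D /var/lib/postgresql/12/main \
        -R -X stream -P

    # Start the standby; it will connect to primary-db and begin replaying WAL.
    pg_ctl -D /var/lib/postgresql/12/main -l /tmp/standby.log start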

Finally, there's rule three: keep at least one copy offsite. This is the third copy of our data, and it's best to place it somewhere immune to a catastrophic failure of the data center itself. In limited circumstances, it may be safe enough to place the backup on a server in another rack of the same data center, but why take the risk?

There are ample cloud providers, vault systems, and cheap storage services that can fill the role of hosting the third and final copy of our data. If we have our own second data center, that's an easy choice of venue. If not, it's important to select, allocate, and designate some tertiary location that won't be lost if the worst happens. This third data copy is an insurance policy, and it doesn't even have to be easily available. As long as we can obtain the backup upon request, that fits the minimum requirement.
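
As one hedged possibility among many, the following ships a compressed backup to an object storage bucket; the bucket name is a placeholder, and any offsite target (a second data center, a vault service, and so on) works equally well.

    # Bundle the backup directory and copy it offsite. The bucket is a
    # placeholder; the only requirement is that this copy survives the loss of
    # the primary data center.
    tar -C /var/backups/pg -czf base_today.tar.gz base_today
    aws s3 cp base_today.tar.gz s3://example-offsite-bucket/pg/base_today.tar.gz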

There's more...

There's one important corollary here. PostgreSQL replicas tend to immediately reflect every change made to the primary node. What happens if someone accidentally drops a table? In this case, the 3-2-1 rule by itself is not sufficient. Relying on replicas alone means we've permanently lost this data.

Thus, we strongly recommend the following additional rule: At least one copy must be a true backup.

Databases such as PostgreSQL are equipped with Point-In-Time Recovery (PITR), which allows the user to start from any past backup and apply changes until recovery is stopped at a specific point. This lets us recover a backup to the point just before a table was damaged or removed, capture the desired contents, and reintroduce them into the original database. This can only be done with a real binary-level backup, and there are multiple tools dedicated to performing this task. This book even contains a chapter dedicated to Backup Management.
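
To make that concrete, here is a hedged PostgreSQL 12 sketch. It assumes a base backup has already been restored into a scratch data directory and that WAL is archived under /srv/pg_wal_archive; both paths and the target time are assumptions for illustration.

    # Recover a restored base backup up to a point just before the table was
    # dropped. RESTORE_DIR must already contain the extracted base backup.
    RESTORE_DIR=/var/lib/postgresql/12/restore

    echo "restore_command = 'cp /srv/pg_wal_archive/%f \"%p\"'" >> "$RESTORE_DIR/postgresql.conf"
    echo "recovery_target_time = '2021-06-01 11:59:00'"         >> "$RESTORE_DIR/postgresql.conf"
    echo "recovery_target_action = 'pause'"                     >> "$RESTORE_DIR/postgresql.conf"

    # In PostgreSQL 12, the presence of recovery.signal puts the server into
    # targeted recovery mode when it starts.
    touch "$RESTORE_DIR/recovery.signal"
    pg_ctl -D "$RESTORE_DIR" -l /tmp/pitr_restore.log start

Once recovery pauses at the target, the affected table's contents can be dumped from the scratch instance (with pg_dump, for example) and reloaded into the original database.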

See also