PostgreSQL High Availability Cookbook

PostgreSQL High Availability Cookbook - Second Edition

By : Shaun Thomas

Buy this Book

PostgreSQL High Availability Cookbook - Second Edition

By: Shaun Thomas

Buy this Book

Overview of this book

Databases are nothing without the data they store. In the event of a failure - catastrophic or otherwise - immediate recovery is essential. By carefully combining multiple servers, it’s even possible to hide the fact a failure occurred at all. From hardware selection to software stacks and horizontal scalability, this book will help you build a versatile PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. It all begins with hardware selection for the skeleton of an efficient PostgreSQL database cluster. Then it’s on to preventing downtime as well as troubleshooting some real life problems that administrators commonly face. Next, we add database monitoring to the stack, using collectd, Nagios, and Graphite. And no stack is complete without replication using multiple internal and external tools, including the newly released pglogical extension. Pacemaker or Raft consensus tools are the final piece to grant the cluster the ability to heal itself. We even round off by tackling the complex problem of data scalability. This book exploits many new features introduced in PostgreSQL 9.6 to make the database more efficient and adaptive, and most importantly, keep it running.

Title Page

Credits

About the Author

About the Reviewer

www.Packtpub.com

Customer Feedback

Preface

Free Chapter

Hardware Planning

Introduction

Planning for redundancy

Making the most of memory

Exploring nimble networking

Managing motherboards

Handling and Avoiding Downtime

Introduction

Determining acceptable losses

Configuration - getting it right the first time

Configuration - managing scary settings

Identifying important tables

Defusing cache poisoning

Exploring the magic of virtual IPs

Terminating rogue connections

Reducing contention with concurrent indexes

Managing system migrations

Managing software upgrades

Mitigating the impact of hardware failure

Applying bonus kernel tweaks

Pooling Resources

Introduction

Determining connection costs and limits

Installing PgBouncer

Configuring PgBouncer safely

Connecting to PgBouncer

Listing PgBouncer server connections

Listing PgBouncer client connections

Evaluating PgBouncer pool health

Installing pgpool

Configuring pgpool for master/slave mode

Testing a write query on pgpool

Swapping active nodes with pgpool

Combining the power of PgBouncer and pgpool

Troubleshooting

Introduction

Performing triage

Installing common statistics packages

Evaluating the current disk performance with iostat

Tracking I/O-heavy processes with iotop

Viewing past performance with sar

Correlating performance with dstat

Interpreting /proc/meminfo

Examining /proc/net/bonding/bond0

Checking the pg_stat_activity view

Checking the pg_stat_statements view

Deciphering database locks

Debugging with strace

Logging checkpoints properly

Monitoring

Introduction

Figuring out what to monitor

Installing and configuring Nagios

Configuring Nagios to monitor a database host

Enhancing Nagios with check_mk

Getting to know check_postgres

Installing and configuring collectd

Adding a custom PostgreSQL monitor to collectd

Installing and configuring Graphite

Adding collectd data to Graphite

Building a graph in Graphite

Customizing a Graphite graph

Creating a Graphite dashboard

Replication

Introduction

Deciding what to copy

Securing the WAL stream

Setting up a hot standby

Upgrading to asynchronous replication

Bulletproofing with synchronous replication

Faking replication with pg_receivexlog

Setting up Slony

Copying a few tables with Slony

Setting up Bucardo

Copying a few tables with Bucardo

Setting up Londiste

Copying a few tables with Londiste

Setting up pglogical

Copying a few tables with pglogical

Replication Management Tools

Introduction

Deciding when to use third-party tools

Installing and configuring Barman

Backing up a database with Barman

Restoring a database with Barman

Installing and configuring OmniPITR

Managing WAL files with OmniPITR

Installing and configuring repmgr

Cloning a database with repmgr

Swapping active nodes with repmgr

Installing and configuring walctl

Cloning a database with walctl

Managing WAL files with walctl

Installing and configuring WAL-E

Managing WAL files with WAL-E

Simple Stack

Introduction

Preparing systems for the stack

Installing and configuring etcd

Installing and configuring Patroni

Installing and configuring HAProxy

Performing a managed failover

Using an outage to test availability

Adding a node back into the cluster

Adding additional nodes to the mix

Replacing etcd with ZooKeeper

Replacing etcd with Consul

Upgrading while staying online

Advanced Stack

Introduction

Preparing systems for the stack

Getting started with the Linux Volume Manager

Adding block-level replication

Incorporating the second LVM layer

Verifying a DRBD filesystem

Correcting a DRBD split brain

Formatting an XFS filesystem

Tweaking XFS performance

Maintaining an XFS filesystem

Using LVM snapshots

Switching live stack systems

Detaching a problematic node

Cluster Control

Introduction

Installing the necessary components

Configuring Corosync

Preparing startup services

Starting with base options

Adding DRBD to cluster management

Adding LVM to cluster management

Adding XFS to cluster management

Adding PostgreSQL to cluster management

Adding a virtual IP to hide the cluster

Adding an e-mail alert

Grouping associated resources

Combining and ordering related actions

Performing a managed resource migration

Using an outage to test migration

Data Distribution

Introduction

Identifying horizontal candidates

Setting up a foreign PostgreSQL server

Mapping a remote user

Creating a foreign table

Using a foreign table in a query

Optimizing foreign table access

Transforming foreign tables into local tables

Creating a scalable nextval replacement

Building a sharding API

Talking to the right shard

Moving a shard to another server

Customer Reviews

5 star

4 star

3 star

2 star

1 star

Exploring nimble networking

The network card enables the database server to exchange data with the outside world. This includes far more than web servers, spreadsheets, loading jobs, application servers, and other data consumers. The database server is part of a large continuum of activity, much of which will center around maintenance, management, and even filesystem availability.

Little of this other traffic involves PostgreSQL directly. Much happens in the background regardless of the database and its current workload. Yet even one mishandled network packet across an otherwise normal driver can render the entire server invisible to the outside world, or in extreme cases, even lead to a system panic and subsequent shutdown. On a busy database server, network cards can handle several terabytes of traffic on a daily basis; the margin of error for such a critical piece of hardware is exceptionally slim.

What's more, network bandwidth can easily be saturated by an aggressive backup strategy, which is something critical to a highly-available database. For PostgreSQL systems utilizing streaming replication or WAL archival, that traffic contributes quite a bit of bandwidth to the overall picture. If our backups are delayed, or replicas sit idle waiting for network packets, our exposure to risk is high indeed.

That's not to say everything is doom and gloom! With the right network setup and accompanying hardware, there should be more than enough room for any and all traffic our database server needs. Let's explore all the copious options for connecting our database to the outside world, and making sure it stays there.

Getting ready

This is one of those times it pays to do research. At the time of writing, the current high-speed network standards include 1 Gb/s, 10 Gb/s, 40 Gb/s, and even 100 Gb/s Ethernet. However, 40 Gb/s network cards are still extremely rare, and 100 Gb/s is generally reserved for fiber-based switches and data center use.

This means we will be covering 1 Gb/s and 10 Gb/s interfaces. While we will do our best to outline all of the important aspects of these technologies to simplify the process, we strongly encourage using the Internet to validate current availability and performance characteristics.

How to do it...

Let's begin with a few basic calculations. Look at these following numbers that represent an estimate of interface speed after accounting for overhead:

1000 Mb/s * B/10 b = 100 MB/s
10,000 Mb/s * B/10 b = 1,000 MB/s

Next, consider how many ways this will be distributed. If we have an existing PostgreSQL setup, follow these steps:

Execute the following query to determine the number of existing replicas:

        SELECT count(1)+1 AS streams 
          FROM pg_stat_replication;

Multiply streams by 160 for maximum MB/s needed by replication streams.
Execute the following queries together in a psql connection during a busy time of day on a production database:

        SELECT SUM(pg_stat_get_db_tuples_fetched(oid)) AS count1 
          FROM pg_database; 
        SELECT pg_sleep(1); 
        SELECT SUM(pg_stat_get_db_tuples_fetched(oid)) AS count2 
          FROM pg_database;

Subtract the results of count1 from count2 for the number of rows fetched from the database per second.
Divide the number of rows per second by 10,000 for MB/s used by PostgreSQL connections.
Add MB/s for streams to MB/s for connections.

Without an existing database, follow these steps for some basic bandwidth numbers:

Multiply the desired number of PostgreSQL replicas by 160 for the maximum MB/s needed by replication streams.
Assume one WAL stream for an offsite disaster recovery database copy.
Start with at least one live hot streaming standby copy.
Include any additional database mirrors.
Estimate the active client count as discussed in the Picking a Processor recipe.
Multiply the active client count estimate by 5 for MB/s used by PostgreSQL connections.
Add MB/s for streams to MB/s for connections.

No matter which checklist we follow, we should double the final tally.

How it works...

If we have an existing database, there is a wealth of statistical information at our fingertips. The first query we ran gave us a slightly inflated count of copies of our database. For each copy, data must be transferred from the database to another server. This data is based on PostgreSQL WAL output, and these files are 16 MB each. A busy server can produce more than ten of these per second, so we multiply the count of streams by 160 to produce an aggressive amount of network overhead used by database replicas. As usual, this may be overzealous; it's always best to observe an actual system to measure maximum WAL segments generated during heavy write loads.

Note

In PostgreSQL 9.2 and higher, database replicas can stream from other database replicas. This means network traffic can be distributed better among streaming clients, reducing network bandwidth pressure on production systems. PostgreSQL 9.2 also allows direct backup of streaming replicas. This means one or two replicas may be the most the production database ever needs to supply with WAL traffic.

For the next set of numbers, we need to know how much data database connections commonly retrieve. PostgreSQL tracks the number of table rows fetched, but it's a cumulative total. By waiting until a busy time of day and asking the database how many rows have been fetched before and after a one-second wait, we know how many rows are fetched per second.

However, we still don't know how many bytes these rows consume. A good estimate of this is 100 bytes per row. Then we only have to multiply the number of rows by 100 to find the amount of bandwidth we would need. So why do we divide by 10,000? What's 10,000 multiplied by 100? One million. On dividing by 10,000, we produce the number of megabytes per second that those tuple fetches probably used.

Tip

If an average of 100 bytes per row isn't good enough, we can connect to one of our primary databases and ask what the average is. Use this query: SELECT sum(pg_relation_size(oid)) / sum(reltuples) FROM pg_class;

By adding the amount of streaming traffic to the amount of connection traffic, we have a good, if slightly inflated, idea of how much bandwidth the server needs.

Without a working database to go by, we need to use a few guesses instead. Luckily, the number of streams for a reliable database infrastructure starts at two: one for a live standby, and one for an offsite archive. Each additional desired mirror should increase this total. Again, we multiply by 160 to obtain the maximum megabytes per second that all these streams are likely to require.

The amount of bandwidth client connections use is slightly harder to estimate. However, if we worked through previous chapter sections, we have a CPU estimate, which also tells us the maximum number of database clients that the server can reliably support. If we take that value and multiply by five, that provides a rough value in megabytes per second as well.

Again, we just add those two totals together, and we know the minimum speed of our network.

Finally, we multiply the final tally by two, to account for any unknown maintenance, backup, and filesystem synchronization overhead.

There's more...

Besides producing an estimate through some simple calculations, we also want to make note of a few other networking details.

A networking example

This may be easier to visualize with a real example. Let's start with a very active database that has one streaming replica, and one offsite archive. Furthermore, connected clients regularly fetch five million rows per second. Now, let's go through our steps:

2 * 160 = 320 MB/s.
5,000,000 / 10,000 = 50 MB/s.
320 + 50 = 370 MB/s.
370 * 2 = 740 MB/s.

That's a very high value! A 1 Gb/s interface can only supply 100 MB/s at most, so we would need eight of those to produce the necessary bandwidth. Yet a 10 Gb/s interface can supply 1000 MB/s, so it can easily handle 740 MB/s, and have room to spare. Would we rather have eight network cables coming out of our server, or one?

Remembering redundancy

One of the first things this chapter suggested was to consider extra inventory. What we haven't really covered yet involves online backups. Most server-class motherboards include not just one, but two on-board network modules. Each module commonly provides four Ethernet interfaces.

Usually each interface is considered separate, and two interfaces from each module are connected to two switches in the data center. This allows server administrators to seamlessly perform maintenance on either switch without disrupting our network traffic. Furthermore, if a switch or network module fails, there's always a backup available.

In our working example, we would need eight 1 Gb/s interfaces to avoid experiencing network congestion. However, we've already used four of our eight available interfaces simply to satisfy basic server hosting requirements. That doesn't leave enough available capacity, and as a consequence, this server would experience a network bottleneck.

This would not be the case with a 10 Gb/s interface. Each of the interfaces connected to redundant switches can carry the entire network requirements of the server.

Saving the research

We suggested doing research on 1 Gb/s and 10 Gb/s network cards. Well, don't do too much. It's very likely that the infrastructure department already has a standard server profile for high-bandwidth systems. This is primarily due to the fact that 10 Gb/s is a very complicated standard compared to 1 Gb/s or lower. There are several different cable types available along with complimentary network modules, one or more of which are probably already deployed in the data center.

Just make sure that the infrastructure knows to allocate high-bandwidth resources if our calculations call for it.

PostgreSQL High Availability Cookbook - Second Edition

By : Shaun Thomas

PostgreSQL High Availability Cookbook - Second Edition

By: Shaun Thomas

Overview of this book

Related Content you might be interested in

Current Title:

PostgreSQL High Availability Cookbook - Second Edition

PostgreSQL 11 Administration Cookbook

PostgreSQL 13 Cookbook

PostgreSQL 14 Administration Cookbook

Exploring nimble networking

Getting ready

How to do it...

How it works...

Note

Tip

There's more...

A networking example

Remembering redundancy

Saving the research

See also