PostgreSQL High Availability Cookbook

PostgreSQL High Availability Cookbook - Second Edition

By : Shaun Thomas


Overview of this book

Databases are nothing without the data they store. In the event of a failure - catastrophic or otherwise - immediate recovery is essential. By carefully combining multiple servers, it’s even possible to hide the fact that a failure occurred at all. From hardware selection to software stacks and horizontal scalability, this book will help you build a versatile PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. It all begins with hardware selection for the skeleton of an efficient PostgreSQL database cluster. Then it’s on to preventing downtime as well as troubleshooting some real-life problems that administrators commonly face. Next, we add database monitoring to the stack, using collectd, Nagios, and Graphite. And no stack is complete without replication using multiple internal and external tools, including the newly released pglogical extension. Pacemaker or Raft consensus tools are the final piece to grant the cluster the ability to heal itself. We even round off by tackling the complex problem of data scalability. This book exploits many new features introduced in PostgreSQL 9.6 to make the database more efficient and adaptive, and most importantly, keep it running.

Making the most of memory

The primary focus when selecting memory for a highly available system is stability. It's no accident that most, if not all, server-class RAM is of the error-correcting variety. There are a few other things to consider that may not be obvious at first glance.

Due to the multi-core nature of our CPUs, the amount of addressable memory may depend on the core count. In addition, speed, latency, and parity are all considerations. We must also consider the number of memory channels reported by each CPU; failing to match this with an equal count of memory sticks leaves channels unpopulated and can drastically reduce performance.

Let's make our server fast and stable by considering our memory options.

Getting ready

Some of the decisions we will make depend on the capabilities of the CPU. Make sure to read through the Picking a processor recipe before continuing. If we have a PostgreSQL database available, there's also a query that can prepare us for selecting the most advantageous count of memory modules. It's also a very good idea to complete the Sizing storage recipe to get a better idea for choosing an amount of memory.

How to do it...

We can collect some of the information we want from PostgreSQL if we have an install already. Follow these steps if there's an existing database install that we can use:

Execute the following query to obtain the size of all databases in the instance:

        SELECT pg_size_pretty(sum(pg_database_size(oid))::BIGINT) 
          FROM pg_database;

Multiply the result by eight.
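
As a minimal sketch, both steps can be folded into a single query. The column aliases current_size and projected_size are our own names, not part of the recipe:

```sql
-- Current size of every database in the instance, alongside the
-- eight-fold figure used as a rough three-year growth projection.
SELECT pg_size_pretty(sum(pg_database_size(oid))::BIGINT) AS current_size,
       pg_size_pretty((sum(pg_database_size(oid)) * 8)::BIGINT) AS projected_size
  FROM pg_database;
```

The second column is simply the first multiplied by eight before formatting, so it saves converting the pretty-printed value back into a number by hand.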

If we don't have an existing database, we should use an estimate of the database's size after three years. Refer to the Sizing storage recipe to obtain this estimate. Whichever way we obtained a three-year size, perform the following steps:

Divide the three-year database storage size by ten to obtain the minimum amount of memory.
Multiply our ideal CPU chip count by four to get the memory module count.
Divide the minimum memory amount by the module count to get the minimum module size.
Round up to the nearest available memory module size.
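
The arithmetic in these steps can be sketched as one query. The 200 GB starting size, the two CPU chips, and the list of purchasable module sizes are hypothetical values chosen for illustration; substitute real figures before relying on the result:

```sql
-- Hypothetical inputs: a 200 GB database today and two CPU chips.
WITH inputs AS (
    SELECT 200::NUMERIC AS current_gb,  -- assumed current database size in GB
           2 AS cpu_chips               -- assumed ideal CPU chip count
),
sizing AS (
    SELECT current_gb * 8 AS projected_gb,        -- three-year size estimate
           current_gb * 8 / 10 AS min_memory_gb,  -- one tenth of that estimate
           cpu_chips * 4 AS module_count          -- four modules per CPU chip
      FROM inputs
)
SELECT projected_gb,
       min_memory_gb,
       module_count,
       min_memory_gb / module_count AS min_module_gb,
       -- Smallest module size from the assumed list that is not
       -- below the minimum, which is the "round up" step.
       (SELECT min(size_gb)
          FROM (VALUES (4), (8), (16), (32), (64)) AS m (size_gb)
         WHERE size_gb >= min_memory_gb / module_count) AS module_gb
  FROM sizing;
```

With these assumed inputs, the projected size is 1600 GB, the minimum memory is 160 GB, and eight modules need at least 20 GB each, which rounds up to 32 GB modules.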

How it works...

The important part of this recipe is starting with a viable estimate of the database size. Since a lack of RAM won't cause the database to crash or operate improperly, we can use looser guidelines to obtain this number. For an existing install, we simply assume that the database could be eight times larger than its current size three years down the road, which is what doubling every year would produce.

Why do we then divide that number by ten? Our goal here is to maximize the benefit of the OS-level cache, which will consume a majority of our RAM. This estimate gives us a value that is ten times smaller than the space our database consumes. At this scale, data that is frequently fetched from disk is likely to be served from memory instead. The alternative is read latency due to insufficient memory for disk caching.

Many server-class CPUs are quad-channel, and thus operate best when the number of modules per processor is a multiple of four. Since we should have determined how many CPU chips would be ideal for our system in the Picking a processor recipe, we automatically know the most efficient memory module count. Why do we multiply by four, regardless of how many memory channels the CPU has? Adding more memory modules is not wasted on chips with fewer channels, and provides a possible upgrade path.

Dividing the memory amount by the module count gives our minimum module size. RAM is sold in a limited set of capacities, and our calculation is not likely to match any of them exactly, so we need to round up. Why not round down? The operating system will utilize all available RAM to cache and buffer important data. Unless the greater amount is extremely expensive in comparison, any excess memory will not be wasted.

There's more...

We didn't focus on memory speed, timings, or latency here. Timing and latency can affect performance, but our primary focus is stability. We're always free to order faster or better memory as our budget allows.

Memory speed, on the other hand, is a more visible factor. Every memory speed works with a multiplier to match the highest compatible motherboard bus speed. This directly controls how quickly the CPU can utilize available RAM. Before buying memory, research the stated clock speed and try to match it with one of the faster settings compatible with both the CPU and motherboard.

For example, DDR3-1600 has twice the peak transfer rate of DDR3-800, since its memory clock runs at 200 MHz as opposed to 100 MHz. Database benchmarks can differ noticeably between these two memory speeds, even with the same CPU. Fast memory means PostgreSQL can make more immediate use of cached data, and produce results more quickly.
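
To make the comparison concrete, the following sketch derives the theoretical peak figures from the memory clock. It assumes standard DDR3 behavior (eight transfers per memory clock cycle) and a 64-bit module moving eight bytes per transfer; the column names are our own, and real-world throughput will be lower than these peaks:

```sql
-- Theoretical peak rates for a single DDR3 module.
SELECT module_type,
       memory_clock_mhz,
       memory_clock_mhz * 8 AS transfers_mt_s,  -- 8 transfers per memory clock
       memory_clock_mhz * 8 * 8 AS peak_mb_s    -- 8 bytes per transfer
  FROM (VALUES ('DDR3-800', 100),
               ('DDR3-1600', 200)) AS ddr3 (module_type, memory_clock_mhz);
```

The 200 MHz module comes out at 1600 MT/s and 12800 MB/s, exactly double the 800 MT/s and 6400 MB/s of the 100 MHz module, which is where the names DDR3-1600 and DDR3-800 come from.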
