Instant PostgreSQL Backup and Restore How-to

Overview of this book

Backing up a database is something that all system administrators must eventually face. Larger warehouses and critical enterprise data require advanced techniques and tools to handle them, and PostgreSQL provides the basic tools for accomplishing this complex task. "Instant PostgreSQL Backup and Restore How-to" is a practical series of proven recipes that shows you how to preserve critical business data and also teaches you some advanced methods for restoring it. It is a perfect manual for managing your critical PostgreSQL data.

"Instant PostgreSQL Backup and Restore How-to" is your practical guide to the unique features that PostgreSQL offers for creating quick backups and efficient restores. Starting by backing up a simple database and learning how to restore it to working order, we then move on to more complex backup techniques, including creating and restoring binary backups. Covering advanced topics such as warm and hot standby restore and streaming replication, this book gives you the power to make complete backups that guarantee you will always be able to restore your database to full working order!

Getting a basic export (Simple)


We will start with pg_dumpall, the most basic PostgreSQL backup tool. This single command-line utility can export the entire database instance at once. We start with this particular command because it preserves important information such as users, roles, and passwords. Later, we will use it only to obtain this important metadata.

Getting ready

Before we begin backing up our database, we should have a database! Since we have installed both PostgreSQL and the Contrib tools, we should have everything we need to get started. To make things easier, we will export a single environment variable so that all commands run as the postgres user, which owns the database instance in default installations. Exporting this variable lets us act as the postgres user for all future examples. We can then use the createdb utility to create an empty database for our backup experimentation. The pgbench utility will be our source of data, as shown in the following commands, since backing up an empty database is hard to verify upon restore:

$> export PGUSER=postgres
$> createdb sample
$> pgbench -i -s 50 sample

Now we have a database named sample with several tables full of generated data. Since pgbench generates 100,000 account rows per unit of scale, a scale of 50 provides a table with five million rows that we can use to gauge the backup processing time required. We can also verify the restored database by checking for the existence of the generated tables and their content. If this scale is too large, feel free to use a smaller one to follow along.
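
To verify that pgbench generated everything correctly, we can count the rows in the largest table it creates. This is only a quick sanity check, and assumes the default table names used by recent pgbench versions; at a scale of 50, the pgbench_accounts table should contain exactly five million rows:

$> psql -c "SELECT count(*) FROM pgbench_accounts" sample   # expect 5000000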

Tip

The sample database will be the basis for all subsequent data export examples. Unless otherwise specified, each example starts with a fresh copy of this database. Again, you can use the scale suggested here, or choose your own.

How to do it...

Creating a backup this way requires a single command, as follows:

  1. Make sure you have opened a terminal console as a local user on your Linux system, and type the following command:

    $> pg_dumpall -f backup.sql
    

How it works...

The pg_dumpall utility produces what should be a full copy of all database objects, including users, schemas, and data, as a single, very large SQL file. Our example directed the SQL output to a file named backup.sql, but any name is valid, so long as we can remember it later.
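
If we are curious about what the export actually contains, the file is plain, readable SQL, so standard UNIX tools work well for inspection. Assuming the backup.sql name from our example, the first few lines should show session settings and role definitions, long before any table data appears:

$> head -n 20 backup.sql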

There's more...

Though the venerable --help command-line switch always lists the full capabilities available to us, the most important of these deserve further discussion.

Restoring the export

Before we get much further, we should quickly explain how to restore the SQL file we just produced. Our other recipes are more complex and require separate sections, but restoring a pg_dumpall export is very easy. The psql command runs SQL files, and since our export is just a SQL file, we can execute it directly against the database. Try the following:

$> psql -f backup.sql postgres

The -f switch tells psql to execute our backup file against the database postgres, which acts as a placeholder here. The psql command expects a database name, so we have provided a safe default. The backup will still restore properly, creating and filling the sample database, because the export also contains database creation commands, as well as commands that switch the connection target so that all data goes where it should. As we said, this is the easiest backup method PostgreSQL has.
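
We can see these commands for ourselves by searching the export for database creation and connection statements. This is simply an illustrative check against the backup.sql file from before; each \connect line redirects all following SQL into the named database:

$> grep -E '^(CREATE DATABASE|\\connect)' backup.sql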

Exporting global objects

Though the SQL export itself is perfectly valid for a restore, many administrators prefer to use the pg_dumpall export only to obtain the globally stored objects, such as users, roles, and passwords, and rely on other tools for tables and other data. To get this global data alone, the -g switch tells pg_dumpall to export only global objects. Type the following command:

$> pg_dumpall -g -f globals.sql

We will use this command frequently in later recipes to capture global objects on their own.
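
If we want to confirm that the export really is limited to global objects, we can list the role definitions it contains. This is just a sanity check against the globals.sql file produced by the previous command; there should be no CREATE TABLE or COPY statements anywhere in the file:

$> grep '^CREATE ROLE' globals.sql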

Compressed backups

Unfortunately, pg_dumpall cannot compress its output directly; it is a very basic tool. If we have an extremely large database, other UNIX commands will also be necessary. For example, the following command pipes the dump through gzip as it is being produced, greatly reducing the size of the stored backup:

$> pg_dumpall | gzip > backup.sql.gz
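
The gzip utility only uses a single CPU. If the pigz utility, a parallel implementation of gzip, happens to be installed, we can substitute it to spread compression across all available processors. Restoring is then just a matter of streaming the decompressed output back into psql, as in our earlier restore example. Both commands below assume pigz is available on the system:

$> pg_dumpall | pigz > backup.sql.gz   # assumes pigz is installed
$> zcat backup.sql.gz | psql postgres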

Naming backups

Note that in all of our examples thus far, we have named the backup rather poorly. It is a better practice to use the -f switch to provide a name that follows a specific naming scheme. Backup files should always include at least one contextual clue, the date on which the backup was taken, and possibly the time. The following is a better example:

$> pg_dumpall -f production_2013-02-15.sql
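
In a scheduled backup script, nobody wants to type the date by hand. Standard shell command substitution can fill it in automatically; this sketch assumes a POSIX-style shell and reuses the production prefix from the previous example:

$> pg_dumpall -f production_$(date +%Y-%m-%d).sql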