PostgreSQL Administration Cookbook, 9.5/9.6 Edition - Third Edition

Overview of this book

PostgreSQL is a powerful open source database management system. Now recognized as the expert's choice for a wide range of applications, it has an enviable reputation for performance and stability. PostgreSQL provides an integrated feature set comprising relational and object-relational database features, text search, Geographic Information Systems (GIS), analytical tools for big data, and JSON/XML document management. Starting with short and simple recipes, you will soon dive into core features, such as configuration, server control, tables, and data. You will tackle a variety of problems a database administrator usually encounters, from creating tables to managing views, from improving performance to securing your database, and from using monitoring tools to using storage engines. Recipes on important topics such as high availability, concurrency, replication, backup and recovery, as well as diagnostics and troubleshooting, also receive special attention. By the end of this book, you will have all the knowledge you need to run, manage, and maintain PostgreSQL efficiently.

Preface

What this book covers

What you need for this book

First Steps

Introduction

Getting PostgreSQL

Connecting to the PostgreSQL server

Enabling access for network/remote users

Using graphical administration tools

Using the psql query and scripting tool

Changing your password securely

Avoiding hardcoding your password

Using a connection service file

Troubleshooting a failed connection

Exploring the Database

Introduction

What version is the server?

What is the server uptime?

Locating the database server files

Locating the database server's message log

Locating the database's system identifier

Listing databases on this database server

How many tables are there in a database?

How much disk space does a database use?

How much disk space does a table use?

Which are my biggest tables?

How many rows are there in a table?

Quickly estimating the number of rows in a table

Listing extensions in this database

Understanding object dependencies

Configuration

Introduction

Reading the fine manual

Planning a new database

Changing parameters in your programs

Finding the current configuration settings

Which parameters are at non-default settings?

Updating the parameter file

Setting parameters for particular groups of users

The basic server configuration checklist

Adding an external module to PostgreSQL

Using an installed module

Managing installed extensions

Server Control

Introduction

Starting the database server manually

Stopping the server safely and quickly

Stopping the server in an emergency

Reloading the server configuration files

Restarting the server quickly

Preventing new connections

Restricting users to only one session each

Pushing users off the system

Deciding on a design for multitenancy

Using multiple schemas

Giving users their own private database

Running multiple servers on one system

Setting up a connection pool

Accessing multiple servers using the same host and port

Tables and Data

Introduction

Choosing good names for database objects

Handling objects with quoted names

Enforcing the same name and definition for columns

Identifying and removing duplicates

Preventing duplicate rows

Finding a unique key for a set of data

Generating test data

Randomly sampling data

Loading data from a spreadsheet

Loading data from flat files

Security

Introduction

The PostgreSQL superuser

Revoking user access to a table

Granting user access to a table

Granting user access to specific columns

Granting user access to specific rows

Creating a new user

Temporarily preventing a user from connecting

Removing a user without dropping their data

Checking whether all users have a secure password

Giving limited superuser powers to specific users

Auditing DDL changes

Auditing data changes

Always knowing which user is logged in

Integrating with LDAP

Connecting using SSL

Using SSL certificates to authenticate

Mapping external usernames to database roles

Encrypting sensitive data

Database Administration

Introduction

Writing a script that either succeeds entirely or fails entirely

Writing a psql script that exits on the first error

Investigating a psql error

Performing actions on many tables

Adding/removing columns on a table

Changing the data type of a column

Changing the definition of a data type

Adding/removing schemas

Moving objects between schemas

Adding/removing tablespaces

Moving objects between tablespaces

Accessing objects in other PostgreSQL databases

Accessing objects in other foreign databases

Updatable views

Using materialized views

Monitoring and Diagnosis

Introduction

Real-time viewing using pgAdmin

Checking whether a user is connected

Checking which queries are running

Checking which queries are active or blocked

Knowing who is blocking a query

Killing a specific session

Detecting an in-doubt prepared transaction

Knowing whether anybody is using a specific table

Knowing when a table was last used

Usage of disk space by temporary data

Understanding why queries slow down

Investigating and reporting a bug

Producing a daily summary of log file errors

Analyzing the real-time performance of your queries

Regular Maintenance

Introduction

Controlling automatic database maintenance

Avoiding auto-freezing and page corruptions

Removing issues that cause bloat

Removing old prepared transactions

Actions for heavy users of temporary tables

Identifying and fixing bloated tables and indexes

Monitoring and tuning vacuum

Maintaining indexes

Adding a constraint without checking existing rows

Finding unused indexes

Carefully removing unwanted indexes

Planning maintenance

Performance and Concurrency

Introduction

Finding slow SQL statements

Collecting regular statistics from pg_stat* views

Finding out what makes SQL slow

Reducing the number of rows returned

Simplifying complex SQL queries

Speeding up queries without rewriting them

Discovering why a query is not using an index

Forcing a query to use an index

Using parallel query

Using optimistic locking

Reporting performance problems

Backup and Recovery

Introduction

Understanding and controlling crash recovery

Planning backups

Hot logical backups of one database

Hot logical backups of all databases

Backups of database object definitions

Standalone hot physical database backup

Hot physical backup and continuous archiving

Recovery of all databases

Recovery to a point in time

Recovery of a dropped/damaged table

Recovery of a dropped/damaged database

Improving performance of backup/recovery

Incremental/differential backup and restore

Hot physical backups with Barman

Recovery with Barman

Replication and Upgrades

Introduction

Replication best practices

Setting up file-based replication - deprecated

Setting up streaming replication

Setting up streaming replication security

Hot Standby and read scalability

Managing streaming replication

Using repmgr

Using replication slots

Monitoring replication

Performance and synchronous replication

Delaying, pausing, and synchronizing replication

Logical replication

Bi-directional replication

Archiving transaction log data

Upgrading - minor releases

Major upgrades in-place

Major upgrades online

Connecting to the PostgreSQL server

How do we access PostgreSQL?

Connecting to the database is most people's first experience of PostgreSQL, so we want to make it a good one. Let's do it now and fix any problems we have along the way. Remember that a connection needs to be made securely, so there may be some hoops for us to jump through to ensure that the data we wish to access is secure.

Before we can execute commands against the database, we need to connect to the database server, giving us a session.

Sessions are designed to be long-lived, so you connect once, perform many requests, and eventually disconnect. There is a small overhead during connection. It may become noticeable if you connect and disconnect repeatedly, so you may wish to investigate the use of connection pools. Connection pools allow pre-connected sessions to be served quickly to you when you wish to reconnect.
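To make the idea of a connection pool concrete, here is a minimal sketch in Python. It is not how any particular pooler is implemented; the class name SimplePool and the connect callable are hypothetical, and connect stands in for whatever function your database driver uses to open a session.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep a fixed number of pre-opened connections ready for reuse."""

    def __init__(self, connect, size):
        # `connect` is a hypothetical zero-argument callable supplied by the
        # caller; it must return a new connection object each time.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost up front

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every pooled connection is in use.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the session back to the pool instead of closing it.
            self._idle.put(conn)
```

Each `with pool.connection() as conn:` block borrows an already-open session and returns it afterwards, so the connection overhead is paid only when the pool is created. In practice you would normally rely on an existing pooler rather than code like this.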

Getting ready

First, catch your database. If you don't know where it is, you'll probably have difficulty accessing it. There may be more than one database, and you'll need to know the right one to access and have the authority to connect to it.

How to do it...

You need to specify the following parameters to connect to PostgreSQL:

Host or host address
Port
Database name
User
Password (or other means of authentication, if any)

To connect, there must be a PostgreSQL server running on the given host and listening on the given port. On that server, the database named by dbname and the user named by user must also exist. The host must explicitly allow connections from your client (this is explained in the next recipe), and you must also pass authentication using the method the server specifies; for example, specifying a password won't work if the server has requested a different form of authentication.
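The first of these conditions, a server listening on the host and port, can be checked without any PostgreSQL client at all. The following is a minimal sketch using only the Python standard library; the function name can_reach is hypothetical. It only tells us that something accepts TCP connections on that port, not that it is PostgreSQL or that authentication will succeed.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the hostname and tries each address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `can_reach("myhost", 5432)` returning False suggests a network, firewall, or server startup problem rather than an authentication problem.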

Almost all PostgreSQL interfaces use the libpq interface library. When using libpq, most of the connection parameter handling is identical, so we can discuss that just once.

If you don't specify the preceding parameters, PostgreSQL looks for values set through environment variables, which are as follows:

PGHOST or PGHOSTADDR
PGPORT (the port defaults to 5432 if this is not set)
PGDATABASE
PGUSER
PGPASSWORD (this is definitely not recommended)

If you specify the first four parameters somehow but not the password, then PostgreSQL looks for a password file, discussed in the Avoiding hardcoding your password recipe.
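The lookup order described above can be sketched as follows. This is an illustration of the precedence only, not libpq's actual code; the function name resolve_params is hypothetical. It assumes libpq's documented fallbacks, in which the user defaults to the operating system username and the database name defaults to the username.

```python
import getpass
import os

# Connection keyword -> environment variable consulted when it is missing.
ENV_NAMES = {
    "host": "PGHOST",
    "hostaddr": "PGHOSTADDR",
    "port": "PGPORT",
    "dbname": "PGDATABASE",
    "user": "PGUSER",
    "password": "PGPASSWORD",
}


def resolve_params(explicit=None, environ=None):
    """Merge explicit settings, environment variables, and defaults."""
    explicit = explicit or {}
    environ = os.environ if environ is None else environ
    params = {}
    for key, env_name in ENV_NAMES.items():
        if explicit.get(key) is not None:
            params[key] = explicit[key]        # explicit value wins
        elif environ.get(env_name):
            params[key] = environ[env_name]    # then the environment
    if "port" not in params:
        params["port"] = "5432"                # default port
    if "user" not in params:
        params["user"] = getpass.getuser()     # operating system username
    if "dbname" not in params:
        params["dbname"] = params["user"]      # same name as the user
    # No password default: if it is still missing, the password file
    # is the next place to look.
    return params
```

Calling `resolve_params({"port": "5433"})` in a shell where PGHOST and PGUSER are set would take the port from the explicit value and the host and user from the environment.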

Some PostgreSQL interfaces use the client-server protocol directly, so the way the defaults are handled may differ. The information we need to supply won't vary significantly, so check the exact syntax for that interface.

Starting from PostgreSQL 9.2, connection details can also be specified using a URI format, as follows:

psql postgresql://myuser:mypasswd@myhost:5432/mydb

This specifies that we will connect to PostgreSQL using the myhost host, 5432 port, mydb database name, myuser user, and mypasswd password.
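To see how the pieces of such a URI map onto the individual parameters, here is a small sketch that splits one apart with Python's standard urllib.parse module. The function name parse_pg_uri is hypothetical, and the sketch handles only the simple single-host form shown above, not query-string options.

```python
from urllib.parse import unquote, urlsplit


def parse_pg_uri(uri):
    """Split a simple postgresql:// URI into connection parameters."""
    parts = urlsplit(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError("not a PostgreSQL URI: %r" % uri)
    return {
        # Percent-encoded characters are decoded for user and password.
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,   # note: urlsplit lowercases the hostname
        "port": parts.port,       # an int, or None if not given
        "dbname": unquote(parts.path.lstrip("/")) or None,
    }


print(parse_pg_uri("postgresql://myuser:mypasswd@myhost:5432/mydb"))
```

Keep in mind that a password placed in a URI on the command line may be visible to other users of the system in the process list, so this form is best kept for testing.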

How it works...

PostgreSQL is a client-server database. The system the server runs on is known as the host. We can access the PostgreSQL server remotely through the network, provided we specify either host, which is a hostname, or hostaddr, which is an IP address. We can specify localhost as the host if we wish to make a TCP/IP connection to the same system. On the same system, it is often better to use a Unix socket connection, which is attempted if the host value begins with a slash (/); the value is then taken to be the name of the directory that contains the socket (the default is /tmp).
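The rule for choosing between a TCP/IP and a Unix socket connection can be sketched in a few lines. The function name describe_target is hypothetical; the socket file name .s.PGSQL.<port> is the name PostgreSQL uses for its socket file inside the socket directory.

```python
import posixpath


def describe_target(host, port=5432):
    """Say which kind of connection a given host value would lead to."""
    if host.startswith("/"):
        # A leading slash means "directory that holds the Unix socket".
        return ("unix", posixpath.join(host, ".s.PGSQL.%d" % port))
    # Anything else is treated as a hostname for a TCP/IP connection.
    return ("tcp", "%s:%d" % (host, port))


print(describe_target("/tmp"))             # ('unix', '/tmp/.s.PGSQL.5432')
print(describe_target("localhost", 5433))  # ('tcp', 'localhost:5433')
```

If a socket connection fails, checking whether the file named here actually exists in that directory is a quick way to tell whether the server is using a different socket directory or port.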

On any system, there can be more than one database server. Each database server listens on exactly one network port, which cannot be shared between servers on the same system. The default port number for PostgreSQL is 5432, which has been registered with the Internet Assigned Numbers Authority (IANA) and is uniquely assigned to PostgreSQL (you can see it listed in the /etc/services file on most *nix servers). IANA (http://www.iana.org) is the organization that coordinates the allocation of available numbers for various Internet protocols. The port number can be used to uniquely identify a specific database server if many exist.
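If you want to confirm the registration on your own system without opening /etc/services by hand, the services database can be queried from Python. This is a small sketch; the function name registered_port is hypothetical, and it assumes the service is listed under the name postgresql, which is the usual entry but may be absent on minimal systems.

```python
import socket


def registered_port(service="postgresql", proto="tcp"):
    """Look up a service name in the system services database."""
    try:
        return socket.getservbyname(service, proto)
    except OSError:
        # The service is not listed on this system.
        return None


print(registered_port())
```

This reports only the registered default; a particular server may well have been configured to listen on a different port.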

A database server is also sometimes known as a database cluster, because the PostgreSQL server allows you to define one or more databases on each server. Each connection request must identify exactly one database identified by its dbname. When you connect, you will be able to see only the database objects created within that database.

A database user is used to identify the connection. By default, there is no limit on the number of connections for a particular user; in a later recipe, we will cover how to restrict that. In the more recent versions of PostgreSQL, users are referred to as login roles, though many clues remind us of the earlier nomenclature, and that still makes sense in many ways. A login role is a role that has been given the LOGIN attribute.

Each connection will typically be authenticated in some way. This is defined at the server level: a client cannot opt out of authentication at connection time if the administrator has configured the server to require it.

Once you've connected, each connection can have one active transaction at a time and one fully active statement at any time.

The server will have a defined limit on the number of connections it can serve, so a connection request can be refused if the server is oversubscribed.

There's more...

If you are already connected to a database server with psql and you want to confirm that you've connected to the right place and in the right way, you can execute some, or all, of the following commands. Here is the command that shows the current_database:

SELECT current_database();

The following command shows the current user:

SELECT current_user;

The next command shows the server's IP address and port for the current connection, unless you are using Unix sockets, in which case both values are NULL:

SELECT inet_server_addr(), inet_server_port();

A user's password is not accessible using general SQL for obvious reasons.

You may also need the following:

SELECT version();

From PostgreSQL version 9.1 onwards, you can also use psql's new meta-command, \conninfo. It displays most of the preceding information in a single line:

postgres=# \conninfo

You are connected to database postgres as user postgres via socket in /var/run/postgresql at port 5432.
