PostgreSQL 9 Administration Cookbook - Second Edition


Randomly sampling data

DBAs may be asked to set up a test server and populate it with test data. Often, that server will be old hardware, possibly with smaller disks. So, the subject of data sampling raises its head.

The purpose of sampling is to reduce the size of the data set and improve the speed of later analysis. Some statisticians are so used to the idea of sampling that they may not even question whether its use is valid or whether it can cause further complications.

How to do it…

In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your database. It would be neat if there were, but there isn't. You'll need to read all of this to understand why:

We first consider using SQL to derive a sample. Random sampling is actually very simple because we can use the random() SQL function within the WHERE clause. Consider the following example:
```
postgres=# SELECT count...
```
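As a minimal sketch of the idea of filtering with random() in the WHERE clause, the following statements build a throwaway table and keep roughly 1 percent of its rows. The table names `sample_demo` and `sample_demo_1pct`, the row count, and the 1 percent threshold are assumptions made purely for illustration; they are not taken from the example above.

```sql
-- Hypothetical demo table (not from the original text): 100,000 integer rows.
CREATE TEMP TABLE sample_demo AS
SELECT g AS id
FROM generate_series(1, 100000) AS g;

-- random() returns a value in the range 0.0 <= x < 1.0 and is evaluated
-- once per row, so each row is kept with a probability of about 1%.
SELECT count(*)
FROM sample_demo
WHERE random() < 0.01;

-- Store a sample in its own table for later analysis.
CREATE TEMP TABLE sample_demo_1pct AS
SELECT *
FROM sample_demo
WHERE random() < 0.01;
```

Because each row is tested independently, the number of rows returned varies from run to run, and the two queries above select different rows. This approach also still reads the whole table, so it reduces the size of the result rather than the cost of the scan.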