PostgreSQL 16 Administration Cookbook

By: Gianni Ciolli, Boriss Mejías, Jimmy Angelakos, Vibhor Kumar, Simon Riggs

Overview of this book

PostgreSQL has seen a huge increase in its user base in the past few years and is becoming one of the go-to solutions for anyone who has a database-specific challenge. This book covers the fundamentals of database administration in a problem-solution format and is intended to be a desk reference guide. This new edition focuses on recipes based on the PostgreSQL 16 release. The additions include handling complex batch loading scenarios with the SQL MERGE statement, security improvements, running Postgres on Kubernetes or with TPA and Ansible, and more. This edition also covers performance gains, such as query optimization and the acceleration of specific operations, such as sorting. It will help you understand roles, high availability, concurrency, and replication, and it also covers validating backups, recovery, monitoring, and scaling. This book aims to be a one-stop solution to your real-world database administration challenges. By the end of this book, you will be able to manage, monitor, and replicate your PostgreSQL 16 database for efficient administration and maintenance with the best practices from experts.

Connecting to the PostgreSQL server

How do we access PostgreSQL?

Connecting to the database is the first experience of PostgreSQL for most people, so we want to make it a good one. Let’s do it now and fix any problems we have along the way. Remember that a connection needs to be made secure, so there may be some hoops for us to jump through to ensure that the data we wish to access is secure.

Before we can execute commands against the database, we need to connect to the database server to give us a session.

Sessions are designed to be long-lived, so you connect once, perform many requests, and eventually disconnect. There is a small overhead during the connection. It may become noticeable if you connect and disconnect repeatedly, so you may wish to investigate the use of connection pools. Connection pools allow pre-connected sessions to be quickly served to you when you wish to reconnect. We will discuss them in Chapter 4, Server Control.

Getting ready

First, locate your database. If you don’t know where it is, you’ll probably have difficulty accessing it. There may be more than one database, and you’ll need to know the right one to access and have the authority to connect to it.

You need to specify the following parameters to connect to PostgreSQL:

A host or host address
A port
A database name
A user
A password (or other means of authentication; but only if requested)

To connect, there must be a PostgreSQL server running on that host and listening to the port with that number. On that server, a database and a user with the specified names must also exist. Furthermore, the host must explicitly allow connections from your client (as explained in the Enabling access for network/remote users recipe), and you must also pass the authentication step using the method the server specifies – for example, specifying a password won’t work if the server has requested a different form of authentication. Note that you might not need to provide a password at all if PostgreSQL can recognize that your user is already authenticated by the OS; this is called peer authentication. After showing an example in this recipe, we will discuss it fully in the next recipe: Enabling access for network/remote users (despite not being a network/remote connection method).

Almost all PostgreSQL interfaces use the libpq interface library. When using libpq, most of the connection parameter handling is identical, so we can discuss that just once.

If you don’t specify the preceding parameters, PostgreSQL looks for values set through environment variables, which are as follows:

PGHOST or PGHOSTADDR
PGPORT (set this to 5432 if it is not set already)
PGDATABASE
PGUSER
PGPASSWORD (this is definitely not recommended by us, nor by the PostgreSQL documentation, even if it still exists)

If you somehow specify the first four parameters but not the password, PostgreSQL looks for a password file, as discussed in the Avoiding hardcoding your password recipe.
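The fallback order described above can be sketched in a few lines of shell. This is only an illustration of the documented defaults, assuming a Unix-like system; the function name effective_pg_settings is hypothetical, it ignores PGHOSTADDR and the password, and it never contacts a server:

```shell
#!/bin/sh
# Print the settings a libpq client such as psql would fall back to when
# nothing is given on the command line (illustrative sketch only).
effective_pg_settings() {
    os_user=$(id -un)
    default_host='(local Unix socket)'
    pg_user=${PGUSER:-$os_user}        # default user: the OS user name
    pg_db=${PGDATABASE:-$pg_user}      # default database: same name as the user
    pg_port=${PGPORT:-5432}            # default port
    pg_host=${PGHOST:-$default_host}   # no host given: Unix socket connection
    printf 'host=%s port=%s dbname=%s user=%s\n' \
        "$pg_host" "$pg_port" "$pg_db" "$pg_user"
}

effective_pg_settings
```

Running it before and after exporting, say, PGPORT or PGDATABASE shows which values a plain psql invocation would pick up from the environment.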

Some PostgreSQL interfaces use the client-server protocol directly, so the ways in which the defaults are handled may differ. The information we need to supply won’t vary significantly, so check the exact syntax for that interface.

Connection details can also be specified using a connection string, as in this example:

psql "user=myuser host=myhost port=5432 dbname=mydb password=mypasswd"

or alternatively using a Uniform Resource Identifier (URI) format, as follows:

psql postgresql://myuser:mypasswd@myhost:5432/mydb

Both examples specify that we will connect the psql client application to the PostgreSQL server at the myhost host, on port 5432, with the database name mydb, user myuser and password mypasswd.

Note

If you do not specify mypasswd in the preceding URI, you may be prompted to enter the password.
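As a small sketch, the two forms can be built from the same four values while leaving the password out, so that psql prompts for it or finds it elsewhere. The helper names pg_conninfo and pg_uri are hypothetical, and the sketch assumes values without spaces or special characters (which would need quoting in the keyword form and percent-encoding in the URI form):

```shell
#!/bin/sh
# Arguments for both helpers: user host port dbname

# Keyword/value connection string
pg_conninfo() {
    printf 'user=%s host=%s port=%s dbname=%s\n' "$1" "$2" "$3" "$4"
}

# Equivalent URI form
pg_uri() {
    printf 'postgresql://%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

pg_conninfo myuser myhost 5432 mydb
pg_uri myuser myhost 5432 mydb
```

Either result can be passed to psql as a single quoted argument, for example psql "$(pg_uri myuser myhost 5432 mydb)".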

How to do it...

In this example, Afroditi is a database administrator who needs to connect to PostgreSQL to perform some maintenance activities. She can SSH to the database server using her own username afroditi, and DBAs are given sudo privileges to become the postgres user, so she can simply launch psql as the postgres user:

afroditi@dbserver1:~$ sudo -u postgres psql
psql (16.0 (Debian 16.0-1.pgdg120+1))
Type "help" for help.
postgres=#

Note that psql was launched as the postgres user, so it used the postgres user for the database connection, and that psql on Linux attempts a Unix socket connection by default. Hence, this matches peer authentication.
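The idea behind this match can be sketched in shell. It is a simplification that assumes no user name map is configured on the server, and the function name peer_would_match is hypothetical; the real check is performed by the server, not by the client:

```shell
#!/bin/sh
# Peer authentication (without a user name map) can only succeed when the
# operating system user running the client has the same name as the
# requested database user. This compares the two names locally.
peer_would_match() {
    requested_user=$1
    os_user=$(id -un)
    [ "$os_user" = "$requested_user" ]
}

if peer_would_match postgres; then
    echo "OS user matches database user postgres"
else
    echo "OS user differs from postgres; run the client as postgres, e.g. via sudo -u postgres"
fi
```

This is why Afroditi runs psql through sudo -u postgres rather than under her own afroditi account.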

How it works…

PostgreSQL is a client-server database. The system it runs on is known as the host. We can access the PostgreSQL server remotely, through the network; to do so, we must specify host, which is a hostname, or hostaddr, which is an IP address. We can specify a host as localhost if we wish to make a TCP/IP connection to the same system. Rather than using TCP/IP to localhost, it is usually better to use a Unix socket connection, which is attempted if the host begins with a slash (/), in which case the name is presumed to be a directory name (the compiled-in default is /tmp, although packaged builds may use another directory, such as /var/run/postgresql on Debian).
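The rule for choosing between a Unix socket and TCP/IP can be sketched as a small shell function. The name connection_type is hypothetical, and the sketch assumes a Unix-like client where an empty host means a socket in the default directory:

```shell
#!/bin/sh
# Classify a host value the way the text describes: a value starting with
# a slash is treated as a socket directory, anything else as a network host.
connection_type() {
    case $1 in
        "") echo "unix socket (default directory)" ;;
        /*) echo "unix socket in $1" ;;
        *)  echo "tcp/ip to $1" ;;
    esac
}

connection_type /var/run/postgresql
connection_type localhost
```

In psql terms, -h /var/run/postgresql selects the first case and -h localhost the second.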

On any system, there can be more than one database server. Each database server listens to exactly one well-known network port, which cannot be shared between servers on the same system. The default port number for PostgreSQL is 5432, which has been registered with the Internet Assigned Numbers Authority (IANA) and is uniquely assigned to PostgreSQL (you can see it used in the /etc/services file on most *nix servers). The port number can be used to uniquely identify a specific database server if more than one exists. IANA (http://www.iana.org) is the organization that coordinates the allocation of available numbers for various internet protocols.

A database server is also sometimes known as a database cluster because the PostgreSQL server allows you to define one or more databases on each server. Each connection request must identify exactly one database, identified by its dbname. When you connect, you will only be able to see the database objects created within that database.

A database user is used to identify the connection. By default, there is no limit on the number of connections for a particular user. In the Restricting users to only one session each recipe in Chapter 4, Server Control, we will cover how to restrict that. In more recent versions of PostgreSQL, users are referred to as login roles, although many clues remind us of the earlier nomenclature, and that still makes sense in many ways. A login role is a role that has the LOGIN attribute; to connect, it also needs the CONNECT privilege on the target database.

Each connection will typically be authenticated in some way. This is defined at the server level: client authentication will not be optional at connection time if the administrator has configured the server to require it.

Once you’ve connected, each connection can have one active transaction at a time and one fully active statement at any time.

The server will have a defined limit on the number of connections it can serve, so a connection request can be refused if the server is oversubscribed.

There’s more…

If you are already connected to a database server with psql and you want to confirm that you’ve connected to the right place and in the right way, you can execute some, or all, of the following commands. Here is the command that shows the current_database:

SELECT current_database();

The following command shows the name of the current user:

SELECT current_user;

The next command shows the server IP address and port of the current connection, unless you are using Unix sockets, in which case both values are NULL:

SELECT inet_server_addr(), inet_server_port();

A user’s password is not accessible using general SQL, for obvious reasons.

You may also need the following:

SELECT version();

This is just one of several ways to check the database software version; please refer to the What version is the server? recipe in Chapter 2, Exploring the Database. You can also use the psql meta-command \conninfo, which displays most of the preceding information in a single line:

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" via socket in "/var/run/postgresql" at port "5432".
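For scripted checks, the individual queries in this section can be gathered into a single statement. The following is a minimal sketch; the function name whereami_sql and the column aliases are hypothetical, and it only prints the SQL so that it can be piped into a psql session that is able to connect:

```shell
#!/bin/sh
# Print one SQL statement that returns the checks from this section as a
# single row. Nothing is executed until the output is fed to psql.
whereami_sql() {
    cat <<'SQL'
SELECT current_database() AS db_name,
       current_user       AS role_name,
       inet_server_addr() AS server_addr,
       inet_server_port() AS server_port,
       version()          AS server_version;
SQL
}

whereami_sql
```

A possible use is whereami_sql | psql, which runs the statement with whatever connection settings psql resolves.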
