PostgreSQL 16 Administration Cookbook

By : Gianni Ciolli, Boriss Mejías, Jimmy Angelakos, Vibhor Kumar, Simon Riggs

Overview of this book

PostgreSQL has seen a huge increase in its user base in the past few years and is becoming one of the go-to solutions for anyone who has a database-specific challenge. This PostgreSQL book covers the fundamentals of database administration in a problem-solution format and is intended to be a desk reference guide. This new edition focuses on recipes based on the new PostgreSQL 16 release. The additions include handling complex batch loading scenarios with the SQL MERGE statement, security improvements, running Postgres on Kubernetes or with TPA and Ansible, and more. This edition also covers performance gains, such as query optimization and the acceleration of specific operations like sorting. It will help you understand roles, high availability, concurrency, and replication, and it draws your attention to validating backups, recovery, monitoring, and scaling. By the end of this book, you will be able to manage, monitor, and replicate your PostgreSQL 16 database for efficient administration and maintenance with the best practices from experts.

PostgreSQL with Kubernetes

In this recipe, we discuss Kubernetes (K8s for short), the industry’s most prominent solution for automated application deployment, scaling, and management. It is free software, vendor neutral, and maintained by the Cloud Native Computing Foundation (CNCF).

CloudNativePG (CNPG) is the newest and fastest-rising Kubernetes operator for PostgreSQL. In other words, it provides automation around the entire Postgres lifecycle, taking care of deployment, scaling, and the management of database clusters.

In this recipe, we’ll use Minikube, a lightweight and fuss-free Kubernetes distribution for testing software deployment. It’s not suitable for production usage, but whatever we do in Minikube also holds true for any Kubernetes cluster, so you can take what you learn here and apply it to production-ready clusters.

Getting ready

First off, we install Minikube to provide a minimal Kubernetes cluster. Install Docker (or Podman) from your OS’s default package manager, then visit https://minikube.sigs.k8s.io/docs/start/ to find download and installation instructions for your operating system and architecture. For example, if you use Debian, then the installation is as simple as:

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb
sudo dpkg -i minikube_latest_amd64.deb

Next, assuming that your user has permission to use Docker, you can start Minikube with:

minikube start

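At this point, you can use the kubectl utility, which lets you interact with the Kubernetes cluster. Minikube provides its own wrapper, which downloads a matching kubectl binary the first time you run it: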

minikube kubectl -- get pods -A

The above command is a bit verbose; you can wrap it in a shorter alias:

alias kubectl="minikube kubectl --"
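By default, aliases are only expanded in interactive shells, so the shortcut above is not available inside scripts. A minimal sketch of an alternative, assuming Bash or another POSIX-style shell, is a shell function with the same name. The function below only forwards its arguments to minikube kubectl --, and it is our illustration rather than anything shipped with Minikube:

```shell
# Drop any existing alias first; otherwise an interactive Bash session would
# expand the alias while reading the function definition and fail to parse it.
unalias kubectl 2>/dev/null || true

# Function equivalent of the alias. Unlike an alias, it also works in
# non-interactive scripts that source this definition.
kubectl() {
  minikube kubectl -- "$@"
}
```

After this definition is loaded, kubectl get nodes behaves like the aliased command, both at the prompt and in scripts.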

Now everything should be ready; you can verify that by running:

kubectl get nodes
NAME       STATUS   ROLES           AGE   VERSION
minikube   Ready    control-plane   12m   v1.27.4

A node in the Ready status means that you’re ready to start your CloudNativePG journey.

How to do it...

In order to install the latest version (at the time of writing, v1.21.0) of the CloudNativePG operator into your Kubernetes cluster, run:

kubectl apply -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.21/releases/cnpg-1.21.0.yaml

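We verify the installation with the following command. Right after installation, READY may show 0/1 until the operator pod has finished starting: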

kubectl get deployment -n cnpg-system cnpg-controller-manager
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
cnpg-controller-manager   0/1     1            0           15s
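Rather than re-running this check by hand until the Deployment reports 1/1, one option is to block until the rollout completes. The sketch below wraps the standard kubectl rollout status command in a small helper. The name wait_for_operator is hypothetical, chosen for this example, and the 120-second default timeout is an arbitrary assumption:

```shell
# wait_for_operator [TIMEOUT]
# Hypothetical helper, not part of CNPG or kubectl.
# Blocks until the operator Deployment has finished rolling out, and returns
# a non-zero status if that does not happen within TIMEOUT (default 120s).
wait_for_operator() {
  kubectl rollout status deployment/cnpg-controller-manager \
    --namespace cnpg-system --timeout="${1:-120s}"
}
```

Calling wait_for_operator 60s before deploying a cluster avoids applying the Cluster manifest while the operator is still starting.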

Let’s deploy a sample PostgreSQL cluster.

Kubernetes works in a declarative way: you declare what the cluster should look like, and then CNPG (the operator) will perform all the necessary operations that will end up with the cluster in the exact state that you declared.

In practice, we create a YAML file called sample-cluster.yaml with the following content:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: sample-cluster
spec:
  instances: 3
  storage:
    size: 1Gi

And then we apply that file by running:

kubectl apply -f sample-cluster.yaml

We can check what is going on by seeing which Postgres pods are up and running:

kubectl get pods
NAME                            READY   STATUS            RESTARTS   AGE
sample-cluster-1-initdb-74xf7   0/1     PodInitializing   0          30s

Looks like we’re not done yet. Give it a moment, and then you will see:

kubectl get pods
NAME               READY   STATUS    RESTARTS   AGE
sample-cluster-1   1/1     Running   0          2m19s
sample-cluster-2   1/1     Running   0          1m41s
sample-cluster-3   1/1     Running   0          1m12s
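Polling kubectl get pods works, but the same check can be scripted. The following sketch assumes that CNPG labels each instance pod with cnpg.io/cluster set to the cluster name, and that the Cluster resource exposes a Ready condition, as we understand from the CNPG documentation. The names cluster_pods and wait_for_cluster are hypothetical, and the 300-second default is arbitrary:

```shell
# cluster_pods CLUSTER
# Lists only the pods that belong to the given CNPG cluster, assuming the
# cnpg.io/cluster label is set by the operator.
cluster_pods() {
  kubectl get pods --selector "cnpg.io/cluster=$1"
}

# wait_for_cluster CLUSTER [TIMEOUT]
# Blocks until the Cluster resource reports the Ready condition
# (assumed to be published by the operator) or TIMEOUT expires.
wait_for_cluster() {
  kubectl wait --for=condition=Ready "cluster/$1" --timeout="${2:-300s}"
}
```

For the example in this recipe, wait_for_cluster sample-cluster followed by cluster_pods sample-cluster should end with the three instance pods listed as Running.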

Our Postgres nodes are up! They are now ready to be accessed by applications running inside the Kubernetes cluster by connecting to the following Services created by CNPG:

kubectl get svc
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes          ClusterIP   10.96.0.1       <none>        443/TCP    77m
sample-cluster-r    ClusterIP   10.101.133.29   <none>        5432/TCP   42m
sample-cluster-ro   ClusterIP   10.100.24.250   <none>        5432/TCP   42m
sample-cluster-rw   ClusterIP   10.99.79.108    <none>        5432/TCP   42m

The sample-cluster-rw Service lets you connect to the primary node for read/write operations, sample-cluster-ro to standbys only for read-only operations, and sample-cluster-r to any node (including the primary) for read operations.
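Inside the Kubernetes cluster, each Service is reachable through a DNS name derived from its name and namespace. As a rough illustration, the helper below builds that name. The function service_host is hypothetical, and it assumes the default namespace and the default cluster.local cluster domain, either of which may differ in your environment:

```shell
# service_host CLUSTER ROLE [NAMESPACE]
# Hypothetical helper: prints the in-cluster DNS name of one of the Services
# created by CNPG. ROLE is rw, ro, or r; NAMESPACE defaults to "default".
# The svc.cluster.local suffix assumes the default cluster domain.
service_host() {
  printf '%s-%s.%s.svc.cluster.local\n' "$1" "$2" "${3:-default}"
}
```

For example, service_host sample-cluster rw prints sample-cluster-rw.default.svc.cluster.local, which an application pod could use as its PostgreSQL host for read/write connections.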

You can find more sample configurations with more features at https://cloudnative-pg.io/documentation/current/samples/.

How it works...

The operator defines a new Kubernetes resource called Cluster, representing a PostgreSQL cluster made up of a single primary and an optional number of physical replicas that co-exist in the chosen Kubernetes namespace for high availability and offloading of read-only queries.

Applications in the Kubernetes cluster can now access the Postgres database through the Service that the operator manages, without worrying about which node is primary and whether the primary changes due to a failover or switchover. For applications from outside the Kubernetes cluster, you need to expose Postgres via TCP by configuring a Service or Ingress object.
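For quick tests from your workstation, a lighter alternative to a dedicated Service or Ingress is kubectl port-forward. The sketch below assumes what we believe are CNPG's defaults at the time of writing: an application database and user both named app, with credentials stored in a Secret named after the cluster with an -app suffix. Both function names are hypothetical:

```shell
# app_password CLUSTER
# Prints the password of the application user, read from the Secret that
# CNPG is assumed to create as CLUSTER-app (Secret values are base64-encoded).
app_password() {
  kubectl get secret "$1-app" --output jsonpath='{.data.password}' | base64 -d
}

# forward_primary CLUSTER [LOCAL_PORT]
# Forwards a local port (default 5432) to the read/write Service.
# Runs in the foreground until interrupted.
forward_primary() {
  kubectl port-forward "service/$1-rw" "${2:-5432}:5432"
}
```

With forward_primary sample-cluster running in one terminal, PGPASSWORD=$(app_password sample-cluster) psql -h localhost -U app app should open a session on the primary from another. This is meant for testing only, not as a way to expose a production database.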

In our cluster, 1 GiB of disk space (the 1Gi in the manifest) was allocated for each Postgres instance from the default Kubernetes storage class. Be aware that we deployed Postgres with the default configuration, which is conservative and safe for testing on a laptop, but definitely not suitable for production usage.

You can find CNPG’s extensive documentation, which describes all you can do with the operator, including detailed Prometheus monitoring, backup and recovery, upgrades, migration, scaling, etc., and how to configure it for production use, at https://cloudnative-pg.io/documentation/current/.

There’s more...

CloudNativePG is able to react to the failure of a PostgreSQL instance by performing failover and/or creating new replicas, depending on what is needed to restore the desired state, which in our example is one primary node and two physical replicas.
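On a test cluster, one simple way to watch this reconciliation is to change the declared state and let the operator catch up. The sketch below patches spec.instances on the Cluster resource with a standard kubectl merge patch. The name scale_cluster is hypothetical, and editing sample-cluster.yaml and re-applying it would have the same effect:

```shell
# scale_cluster CLUSTER INSTANCES
# Hypothetical helper: declares a new number of instances for a CNPG cluster.
# The operator is then expected to add or remove replicas until the running
# pods match the declaration.
scale_cluster() {
  kubectl patch cluster "$1" --type merge \
    --patch "{\"spec\":{\"instances\":$2}}"
}
```

Running scale_cluster sample-cluster 2 followed by kubectl get pods should show the operator removing one replica, and setting the value back to 3 should bring a new replica up.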

We recommend this method for Kubernetes PostgreSQL deployments because it is not an attempt to shoehorn Postgres into Kubernetes with additional sidecar software to take care of the high availability aspect. It is built from the ground up with Postgres-specific resources, while respecting the cloud-native declarative conventions and using Kubernetes’s built-in facilities and features.

High availability has historically been a complex subject for PostgreSQL, as for other database systems, because the most difficult part is to diagnose failures correctly. The various middleware tools – for which we refer you to Chapter 12, Replication and Upgrades – employ a number of techniques to reduce the risk of doing the wrong thing due to a mistaken diagnosis.

Kubernetes changes the way high availability is achieved because it provides a very reliable interface for detecting node failures. CNPG is called “native” because it follows this approach strictly. As a result, it is becoming very popular in the Kubernetes world, probably also because people who are experienced with Kubernetes recognize this approach as familiar and reliable.

CloudNativePG is the first PostgreSQL-related project to aim for CNCF certification through the Sandbox/Incubation/Graduation process. You can find the CNPG repository at https://github.com/cloudnative-pg/cloudnative-pg.
