PostgreSQL 16 Administration Cookbook

By: Gianni Ciolli, Boriss Mejías, Jimmy Angelakos, Vibhor Kumar, Simon Riggs

Overview of this book

PostgreSQL has seen a huge increase in its customer base in the past few years and is becoming one of the go-to solutions for anyone facing a database-specific challenge. This book covers the fundamentals of database administration in a problem-solution format and is intended to serve as a desk reference. This new edition focuses on recipes based on the PostgreSQL 16 release. The additions include handling complex batch loading scenarios with the SQL MERGE statement, security improvements, running Postgres on Kubernetes or with TPA and Ansible, and more. This edition also covers certain performance gains, such as query optimization and the acceleration of specific operations, such as sorting. It will help you understand roles, high availability, concurrency, and replication, and it covers validating backups, recovery, monitoring, and scaling. This book will act as a one-stop solution to your real-world database administration challenges. By the end of this book, you will be able to manage, monitor, and replicate your PostgreSQL 16 database for efficient administration and maintenance, following best practices from experts.

PostgreSQL with Kubernetes

In this recipe, we discuss Kubernetes (K8s for short), the industry’s most prominent solution for automated application deployment, scaling, and management. It is free software, vendor neutral, and maintained by the Cloud Native Computing Foundation (CNCF).

CloudNativePG (CNPG) is the newest and fastest-rising Kubernetes operator for PostgreSQL. In other words, it provides automation around the entire Postgres lifecycle, taking care of deployment, scaling, and the management of database clusters.

In this recipe, we’ll use Minikube, a lightweight and fuss-free Kubernetes distribution for testing software deployment. It’s not suitable for production usage, but whatever we do in Minikube also holds true for any Kubernetes cluster, so you can take what you learn here and apply it to production-ready clusters.

Getting ready

First off, we install Minikube to provide a minimal Kubernetes cluster. Install Docker (or Podman) from your OS’s default package manager, then visit https://minikube.sigs.k8s.io/docs/start/ to find download and installation instructions for your operating system and architecture. For example, if you use Debian, then the installation is as simple as:

curl -LO \
  https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb
sudo dpkg -i minikube_latest_amd64.deb

Next, assuming that your user has permission to use Docker, you can start Minikube with:

minikube start

At this point, you can use the kubectl utility bundled with Minikube, which lets you interact with the Kubernetes cluster (Minikube downloads a matching kubectl version the first time you run it):

minikube kubectl -- get pods -A

The above command is a bit verbose; you can wrap it in a shorter alias:

alias kubectl="minikube kubectl --"

Now everything should be ready; you can verify that by running:

kubectl get nodes
NAME       STATUS   ROLES           AGE   VERSION
minikube   Ready    control-plane   12m   v1.27.4

The node is in the Ready state, which means that you’re ready to start your CloudNativePG journey.

How to do it...

To install the latest version (at the time of writing, v1.21.0) of the CloudNativePG operator into your Kubernetes cluster, run:

kubectl apply -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.21/releases/cnpg-1.21.0.yaml

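We verify the installation with the following command. Right after installation, the deployment may still report 0/1 in the READY column, as shown here; the operator is ready once this becomes 1/1: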

kubectl get deployment -n cnpg-system cnpg-controller-manager
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
cnpg-controller-manager   0/1     1            0           15s
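
The output above was captured only 15 seconds after installation, so the deployment does not yet report an available replica. As a minimal sketch, assuming the deployment name and namespace shown above, you could block until the operator is available with kubectl rollout status. The function name wait_for_operator and the 120-second default timeout are our own illustrative choices, not part of CNPG:

```shell
# Hypothetical helper: block until the CNPG operator deployment has rolled out.
# $1 = optional timeout (default: 120s, an arbitrary choice).
wait_for_operator() {
  kubectl rollout status deployment/cnpg-controller-manager \
    --namespace cnpg-system --timeout="${1:-120s}"
}
```

For example, wait_for_operator 300s waits up to five minutes and returns a non-zero exit status if the deployment is still not available by then.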

Let’s deploy a sample PostgreSQL cluster.

Kubernetes works in a declarative way: you declare what the cluster should look like, and then CNPG (the operator) will perform all the necessary operations that will end up with the cluster in the exact state that you declared.

In practice, we create a YAML file called sample-cluster.yaml with the following content:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: sample-cluster
spec:
  instances: 3
  storage:
    size: 1Gi

And then we apply that file by running:

kubectl apply -f sample-cluster.yaml

We can check what is going on by seeing which Postgres pods are up and running:

kubectl get pods
NAME                            READY   STATUS            RESTARTS   AGE
sample-cluster-1-initdb-74xf7   0/1     PodInitializing   0          30s

Looks like we’re not done yet. Give it a moment, and then you will see:

kubectl get pods
NAME               READY   STATUS    RESTARTS   AGE
sample-cluster-1   1/1     Running   0          2m19s
sample-cluster-2   1/1     Running   0          1m41s
sample-cluster-3   1/1     Running   0          1m12s
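
Rather than polling kubectl get pods by hand, you may prefer to wait for the cluster as a whole. The following sketch assumes that the operator publishes a Ready condition on the Cluster resource and labels each instance pod with cnpg.io/cluster; both reflect our understanding of recent CNPG releases, so check the documentation of your version. The function name and the default timeout are hypothetical:

```shell
# Hypothetical helper: wait until the CNPG Cluster reports Ready, then list
# the pods that belong to it.
# $1 = cluster name (default: sample-cluster), $2 = timeout (default: 300s).
wait_for_cluster() {
  cluster="${1:-sample-cluster}"
  kubectl wait --for=condition=Ready "cluster/${cluster}" \
    --timeout="${2:-300s}" &&
    kubectl get pods --selector "cnpg.io/cluster=${cluster}"
}
```

Calling wait_for_cluster without arguments targets the sample-cluster resource created in this recipe.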

Our Postgres nodes are up! They are now ready to be accessed by applications running inside the Kubernetes cluster by connecting to the following Services created by CNPG:

kubectl get svc
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes          ClusterIP   10.96.0.1       <none>        443/TCP    77m
sample-cluster-r    ClusterIP   10.101.133.29   <none>        5432/TCP   42m
sample-cluster-ro   ClusterIP   10.100.24.250   <none>        5432/TCP   42m
sample-cluster-rw   ClusterIP   10.99.79.108    <none>        5432/TCP   42m

The sample-cluster-rw Service lets you connect to the primary node for read/write operations, sample-cluster-ro to standbys only for read-only operations, and sample-cluster-r to any node (including the primary) for read operations.
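
To try one of these Services from your workstation, you need credentials and a network path into the cluster. As a hedged sketch, we assume that CNPG's default bootstrap created a database named app owned by a user named app, and stored that user's credentials in a Secret called sample-cluster-app; verify this with kubectl get secrets before relying on it. The helper name app_secret_field is hypothetical:

```shell
# Hypothetical helper: print one decoded field of the application Secret.
# $1 = field name (for example: username, password)
# $2 = cluster name (default: sample-cluster)
app_secret_field() {
  kubectl get secret "${2:-sample-cluster}-app" \
    --output "jsonpath={.data.$1}" | base64 -d
}

# Example usage (requires a live cluster and a locally installed psql):
#   kubectl port-forward svc/sample-cluster-rw 5432:5432 &
#   PGPASSWORD="$(app_secret_field password)" psql -h localhost -U app app
```

Port forwarding is convenient for a quick test; applications running inside the cluster would connect to sample-cluster-rw directly instead.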

You can find more sample configurations with more features at https://cloudnative-pg.io/documentation/current/samples/.

How it works...

The operator defines a new Kubernetes resource called Cluster, representing a PostgreSQL cluster made up of a single primary and an optional number of physical replicas that co-exist in the chosen Kubernetes namespace for high availability and offloading of read-only queries.
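
Because the number of replicas is just a field of the Cluster resource, changing it is a declarative operation too. One option is to edit instances in sample-cluster.yaml and re-apply the file; the sketch below achieves the same with kubectl patch. The function name scale_cluster is hypothetical, and the cluster name defaults to the one used in this recipe:

```shell
# Hypothetical helper: set the desired number of instances (primary included).
# $1 = number of instances, $2 = cluster name (default: sample-cluster).
scale_cluster() {
  kubectl patch cluster "${2:-sample-cluster}" --type merge \
    --patch "{\"spec\":{\"instances\":$1}}"
}
```

For instance, scale_cluster 5 asks for one primary and four replicas, and the operator then works toward that state. Keep in mind that re-applying an unchanged sample-cluster.yaml afterwards would set the value back to 3.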

Applications in the Kubernetes cluster can now access the Postgres database through the Services that the operator manages, without worrying about which node is the primary or whether the primary changes due to a failover or switchover. For applications outside the Kubernetes cluster, you need to expose Postgres via TCP by configuring a Service or Ingress object.
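
For a quick test from outside the cluster, one possibility among several is to create a second Service of type NodePort that reuses the selector of sample-cluster-rw. This is a sketch for Minikube only, not a production setup; the Service name sample-cluster-rw-external and the function name expose_primary are hypothetical:

```shell
# Hypothetical helper: publish the read/write Service on a node port.
# $1 = cluster name (default: sample-cluster).
# kubectl expose copies the selector of the existing Service, so the new
# Service is expected to follow the primary in the same way.
expose_primary() {
  kubectl expose service "${1:-sample-cluster}-rw" \
    --name "${1:-sample-cluster}-rw-external" \
    --type NodePort --port 5432 --target-port 5432
}
```

On Minikube, running minikube service sample-cluster-rw-external --url afterwards should print an address whose host and port are reachable from your machine.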

In our cluster, each Postgres instance requested 1 GiB of disk space (the 1Gi in our YAML file) from the default Kubernetes storage class. Be aware that we deployed Postgres with the default configuration, which is conservative and safe for testing on a laptop, but definitely not suitable for production usage.

You can find CNPG’s extensive documentation, which describes all you can do with the operator, including detailed Prometheus monitoring, backup and recovery, upgrades, migration, scaling, etc., and how to configure it for production use, at https://cloudnative-pg.io/documentation/current/.

There’s more...

CloudNativePG is able to react to the failure of a PostgreSQL instance by performing failover and/or creating new replicas, depending on what is needed to restore the desired state, which in our example is one primary node and two physical replicas.
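
You can observe this behavior on the sample cluster by deleting the pod that currently hosts the primary. The sketch below assumes that the Cluster resource reports the current primary in its status.currentPrimary field, which is our understanding of CNPG's status layout; the function names are hypothetical. Only try this on a test cluster:

```shell
# Hypothetical helper: print the name of the instance currently acting as
# primary. $1 = cluster name (default: sample-cluster).
current_primary() {
  kubectl get cluster "${1:-sample-cluster}" \
    --output "jsonpath={.status.currentPrimary}"
}

# Hypothetical helper: simulate a failure by deleting the primary's pod.
# $1 = cluster name (default: sample-cluster).
kill_primary() {
  kubectl delete pod "$(current_primary "$1")"
}
```

After running kill_primary, kubectl get pods --watch lets you follow the recovery, and current_primary should eventually print the name of the instance that took over, which is typically a different one.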

We recommend this method for Kubernetes PostgreSQL deployments because it is not an attempt to shoehorn Postgres into Kubernetes with additional sidecar software to take care of the high availability aspect. It is built from the ground up with Postgres-specific resources, while respecting the cloud-native declarative conventions and using Kubernetes’s built-in facilities and features.

High availability has historically been a complex subject for PostgreSQL, as for other database systems, because the most difficult part is to diagnose failures correctly. The various middleware tools – for which we refer you to Chapter 12, Replication and Upgrades – employ a number of techniques to reduce the risk of doing the wrong thing due to a mistaken diagnosis.

Kubernetes changes the way high availability is achieved because it provides a very reliable interface for detecting node failures. CNPG is called “native” because it follows this approach strictly. As a result, it is becoming very popular in the Kubernetes world, probably also because people who are experienced with Kubernetes recognize this approach as familiar and reliable.

CloudNativePG is the first PostgreSQL-related project to aim for CNCF certification through the Sandbox/Incubation/Graduation process. You can find the CNPG repository at https://github.com/cloudnative-pg/cloudnative-pg.