Book Image

Learn PostgreSQL - Second Edition

By : Luca Ferrari, Enrico Pirozzi
1 (2)
Book Image

Learn PostgreSQL - Second Edition

1 (2)
By: Luca Ferrari, Enrico Pirozzi

Overview of this book

The latest edition of this PostgreSQL book will help you to start using PostgreSQL from absolute scratch, helping you to quickly understand the internal workings of the database. With a structured approach and practical examples, go on a journey that covers the basics, from SQL statements and how to run server-side programs, to configuring, managing, securing, and optimizing database performance. This new edition will not only help you get to grips with all the recent changes within the PostgreSQL ecosystem but will also dig deeper into concepts like partitioning and replication with a fresh set of examples. The book is also equipped with Docker images for each chapter which makes the learning experience faster and easier. Starting with the absolute basics of databases, the book sails through to advanced concepts like window functions, logging, auditing, extending the database, configuration, partitioning, and replication. It will also help you seamlessly migrate your existing database system to PostgreSQL and contains a dedicated chapter on disaster recovery. Each chapter ends with practice questions to test your learning at regular intervals. By the end of this book, you will be able to install, configure, manage, and develop applications against a PostgreSQL database.
Table of Contents (22 chapters)
20
Other Books You May Enjoy
21
Index

What this book covers

Chapter 1, Introduction to PostgreSQL, explains what the PostgreSQL database is, the community and development behind this great and robust enterprise-level relational database, and how to get help and recognize different PostgreSQL versions and dependencies. You will also learn how to get and install PostgreSQL either through binary packages or by compiling it from sources. You will see how to manage the cluster with your operating system tools (systemd and rc scripts).

Chapter 2, Getting to Know Your Cluster, shows you the anatomy of a PostgreSQL cluster by specifying what is on the file system, where the main configuration files are, and how they are used. The psql command-line utility is described in order to help you connect to the database cluster and interact with it.

Chapter 3, Managing Users and Connections, provides a complete description of how users and connections are managed by a running instance and how you can prevent or limit users from connecting. The concept of the “role” is described, and you will learn how to create single-user accounts, as well as groups of related users.

Chapter 4, Basic Statements, shows how to create and destroy main database objects, such as databases, tables, and schemas. The chapter also covers basic statements, such as SELECT, INSERT, UPDATE, and DELETE. This chapter shows how to manage the public schema on PostgreSQL 16.

Chapter 5, Advanced Statements, introduces the advanced statements PostgreSQL provides, such as common table expressions, MERGE, UPSERTs, and queries with RETURNING rows. This chapter will provide practical examples of when and how to use them.

Chapter 6, Window Functions, introduces a powerful set of functions that provide aggregation without having to collapse the result in a single row. In other words, thanks to window functions, you can perform aggregation on multiple rows (windows) and still present all the tuples in the output. Window functions allow the implementation of business intelligence and make reporting easy.

Chapter 7, Server-Side Programming, tackles the fact that while SQL is fine for doing most day-to-day work with a database, you could end up with a particular problem that requires an imperative approach. This chapter shows you how to implement your own code within the database, how to write functions and procedures in different languages, and how to make them interact with transaction boundaries.

Chapter 8, Triggers and Rules, presents both triggers and rules with practical examples, showing advantages and drawbacks. The chapter ends with examples about event triggers.

Chapter 9, Partitioning, explores partitioning – splitting a table into smaller pieces. PostgreSQL has supported partitioning for a long time, but with version 10 it introduced so-called “declarative partitioning.” This chapter focuses on all the features related to declarative partitioning, its tuning parameters, and how to make a table partitioning using different tablespaces.

Chapter 10, Users, Roles, and Database Security, first looks at user management: roles, groups, and passwords. You will learn how to constrain users to access only particular databases and from particular machines, as well as how to manage permissions associated to users and database objects. You then will see how row-level security can harden your table contents and prevent users from retrieving or modifying tuples that do not belong to them.

Chapter 11, Transactions, MVCC, WALs, and Checkpoints, presents fundamental concepts in PostgreSQL: the Write-Ahead Log (WAL) and the machinery that allows the database to run concurrent transactions and consolidate data in storage. The chapter also presents the concept of transaction isolation, ACID rules, and how the database can implement them. Then you will discover how the WAL can speed up database work and, at the very same time, protect it against crashes. You will understand what MVCC is and why it is important. Lastly, the chapter provides insight into checkpoints and related tunables.

Chapter 12, Extending the Database – the Extension Ecosystem, introduces a handy way to plug new functionalities into your cluster by using so-called “extensions.” This chapter will show you what an extension is; how to search for, get, and install a third-party extension; and how to develop your own.

Chapter 13, Query Tuning, Indexes, and Performance Optimization, addresses an important topic for any database administrator: performance. Indexes are fast ways to help the database access the most commonly used data, but they cannot be built on top of everything because of their maintenance costs. The chapter presents the available index types, and then it explains how to recognize tables and queries that could benefit from indexes and how to deploy them. Thanks to tools such as explain and autoexplain, you will keep your queries under control.

Chapter 14, Logging and Auditing, tackles questions such as “What is happening in the database cluster?” and “What happened yesterday?” Having a good logging and auditing ruleset is a key point in the administration of a database cluster. The chapter presents you with the main options for logging, how to inspect logs with external utilities such as pgBadger, and how to audit your cluster (in a way that can help you make it compliant with data regulamentation policies, e.g., GDPR).

Chapter 15, Backup and Restore, explains why having a backup is important, how to take one for all or part of you cluster, and how to restore from a valid backup. The chapter presents the basic and most common ways to back up a single database or a whole cluster, as well as how to do archiving and point-in-time recovery.

Chapter 16, Configuration and Monitoring, presents the cluster configuration options and the PostgreSQL catalogs used to inspect the system from the inside. Different ways to tune the configuration will be presented. Thanks to special extensions, such as pg_stat_activity, you will be able to monitor in real time what your users are doing against the database.

Chapter 17, Physical Replication, covers built-in replication, a mechanism that allows you to keep several instances up and in sync with a single master node, which PostgreSQL has supported since version 9. Replication allows scalability and redundancy, as well as many other scenarios such as testing and comparing databases. This chapter presents so-called “physical replication,” a way to fully replicate a whole cluster over another instance that will continuously follow its leader. Both asynchronous and synchronous replication, as well as replication slots and delayed replication, will be presented.

Chapter 18, Logical Replication, covers logical replication, which allows very fine-grained replication specifying which tables have to be replicated and which don’t – supported by PostgreSQL since version 10. This, of course, allows a very new and rich scenario of data sharing across different database instances. The chapter presents how logical replication works, how to set it up, and how to monitor the replication.

Chapter 19, Useful Tools and Useful Extensions, should be considered as an appendix to the book. In this chapter, we will talk about some tools and extensions that allow a database administrator to maximize work done while minimizing effort.