Book Image

Learn PostgreSQL

By : Luca Ferrari, Enrico Pirozzi
Book Image

Learn PostgreSQL

By: Luca Ferrari, Enrico Pirozzi

Overview of this book

PostgreSQL is one of the fastest-growing open source object-relational database management systems (DBMS) in the world. As well as being easy to use, it’s scalable and highly efficient. In this book, you’ll explore PostgreSQL 12 and 13 and learn how to build database solutions using it. Complete with hands-on tutorials, this guide will teach you how to achieve the right database design required for a reliable environment. You'll learn how to install and configure a PostgreSQL server and even manage users and connections. The book then progresses to key concepts of relational databases, before taking you through the Data Definition Language (DDL) and commonly used DDL commands. To build on your skills, you’ll understand how to interact with the live cluster, create database objects, and use tools to connect to the live cluster. You’ll then get to grips with creating tables, building indexes, and designing your database schema. Later, you'll explore the Data Manipulation Language (DML) and server-side programming capabilities of PostgreSQL using PL/pgSQL, before learning how to monitor, test, and troubleshoot your database application to ensure high-performance and reliability. By the end of this book, you'll be well-versed with the Postgres database and be able to set up your own PostgreSQL instance and use it to build robust solutions.
Table of Contents (27 chapters)
1
Section 1: Getting Started
5
Section 2: Interacting with the Database
12
Section 3: Administering the Cluster
20
Section 4: Replication
23
Section 5: The PostegreSQL Ecosystem

What this book covers

Chapter 1, Introduction to PostgreSQL, explains what the PostgreSQL database is, the community and development behind this great and robust relational database, as well as how to get help and recognize different PostgreSQL versions and dependencies. You will also learn how to get and install PostgreSQL through either binary packages or by compiling it from sources. A glance at how to manage the cluster with your operating system tools (systemd and rc scripts) will be taken.

Chapter 2, Getting to Know Your Cluster, shows you the anatomy of a PostgreSQL cluster by specifying what is on the file system, where the main configuration files are, and how they are used. The psql command-line utility will be described in order to make you connect to the database cluster and check it's working.

Chapter 3, Managing Users and Connections, provides a complete description of how users and connections are managed by a running instance and how you can prevent or limit user connections. The architecture and terminology of the database will be detailed. The concept of "role" will be described, and you will learn how to create single-user accounts, as well as groups.

Chapter 4, Basic Statements, shows how to create and destroy main database objects, such as databases, tables, and schemas.
The chapter also takes a glance at basic statements, such as SELECT, INSERT, UPDATE, and DELETE.

Chapter 5, Advanced Statements, introduces the advanced statements PostgreSQL provides, such as common table expressions, UPSERTs, and queries with RETURNING rows. This chapter will provide practical examples of when and how to use them.

Chapter 6, Window Functions, introduces a powerful set of functions that provide aggregation without having to collapse the result in a single row. In other words, thanks to window functions, you can perform aggregation on multiple rows (windows) and still present all the tuples in the output. Window functions allow the implementation of business intelligence and make reporting easy.

Chapter 7, Server-Side Programming, tackles the fact that while SQL is fine for doing most of the day-to-day work with a database, you could end up with a particular problem that requires an imperative approach. This chapter shows you how to implement your own code within the database, how to write functions and procedures in different languages, and how to make them interact with transaction boundaries.
Chapter 8, Triggers and Rules, presents both triggers and rules with practical examples, showing advantages and drawbacks.

Chapter 9, Partitioning, explores partitioning – the capability to split a table into smaller pieces. PostgreSQL has supported partitioning for a long time, but with version 10 it introduced so-called "declarative partitioning." After having a quick lock at old-school inheritance-based partitioning, the chapter focuses on all the features related to declarative partitioning and its tuning parameters.

Chapter 10, Users, Roles, and Database Security, first glances at user management: roles, groups, and passwords.
You will learn how to constrain users to access only particular databases and from particular machines, as well as how to constrain the usage of database objects such as tables. You then will see how row-level security can harden your table contents and prevent users from modifying tuples that do not belong to them.

Chapter 11, Transactions, MVCC, WALs, and Checkpoints, presents a very fundamental concept in PostgreSQL: the Write-Ahead Log. You will learn why such a log is so important, how it deals with transactions, and how you can interact with transactions from a SQL point of view. The chapter also presents you with the concept of transaction isolation, ACID rules, and how the database can implement them. Then you will discover how the WAL can speed up database work and, at the very same time, can protect it against crashes. You will understand what MVCC is and why it is important. Lastly, the chapter provides insight into checkpoints and related tunables.

Chapter 12, Extending the Database, introduces a handy way to plug new functionalities into your cluster – extensions. This chapter will show you what an extension is, how to get and install an extension, and how to search for already available extensions in the PostgreSQL ecosystem.

Chapter 13, Indexes and Performance Optimization, addresses the fact that optimizing for performance is an important task for every database administrator. Indexes are fast ways to let the database access the most commonly used data, but they cannot be built on top of everything because of their maintenance costs. The chapter presents the available index types, then it explains how to recognize tables and queries that could benefit from indexes and how to deploy them. Thanks to tools such as explain and autoexplain, you will keep your queries under control.

Chapter 14, Logging and Auditing, tackles questions such as What is happening in the database cluster? What happened yesterday?
Having a good logging and auditing ruleset is a key point in the administration of a database cluster. The chapter presents you with the main options for logging, how to inspect logs with external utilities such as pgFouine, and how to audit your cluster (in a way that can help you make it compliant with GDPR).

Chapter 15, Backup and Restore, broaches the fact that things can go wrong, and in such cases, you need a good backup to promptly restore in order for your database to always be available. The chapter presents the basic and most common ways to back up a single database or a whole cluster, as well as how to do archiving and point-in-time recovery. External tools such as Barman and pg BackRest will be introduced.

Chapter 16, Configuration and Monitoring, presents the cluster catalog, the way in which PostgreSQL exports its own internal status. It does not matter how finely you tuned your cluster, you need to monitor it to understand and promptly adjust it to incoming needs. Knowing the catalog is fundamental for a database administrator, in order to be able to see what is going on in the live system. Thanks to special extensions, such as pg_stat_activity, you will be able to monitor in real time what your users are doing against the database.

Chapter 17, Physical Replication, covers built-in replication, a mechanism that allows you to keep several instances up and in sync with a single master node, which PostgreSQL has supported since version 9. Replication allows scalability and redundancy, as well as many other scenarios such as testing and comparing databases. This chapter presents so-called "physical replication," a way to fully replicate a whole cluster over another instance that will continuously follow its leader. Both asynchronous and synchronous replication, as well as replication slots, will be presented.

Chapter 18, Logical Replication, covers logical replication, which allows very fine-grained replication specifying which tables have to be replicated and which don't – supported by PostgreSQL since version 10. This, of course, allows a very new and rich scenario of data sharing across different database instances. The chapter presents how logical replication works, how to set it up, and how to monitor the replication.

Chapter 19, Useful Tools and Useful Extensions, is to be considered as an appendix to the book. In this chapter, we will talk about some tools and some extensions that allow the DBA to maximize the work done while minimizing the effort.

Chapter 20, Toward PostgreSQL 13, looks at the latest version of the database – PostgreSQL 13, which at the time of writing is in the beta-2 state. This chapter presents the main changes and highlights the differences between PostgreSQL 12 and version 13, and looks at how to upgrade to the new production-ready version once it is available.