Learn PostgreSQL

By: Luca Ferrari, Enrico Pirozzi

Overview of this book

PostgreSQL is one of the fastest-growing open source object-relational database management systems (DBMS) in the world. As well as being easy to use, it's scalable and highly efficient. In this book, you'll explore PostgreSQL 12 and 13 and learn how to build database solutions with them. Complete with hands-on tutorials, this guide will teach you how to achieve the database design that a reliable environment requires. You'll learn how to install and configure a PostgreSQL server and how to manage users and connections. The book then progresses to key concepts of relational databases, before taking you through the Data Definition Language (DDL) and commonly used DDL commands. To build on your skills, you'll learn how to interact with the live cluster, create database objects, and use tools to connect to the live cluster. You'll then get to grips with creating tables, building indexes, and designing your database schema. Later, you'll explore the Data Manipulation Language (DML) and the server-side programming capabilities of PostgreSQL using PL/pgSQL, before learning how to monitor, test, and troubleshoot your database application to ensure high performance and reliability. By the end of this book, you'll be well versed in PostgreSQL and able to set up your own PostgreSQL instance and use it to build robust solutions.
Table of Contents (27 chapters)
Section 1: Getting Started
Section 2: Interacting with the Database
Section 3: Administering the Cluster
Section 4: Replication
Section 5: The PostgreSQL Ecosystem

VACUUM

In the previous sections, you learned how PostgreSQL exploits MVCC to store different versions of the same data (tuples), which different transactions can perceive depending on their active snapshot. However, keeping several versions of the same tuple requires more space than keeping only the latest version, and sooner or later this extra space could fill your storage. To prevent that and reclaim storage space, PostgreSQL provides an internal tool named vacuum, which analyzes the stored tuple versions and removes those that are no longer perceivable.

Remember: a tuple version is no longer perceivable when no active transaction can reference it, that is, when no active transaction has that version within its snapshot.
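The rule above can be illustrated with a small toy model. This is only a sketch, not how PostgreSQL implements vacuum internally: the names `TupleVersion`, `is_dead`, and `toy_vacuum` are hypothetical, transaction ids are plain integers, every transaction that replaced a version is assumed to have committed, and a snapshot is reduced to a single "oldest active transaction id" horizon.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    value: str
    xmin: int                    # id of the transaction that created this version
    xmax: Optional[int] = None   # id of the transaction that replaced it, if any


def is_dead(version: TupleVersion, oldest_active_xid: int) -> bool:
    """Simplified rule: a version is dead once it was replaced by a
    transaction older than every transaction that is still active."""
    return version.xmax is not None and version.xmax < oldest_active_xid


def toy_vacuum(table: List[TupleVersion], oldest_active_xid: int) -> List[TupleVersion]:
    """Keep only the versions that some active transaction could still perceive."""
    return [v for v in table if not is_dead(v, oldest_active_xid)]


table = [
    TupleVersion("price=10", xmin=100, xmax=105),  # replaced by transaction 105
    TupleVersion("price=12", xmin=105, xmax=110),  # replaced by transaction 110
    TupleVersion("price=15", xmin=110),            # current version
]

# The oldest transaction still running is 108, so it may still need "price=12".
survivors = toy_vacuum(table, oldest_active_xid=108)
print([v.value for v in survivors])  # ['price=12', 'price=15']
```

In this model, only the first version can be removed: the second one was replaced after transaction 108 started, so that transaction may still have it within its snapshot.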

Vacuum can be an I/O-intensive, and therefore invasive, operation, since it must reclaim disk space that is no longer used. For that reason, you are not supposed to run vacuum very frequently, and PostgreSQL also provides a background job...