Book Image

PostgreSQL 11 Administration Cookbook

By : Simon Riggs, Gianni Ciolli, Sudheer Kumar Meesala
Book Image

PostgreSQL 11 Administration Cookbook

By: Simon Riggs, Gianni Ciolli, Sudheer Kumar Meesala

Overview of this book

PostgreSQL is a powerful, open source database management system with an enviable reputation for high performance and stability. With many new features in its arsenal, PostgreSQL 11 allows you to scale up your PostgreSQL infrastructure. This book takes a step-by-step, recipe-based approach to effective PostgreSQL administration. The book will introduce you to new features such as logical replication, native table partitioning, additional query parallelism, and much more to help you to understand and control, crash recovery and plan backups. You will learn how to tackle a variety of problems and pain points for any database administrator such as creating tables, managing views, improving performance, and securing your database. As you make steady progress, the book will draw attention to important topics such as monitoring roles, backup, and recovery of your PostgreSQL 11 database to help you understand roles and produce a summary of log files, ensuring high availability, concurrency, and replication. By the end of this book, you will have the necessary knowledge to manage your PostgreSQL 11 database efficiently.
Table of Contents (19 chapters)
Title Page
Copyright and Credits
About Packt

Randomly sampling data

DBAs may be asked to set up a test server and populate it with test data. Often, that server will be old hardware, possibly with smaller disk sizes. So, the subject of data sampling raises its head.

The purpose of sampling is to reduce the size of the dataset and improve the speed of later analysis. Some statisticians are so used to the idea of sampling that they may not even question whether its use is valid or if it can cause further complications.

The SQL standard way to perform sampling is by adding the TABLESAMPLE clause to the SELECT statement. 

How to do it…

In this section, we will take a random sample of a given collection of data (for example, a given table). First, you should realize that there isn't a simple tool to slice off a sample of your database. It would be neat if there were, but there isn't. You'll need to read all of this to understand why:

  1. We first consider using SQL to derive a sample. Random sampling is actually very simple because we can use the...