Book Image

PostgreSQL High Performance Cookbook

By : Chitij Chauhan, Dinesh Kumar
Book Image

PostgreSQL High Performance Cookbook

By: Chitij Chauhan, Dinesh Kumar

Overview of this book

PostgreSQL is one of the most powerful and easy to use database management systems. It has strong support from the community and is being actively developed with a new release every year. PostgreSQL supports the most advanced features included in SQL standards. It also provides NoSQL capabilities and very rich data types and extensions. All of this makes PostgreSQL a very attractive solution in software systems. If you run a database, you want it to perform well and you want to be able to secure it. As the world’s most advanced open source database, PostgreSQL has unique built-in ways to achieve these goals. This book will show you a multitude of ways to enhance your database’s performance and give you insights into measuring and optimizing a PostgreSQL database to achieve better performance. This book is your one-stop guide to elevate your PostgreSQL knowledge to the next level. First, you’ll get familiarized with essential developer/administrator concepts such as load balancing, connection pooling, and distributing connections to multiple nodes. Next, you will explore memory optimization techniques before exploring the security controls offered by PostgreSQL. Then, you will move on to the essential database/server monitoring and replication strategies with PostgreSQL. Finally, you will learn about query processing algorithms.
Table of Contents (19 chapters)
PostgreSQL High Performance Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface

Configuring pgbench


In this recipe, we will be discussing how to configure the pgbench to perform various test cases.

Getting ready

By default, PostgreSQL provides a tool, pgbench, which performs a default test suite based on TPC-B, which simulates the live database load on the servers. Using this tool, we can estimate the tps (transactions per second) capacity of the server, by conducting a dedicated read or read/write test cases. Before performing the pgbench test cases on the server, we need to fine-tune the PostgreSQL parameters and make them ready to fully utilize the server resources. Also, it's good practice to run pgbench from a remote machine, where the network latency is trivial among the nodes.

How to do it...

As aforementioned, pgbench simulates a TPC-B-like workload on the servers, by executing three update statements, followed by SELECT and INSERT statements into different pre-defined pgbench tables, and if we want to use those pre-defined tables, then we would need to initiate pgbench using the -i or --initialize options. Otherwise, we can write a customized SQL script.

To get effective results from pgbench, we need to fine-tune the PostgreSQL server with the following parameters:

Parameter

Description

shared_buffers

This is the amount of memory for database operations

huge_pages

This improves the OS-level memory management

work_mem

This is the amount of memory for each backend for data operations such as sort, join, and so on

autovacuum_work_mem

This is the amount of memory for postgres internal process autovacuum

max_worker_processes

This is the maximum number of worker processes for the database

max_parallel_workers_per_gather

This is the number of worker processes to consider for a gather node type

max_connections

This is the number of database connections

backend_flush_after

Do fsync, after this many bytes have been flushed to disk by each user process

checkpoint_completion_target

This is used to set I/O usage ratio during the checkpoint

archive_mode

This is used to determine whether the database should run in the archive mode

log_lock_waits

This is used to log useful information about concurrent database locks

log_temp_files

This is used to log information about the database's temporary files

random_page_cost

This is used to set random page cost value for the index scans

Note

You can also find other parameters at the following URL, which are also important before conducting any benchmarking on the database server: https://www.postgresql.org/docs/9.6/static/runtime-config.html.

Another good practice to get good performance is to keep the transaction logs (pg_xlog) in another mount point, and also have unique tablespaces for tables and indexes. While performing the pgbench testing with predefined tables, we can specify these unique tablespaces using the --index-tablespace and --tablespace options.

How it works...

As we discussed earlier, pgbench is a TPC-B benchmarking tool for PostgreSQL, which simulates the live transactions load on the database server by collecting the required metrics such as tps, latency, and so on. Using pgbench, we can also increase the database size by choosing the test scale factor while using predefined tables. If you wanted to test multiple concurrent connections to the database and wanted to use the pooling mechanism, then it's good practice to configure the pgbouner/pgpool on the local database node to reuse the connections.

Note

For more features and options with the pgbench tool, visit https://www.postgresql.org/docs/9.6/static/pgbench.html.