PostgreSQL 10 High Performance - Third Edition

By: Enrico Pirozzi

Overview of this book

PostgreSQL database servers have a common set of problems that they encounter as their usage gets heavier and requirements get more demanding. Peek into the future of your PostgreSQL 10 database's problems today. Know the warning signs to look for and how to avoid the most common issues before they even happen. Surprisingly, most PostgreSQL database applications evolve in the same way: choose the right hardware, tune the operating system and server memory use, optimize queries against the database and CPUs with the right indexes, and monitor every layer, from hardware to queries, using tools from inside and outside PostgreSQL. Using the insight gained from monitoring, they then continuously rework their design and configuration. On reaching the limits of a single server, they break things up; connection pooling, caching, partitioning, replication, and parallel queries can all help handle increasing database workloads. By the end of this book, you will have all the knowledge you need to design, run, and manage your PostgreSQL solution while ensuring high performance and high availability.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

PostgreSQL Versions

Performance of historical PostgreSQL releases

Choosing a version to deploy

Upgrading to a newer major version

Upgrades to PostgreSQL 8.3+ from earlier ones

PostgreSQL or another database?

PostgreSQL tools

PostgreSQL contrib

Finding contrib modules on your system

Installing a contrib module from source

Using a contrib module

pgFoundry

PGXN

Additional PostgreSQL-related software

PostgreSQL application scaling life cycle

Performance tuning as a practice

Summary

Database Hardware

Balancing hardware spending

Reliable controller and disk setup

Summary

Database Hardware Benchmarking

CPU and memory benchmarking

Physical disk performance

Disk benchmarking tools

Sample disk results

Summary

Disk Setup

Maximum filesystem sizes

Filesystem crash recovery

Linux filesystems

Solaris and FreeBSD filesystems

Disk layout for PostgreSQL

Summary

Memory for Database Caching

Memory units in postgresql.conf

Increasing Unix shared memory parameters for larger buffer sizes

Crash recovery and the buffer cache

Database buffer cache versus operating system cache

Analyzing buffer cache contents

Summary

Server Configuration Tuning

Interacting with the live configuration

Summary

Routine Maintenance

Transaction visibility with multiversion concurrency control

Vacuum

Index bloat

Detailed data and index page monitoring

Monitoring query logs

Summary

Database Benchmarking

pgbench default tests

Graphing results with pgbench-tools

Sample pgbench test results

Sources of bad results and variation

pgbench custom tests

Transaction Processing Performance Council benchmarks

Summary

Database Indexing

Indexing example walkthrough

Index creation and maintenance

Index types

Advanced index use

Summary

Query Optimization

Sample data sets

EXPLAIN basics

Query plan node structure

Explain analysis tools

Assembling row sets

Processing nodes

Joins

Statistics

Other query-planning parameters

Executing other statement types

Improving queries

SQL limitations

Summary

Database Activity and Statistics

Statistics views

Cumulative and live views

Table statistics

Index statistics

Database-wide totals

Connections and activity

Locks

Disk usage

Buffer, background writer, and checkpoint activity

Summary

Monitoring and Trending

UNIX monitoring tools

Windows monitoring tools

Trending software

Summary

Pooling and Caching

Connection pooling

Summary

Scaling with Replication

Hot Standby

Replication queue managers

Synchronous replication

Logical replication

Special application requirements

Other interesting replication projects

Replication solution comparison

Summary

Partitioning Data

Table range partitioning

PostgreSQL 10 – declarative partitioning – the built-in partitioning

Horizontal partitioning with PL/Proxy

Summary

Avoiding Common Problems

Bulk loading

Backup

Common performance issues

Foreign data wrapper

The amcheck module

pgAdmin

Performance-related features by version

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Performance tuning as a practice

Work on improving database performance has its own terminology, just like any other field. Here are some terms or phrases that will be used throughout the book:

Bottleneck or limiting factor: Both of these terms will be used to refer to the current limitation that is preventing performance from getting better.
Benchmarking: Running a test to determine how fast a particular operation can run. This is often done to figure out where the bottleneck of a program or system is.
Profiling: Monitoring what parts of a program are using the most resources when running a difficult operation, such as a benchmark. This is typically done to help prove where the bottleneck is, and whether it has been removed as expected after a change. Profiling a database application usually starts with monitoring tools, such as vmstat and iostat. Popular profiling tools at the code level include gprof, OProfile, and DTrace.
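As a rough illustration of what a benchmark boils down to, the following Python sketch times an operation several times and summarizes the results. The names `benchmark`, `summarize`, and `sample_workload` are hypothetical helpers invented for this example, and the workload is a stand-in for whatever operation you actually care about (a query, a bulk load, a disk write).

```python
import statistics
import time


def benchmark(operation, repeats=5):
    """Run `operation` several times and return wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce raw timings to figures that are easy to compare between runs."""
    return {
        "runs": len(timings),
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def sample_workload():
    # Stand-in for the real operation being measured (assumption for this sketch).
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    print(summarize(benchmark(sample_workload)))
```

Repeating the run and reporting the median rather than a single timing is a design choice made here to reduce the influence of one unusually slow or fast run.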

One of the interesting principles of performance tuning work is that, in general, you cannot figure out what bottleneck an application will next run into until you remove the current one. When presented with a system that's not as fast as someone would expect it to be, you'll often see people guessing what the current bottleneck is, or what the next one will be. That's generally a waste of time. You're always better off measuring performance, profiling the parts of the system that are slow, and using that to guess at causes and guide changes.

Let's say what you've looked at suggests that you should significantly increase shared_buffers, the primary tunable for memory used to cache database reads and writes. This normally has some positive impact, but there are potential negative things you could encounter instead. Which category a new application will fall into, that is, whether this change will increase or decrease performance, cannot be predicted from watching the server running with the smaller setting. This falls into the category of chaos theory: even a tiny change in the starting conditions can end up rippling out to a very different end condition, as the server makes millions of decisions and they can be impacted to a small degree by that change. Similarly, if shared_buffers is set too small, there are several other parameters that won't work as expected at all, such as those governing database checkpoints.

Since you can't predict what's going to happen most of the time, the mindset you need to adopt is one of heavy monitoring and change control.

Monitor as much as possible, from application to database server to hardware.

Introduce a small targeted change. Try to quantify what's different and be aware that some changes you have rejected as not positive won't always stay that way forever. Move the bottleneck to somewhere else, and you may discover that some parameter that didn't matter before is now suddenly the next limiting factor.
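One way to picture this monitor, change, and quantify loop is the Python sketch below. It compares throughput samples taken before and after a single change and records the outcome in a change log. The function name `evaluate_change`, the 5% noise threshold, and the sample figures are all assumptions made up for illustration; they are not measurements or recommendations from the book.

```python
import statistics


def evaluate_change(baseline, candidate, noise_threshold=0.05):
    """Compare throughput samples (higher is better) from before and after one change.

    Returns a verdict string and the relative difference of the medians.
    The default 5% threshold is an arbitrary illustration; choose one that
    matches the run-to-run variation you actually observe.
    """
    before = statistics.median(baseline)
    after = statistics.median(candidate)
    relative = (after - before) / before
    if relative > noise_threshold:
        verdict = "improved"
    elif relative < -noise_threshold:
        verdict = "regressed"
    else:
        verdict = "no clear difference"
    return verdict, relative


# Hypothetical transactions-per-second samples, invented for this example.
baseline_tps = [980.0, 1010.0, 1000.0]
candidate_tps = [1190.0, 1210.0, 1200.0]

verdict, relative = evaluate_change(baseline_tps, candidate_tps)

# Keep a record of every change tried, including rejected ones, so they can
# be re-tested after the bottleneck has moved somewhere else.
change_log = [
    {"change": "shared_buffers increased", "verdict": verdict, "relative": relative}
]
print(change_log)
```

Logging rejected changes alongside accepted ones reflects the point made above: a parameter that did not matter before may become the next limiting factor later.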

There's a popular expression on the mailing list devoted to PostgreSQL performance when people speculate about root causes without doing profiling to prove their theories: less talk, more gprof. While gprof may not be the tool of choice for every performance issue, given it's more of a code profiling tool than a general monitoring one, the idea that you measure as much as possible before speculating as to the root causes is always a sound one. You should also measure again to verify that your change did what you expected too.

Another principle that you'll find is a recurring theme in this book is that you must be systematic about investigating performance issues. Do not assume your server is fast because you bought it from a reputable vendor; benchmark the individual components yourself. Don't start your database performance testing with application-level tests; run synthetic database performance tests that you can compare against other people's first. That way, when you run into the inevitable application slowdown, you'll already know your hardware is operating as expected and that the database itself is running well. Once your system goes into production, some of the basic things you might need to do in order to find a performance problem, such as testing hardware speed, become impossible because you can't take the system down.

You'll be in much better shape if every server you deploy is tested with a common methodology, which is exactly what later chapters here lead you through. Just because you're not a hardware guy, it doesn't mean you should skip over the parts here that cover things such as testing your disk performance. You need to perform work like that as often as possible when exposed to new systems; that's the only way to get a basic feel for whether something is operating within the standard range of behavior or whether there's something wrong instead.
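The idea of a "standard range of behavior" can be sketched in Python as a comparison of a new measurement against results collected earlier with the same methodology. The function `within_standard_range`, the three-standard-deviation tolerance, and the disk throughput figures below are hypothetical, chosen only to show the shape of such a check.

```python
import statistics


def within_standard_range(history, measurement, tolerance=3.0):
    """Return True if `measurement` is within `tolerance` sample standard
    deviations of the mean of `history`. Needs at least two history values."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return abs(measurement - mean) <= tolerance * spread


# Hypothetical sequential-read results (MB/s) from servers tested earlier
# with the same benchmark; invented for this example.
previous_servers = [118.0, 122.0, 120.0, 121.0, 119.0]

print(within_standard_range(previous_servers, 121.5))  # close to earlier results
print(within_standard_range(previous_servers, 60.0))   # far below earlier results
```

A check like this only has value if every server was measured the same way, which is why a common test methodology matters more than the particular statistic used.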
