Identifying and fixing bloated tables and indexes
PostgreSQL implements Multiversion Concurrency Control (MVCC), which allows users to read data at the same time as writers make changes. This is an important feature for concurrency in database applications, as it can allow the following:
- Better performance because of fewer locks
- Greatly reduced deadlocking
- Simplified application design and management
Bloated tables and indexes are a natural consequence of MVCC design in PostgreSQL. It is caused mainly by updates, as we must retain both the old and new updates for a certain period of time.
Bloating results in increased disk consumption, as well as performance loss—if a table is twice as big as it should be, scanning it takes twice as long. VACUUM
is one of the best ways of removing bloat.
Many users execute VACUUM
far too frequently, while at the same time complaining about the cost of doing so. This recipe is all about understanding when you need to run VACUUM
by estimating the amount of bloat...