PostgreSQL: How to check live and dead rows in tables

Updated: January 5, 2024 By: Guest Contributor

Introduction

Understanding the status of rows within PostgreSQL tables is crucial for maintaining the health of your database. This guide examines how to inspect live and dead rows for optimal database performance.

What are Live and Dead Rows?

In PostgreSQL, a row is considered ‘live’ if it’s accessible by queries and ‘dead’ if it has been deleted or made obsolete by an update. PostgreSQL uses Multi-Version Concurrency Control (MVCC), which keeps old versions of a row around even after an update or delete, in case other transactions still need them. These dead rows accumulate over time and cause table and index bloat until VACUUM reclaims their space. Identifying and cleaning them up is therefore necessary for database health.
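To watch MVCC leave a dead row behind, a small experiment along these lines can help. It is only a sketch: mvcc_demo is a throwaway temporary table invented for this illustration, while ctid, xmin, and xmax are system columns that PostgreSQL exposes on every table.

```sql
-- Throwaway temporary table (hypothetical name) for the experiment.
CREATE TEMP TABLE mvcc_demo (id int, val text);
INSERT INTO mvcc_demo VALUES (1, 'a');

-- Physical location (ctid) and creating transaction (xmin) of the row.
SELECT ctid, xmin, xmax, id, val FROM mvcc_demo;

-- An UPDATE writes a new row version instead of overwriting the old one.
UPDATE mvcc_demo SET val = 'b' WHERE id = 1;

-- The visible row now sits at a different ctid; the previous version
-- stays on the page as a dead row until it is cleaned up.
SELECT ctid, xmin, xmax, id, val FROM mvcc_demo;
```

The second SELECT still returns a single row, which is the point: the old version exists on disk but is hidden from ordinary queries.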

Basic: Using pg_stat_user_tables

The pg_stat_user_tables view exposes the statistics PostgreSQL collects about user-defined tables, including estimated counts of live and dead rows. Here’s a simple query to get started:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables;

This query returns the name of each table along with the respective count of live and dead tuples (rows).
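On a database with many tables, it often helps to narrow the output to the worst offenders. One possible variation uses the schemaname column that the same view provides; the limit of 10 is an arbitrary choice.

```sql
-- Ten user tables with the most dead tuples, schema included to
-- tell apart tables that share a name.
SELECT schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```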

Details with pg_stat_all_tables

To include system tables in the statistics, pg_stat_all_tables can be used:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_all_tables;

Remember, when working with system tables, be careful not to make unwarranted changes, as they can affect the stability of the database.
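If only the system catalogs are of interest, the related pg_stat_sys_tables view leaves out user tables. A minimal sketch, restricted here to the pg_catalog schema:

```sql
-- System catalog tables only, most dead tuples first.
SELECT schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_sys_tables
WHERE schemaname = 'pg_catalog'
ORDER BY n_dead_tup DESC;
```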

Advanced: pgstattuple

For more granular insights, PostgreSQL provides a contrib module called pgstattuple. You need to install it on your database server and enable it per database with:

CREATE EXTENSION pgstattuple;

Once installed, you can analyze a specific table like so:

SELECT * FROM pgstattuple('public.your_table');

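This function scans the whole table and returns page- and row-level statistics: the table's physical length; the number, total size, and percentage of live and dead tuples; and the amount and percentage of free space. Because it reads every page, it can be slow on large tables.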
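When only the dead-row figures matter, selecting a few columns keeps the output readable. The same extension also ships pgstattuple_approx (PostgreSQL 9.5 and later), which trades precision for speed. Both queries below are sketches that reuse the placeholder table name from above.

```sql
-- Exact figures; reads every page of the table.
SELECT tuple_count, dead_tuple_count, dead_tuple_percent, free_percent
FROM pgstattuple('public.your_table');

-- Estimated figures; skips pages that the visibility map marks as
-- containing only visible tuples, so it is usually much cheaper.
SELECT approx_tuple_count, dead_tuple_count, dead_tuple_percent, approx_free_percent
FROM pgstattuple_approx('public.your_table');
```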

Tracking Row Variations Over Time

To understand how live and dead rows evolve, you may need temporal metrics:

SELECT now(), relname, n_live_tup, n_dead_tup
FROM pg_stat_all_tables;

By running this periodically and saving the output, you can observe trends and identify issues such as a table whose dead row count keeps growing between runs.
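One way to keep those periodic results is to write them into a table and compare consecutive snapshots. The following is a sketch under assumptions: table_stats_history and its columns are hypothetical names chosen for this example, and it reads pg_stat_user_tables rather than pg_stat_all_tables to leave system tables out.

```sql
-- Hypothetical history table; name and columns are illustrative.
CREATE TABLE IF NOT EXISTS table_stats_history (
    captured_at timestamptz NOT NULL,
    schemaname  name        NOT NULL,
    relname     name        NOT NULL,
    n_live_tup  bigint,
    n_dead_tup  bigint
);

-- Run on a schedule (for example from cron) to record one snapshot.
INSERT INTO table_stats_history
SELECT now(), schemaname, relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables;

-- Change in dead rows between consecutive snapshots of each table.
SELECT captured_at, schemaname, relname, n_dead_tup,
       n_dead_tup - lag(n_dead_tup) OVER (
           PARTITION BY schemaname, relname ORDER BY captured_at
       ) AS dead_delta
FROM table_stats_history
ORDER BY schemaname, relname, captured_at;
```

The first snapshot of each table has a NULL dead_delta because there is nothing earlier to compare it with.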

Fetching Detailed Dead Row Information

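For tables that have an abnormally high number of dead rows, you might want to look at the dead rows themselves. Ordinary queries cannot do this: MVCC hides dead tuples from every SELECT, so filtering on ctid only ever returns live rows. To see the raw tuples stored on a heap page, you can use the pageinspect contrib module, which is restricted to superusers and works one page at a time:

/* IMPORTANT: Run this query during low-activity periods. */

CREATE EXTENSION IF NOT EXISTS pageinspect;

SELECT lp, lp_flags, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('your_table', 0)); -- page 0 of the table

This query lists every item on the given page, visible or not. An lp_flags value of 3 marks a line pointer as dead, and a nonzero t_xmax means some transaction has deleted, updated, or locked that tuple, which makes it a candidate dead row once that transaction has committed and no running transaction can still see the old version.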

Autovacuum to the Rescue

PostgreSQL runs a process called autovacuum that helps to clean up dead tuples and prevent table bloat. To see autovacuum information for tables, use:

SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables;

This shows when each table was last auto-vacuumed and analyzed, aiding in monitoring the autovacuum process.
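By default, autovacuum vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * the table's estimated row count (pg_class.reltuples). The sketch below estimates that trigger point from the server-wide settings. It assumes no per-table storage parameters override them, so treat the result as an approximation; approx_vacuum_trigger is just an alias chosen for this example.

```sql
-- Compare each table's dead tuples with the approximate point at
-- which autovacuum would start working on it.
SELECT s.schemaname,
       s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::bigint
         + current_setting('autovacuum_vacuum_scale_factor')::float8
           * GREATEST(c.reltuples, 0) AS approx_vacuum_trigger  -- reltuples can be -1 before the first ANALYZE
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
```

A table whose n_dead_tup stays well above approx_vacuum_trigger may be one that autovacuum cannot keep up with or is being prevented from cleaning.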

Scripting Regular Health Checks

Creating a maintenance script can help with ongoing monitoring of your tables. Here is an example:

SELECT
    relname AS "table",
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_ratio -- numeric, because ROUND(x, n) does not accept float8
FROM
    pg_stat_user_tables
WHERE
    n_live_tup + n_dead_tup > 0 -- Exclude empty tables
ORDER BY
    dead_ratio DESC;

With this, you can regularly monitor the ratio of dead rows and catch potential issues early.

Manual Vacuuming

Sometimes you may need to manually intervene and run a vacuum. You can do so with:

VACUUM (VERBOSE) your_table;

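This command manually vacuums the specified table, with VERBOSE printing a detailed report of the operation. A plain VACUUM makes the space held by dead rows available for reuse within the table; it does not normally return that space to the operating system.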
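To confirm that a manual vacuum had the intended effect, one option is to refresh the planner statistics in the same command and then re-check the counters. This is a sketch that reuses the placeholder table name; note that VACUUM cannot run inside a transaction block and that the statistics views may take a moment to reflect the change.

```sql
-- Vacuum and refresh planner statistics in one pass.
VACUUM (VERBOSE, ANALYZE) your_table;

-- Dead tuples should have dropped, and last_vacuum should be recent.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum
FROM pg_stat_user_tables
WHERE relname = 'your_table';
```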

Analyze for Query Optimization

It is also beneficial to run the ANALYZE command, which updates the statistics used by the PostgreSQL query planner.

ANALYZE your_table;

By doing so, you ensure that query optimizations are based on the most accurate information.
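To judge which tables are due for an ANALYZE, the statistics view also records when each table was last analyzed and how many rows have changed since then. A minimal sketch:

```sql
-- Tables with the most row changes since their last ANALYZE.
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC;
```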

Conclusion

Monitoring live and dead rows in PostgreSQL is essential for database health. Employing the techniques detailed in this guide will assist in maintaining a trim and efficient database, ensuring top performance of your applications.