I have a big table (let’s call it big_table) in a PostgreSQL 12.7 database running on a Debian GNU/Linux server. The machine has 8 GB of RAM and 4 CPU cores, and is mostly dedicated to this PostgreSQL server.
big_table has about 103 million rows that hold time series data (i.e. data is not updated, only inserted or deleted). Almost every month, I delete about 25 to 30 million rows from this table. The deleted rows correspond to a consecutive time range (e.g. something like "DELETE … between 1-Mar and 30-Mar"). Those DELETE operations take anywhere from 10 to 30 minutes to complete. This usage pattern has been in effect for more than 6 months.
The table size is approximately 46 GB, and the indexes for that table add up to about 61 GB.
There is an application that inserts tens of rows every few seconds into big_table, and another user-facing web application that reads from this table to plot some recent values.
Every time I run the DELETE operation that removes 25 to 30 million rows (at the end of the month), I see that AUTOVACUUM kicks in and starts to work.
I wanted to check some statistics about those AUTOVACUUM runs, so I ran the following SQL query in that database:
SELECT relname, last_vacuum, vacuum_count, autovacuum_count, last_autovacuum, autoanalyze_count, last_autoanalyze FROM pg_stat_user_tables WHERE relname = 'big_table' ;
To my surprise, it gave the following result:
| relname   | last_vacuum | vacuum_count | autovacuum_count | last_autovacuum | autoanalyze_count | last_autoanalyze |
|-----------|-------------|--------------|------------------|-----------------|-------------------|------------------|
| big_table |             | 0            | 0                |                 | 0                 |                  |
I’m trying to understand the following:
- Why are
- Why are
Unfortunately the official documentation at https://www.postgresql.org/docs/12/monitoring-stats.html wasn’t very helpful for me.
- Can it be the case that AUTOVACUUM starts but can’t finish? And similar for ANALYZE?
- If so, what might be the root cause?
- What should I check to verify if routine vacuuming does what it’s supposed to do?
How to solve:
If the database was ever shut down uncleanly, all the stats are reset. You can check the stats_reset column in pg_stat_database to see when the counters were last reset. Stats can also be reset manually, either for the whole database or for individual tables.
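As a rough sketch of that check, the query below reads the reset timestamp for the current database. It assumes you are connected to the database that contains big_table. If stats_reset is more recent than your last monthly DELETE, the zero counters may simply reflect a reset rather than vacuum never having run.

```sql
-- When were the cumulative statistics for this database last reset?
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```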
Can it be the case that AUTOVACUUM starts but can’t finish? And similar for ANALYZE?
Yes, but then there should be messages in the log file about it.
For example, if you constantly do things like restart the database, or create or drop indexes, that will interrupt autovacuum, and if you do those things frequently enough, autovacuum might never get to finish.
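A few checks along these lines might help confirm whether autovacuum is running and whether it completes. This is only a sketch: it assumes the table is named big_table as in the question, and that you have the privileges needed to change table storage parameters. The per-table logging setting at the end is one possible way to make sure every autovacuum run on that table shows up in the server log.

```sql
-- Autovacuum workers that are running right now, if any.
SELECT pid, query_start, state, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- Progress of any VACUUM (manual or automatic) currently in flight.
SELECT pid,
       relid::regclass AS table_name,
       phase,
       heap_blks_total,
       heap_blks_scanned,
       heap_blks_vacuumed,
       index_vacuum_count
FROM pg_stat_progress_vacuum;

-- Estimated live and dead tuples for the table, per the statistics collector.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'big_table';

-- Log every autovacuum action on this table, regardless of duration.
ALTER TABLE big_table SET (log_autovacuum_min_duration = 0);
```

If a worker keeps appearing in pg_stat_activity for big_table but autovacuum_count never increases and the log shows cancellations, that would be consistent with autovacuum starting but not finishing.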