AWS Aurora PostgreSQL Serverless with autoscaling.
It appears that scaling clears the shared buffers, so right when we want to crank up performance, we face-plant into an I/O bottleneck. Once warmed up, we see a great improvement: if we run back-to-back after scaling, the second run goes much faster.
While I haven’t seen anything specific on whether the shared buffer gets cleared on scaling, I’m almost positive that it is.
Aurora Serverless currently runs PostgreSQL 10.14 and supports the pg_prewarm extension. The newest PostgreSQL documentation says pg_prewarm supports automatic prewarming after a server restart, but this is serverless, and on a version whose documentation doesn’t appear to mention auto-prewarming.
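For reference, the autoprewarm background worker (added in PostgreSQL 11) is enabled roughly as sketched below in stock PostgreSQL; whether Aurora Serverless exposes these parameters at all is an open question:

```ini
# postgresql.conf sketch (stock PostgreSQL 11+; parameter names from the pg_prewarm docs)
shared_preload_libraries = 'pg_prewarm'   # loads the autoprewarm background worker
pg_prewarm.autoprewarm = true             # dump the buffer list and reload it after restart
pg_prewarm.autoprewarm_interval = 300s    # how often the list of buffers is dumped to disk
```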
I found this post that works great for PostgreSQL when restarting the server or recovering from a crash.
- If we could at least retain the contents of the shared buffer of the lower ACU node after scaling, that’d be fine.
- If we could pre-warm exactly what needs to be in memory ahead of time, that would be awesome!
- There are certain tables that are quite large and we would want to selectively pre-warm the pieces that we want.
pg_prewarm accepts first_block/last_block block numbers for a table/index, but how would one know what values to put in there?
We know ahead of time when our peak is and can tell RDS to scale right before it, so we have a window of time in which to prepare.
What are my options?
How to solve:
My answer is not specific to AWS Aurora PostgreSQL Serverless, but for Postgres in general.
In your related comment, you hinted that you only need rows from the last 24 hours. So you could (without involving pg_prewarm) simply:
SELECT * FROM public.tbl WHERE created_at > now() - interval '24h';
If created_at is indexed and the predicate is selective enough, the relevant blocks of the table and index are prewarmed.
Since you don’t actually want to retrieve any data when prewarming, you can use PERFORM in a DO statement:
DO $$BEGIN PERFORM * FROM public.tbl WHERE created_at > now() - interval '24h'; END$$;
You can verify success with EXPLAIN (ANALYZE, BUFFERS):
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM public.tbl WHERE created_at > now() - interval '24h';
If there is enough cache memory available, you should now only see "shared hit" buffers. Like:
Buffers: shared hit=123456
With a mostly cold cache, you would instead see mostly "read". Like:
Buffers: shared hit=143 read=153689
Basically, just run expected queries, and the cache will be populated accordingly.
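If you want to inspect what actually landed in the database buffer cache, the separate pg_buffercache extension can help. A sketch, assuming you are allowed to install it (table names in the output are your own):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count cached pages per relation in the current database
SELECT c.relname, count(*) AS buffered_pages
FROM   pg_buffercache b
JOIN   pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP  BY c.relname
ORDER  BY buffered_pages DESC
LIMIT  10;
```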
If you still want to use pg_prewarm() with block numbers, you can do that, too. It allows more options, like picking which cache to populate (the operating system cache or the database buffer cache) and other finesse. The additional module has to be installed first, once per database:
CREATE EXTENSION pg_prewarm;
Working with block numbers only makes sense if your table is (mostly) physically clustered on the column in question, created_at. That would be the case with a read-only (or read-mostly) table where new rows with the current timestamp are appended at the end of the table.
You can get the block number of a row from its ctid.
To get the block number of the first row that’s younger than 24 hours:
SELECT ctid FROM public.tbl WHERE created_at > now() - interval '24h' ORDER BY created_at LIMIT 1;
You get something like (5759,…), where 5759 is the block number. Then you can:
SELECT pg_prewarm('public.tbl'::regclass, first_block => 5759);
Since we left last_block at its default NULL, everything "through the last block in the relation" will be prewarmed. (Not the index, though. You can prewarm that, too.)
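For example, to prewarm the index on created_at as well (the index name here is hypothetical):

```sql
-- Prewarm the whole index; 'tbl_created_at_idx' is an assumed name,
-- check pg_indexes for the real one
SELECT pg_prewarm('public.tbl_created_at_idx'::regclass);
```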
The function call uses "mixed notation" (a mix of positional and named notation).
I think you’re out of luck on #1. The PostgreSQL 11 release notes list the addition of that (auto-prewarm) feature. And since I’m not familiar with Aurora, I have no idea about #2.
For #3, it all depends on what the "pieces that we want" are. Over in the comments of the linked post, you mention that for one table you want the last 24 hours of data. If this is an append-only table, the way to get the block number would be to query for the ctid of a row about 24 hours old, then use (ctid::text::point)[0]::bigint to extract the starting block number – on the assumption that you have an index on the timestamp column. Then just prewarm from that block to the end of the table.
If it’s not an append-only table, whether you can prewarm just the parts you want depends entirely on the details; feel free to update the question if so.
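Putting those pieces together, here is a sketch that finds the first block of the 24-hour window and prewarms from there to the end of the table in one statement (assumes an append-only public.tbl with an index on created_at; if no row qualifies, first_block is NULL and the whole table is prewarmed):

```sql
-- Find the block of the oldest row in the window, then prewarm from that
-- block onward (last_block defaults to NULL = through the last block).
SELECT pg_prewarm(
         'public.tbl'::regclass,
         first_block => (
           SELECT (ctid::text::point)[0]::bigint
           FROM   public.tbl
           WHERE  created_at > now() - interval '24h'
           ORDER  BY created_at
           LIMIT  1
         )
       );
```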