How to move old data to a cheap hard drive


I have a fairly large PostgreSQL database (with the TimescaleDB extension). It currently consumes about 500 GB on an SSD. Most of the data is time series, and in most cases data older than a few months isn't really interesting.

My idea was to move that data to a cheap SATA hard drive instead of buying more expensive SSDs. Is that a good idea, and is there an established practice for implementing it?

My naive implementation would be: keep two databases (or create a tablespace on the cheap HDD), copy data every few hours from the "fast" (SSD) database to the "slow" (HDD) database, and every few days delete the moved data from the fast database. Is this a good idea? I am happy to hear feedback and better suggestions.
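The naive plan could be sketched with a single archive table on an HDD tablespace and a periodic job; the table, column, and path names below are illustrative assumptions, not part of the question:

```sql
-- Sketch of the naive approach: an archive table on a tablespace that
-- lives on the cheap HDD, plus a job (e.g. from cron) that moves rows
-- older than three months in one transaction.
CREATE TABLESPACE slow_hdd LOCATION '/mnt/hdd/pgdata';

CREATE TABLE metrics_archive (LIKE metrics INCLUDING ALL)
    TABLESPACE slow_hdd;

-- Move old rows atomically: DELETE ... RETURNING feeds the INSERT.
WITH moved AS (
    DELETE FROM metrics
    WHERE ts < now() - interval '3 months'
    RETURNING *
)
INSERT INTO metrics_archive SELECT * FROM moved;
```

Because the DELETE and INSERT run in one statement, a crash mid-move cannot lose or duplicate rows.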

Solutions


Method 1

Here is a better architecture:

  • Create a new tablespace on the slow drive.

  • Set the tablespace parameters seq_page_cost and random_page_cost higher on that new tablespace so that the PostgreSQL optimizer knows those disks are slower.

  • Partition the big time series tables by time ranges (use the same boundaries for all affected tables) so that you end up with a couple of dozen partitions for each.

  • Move the old partitions to the slow tablespace.

Then you still have all the data accessible.
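The steps above can be sketched in SQL; the table definition, partition boundaries, cost values, and paths are example assumptions, not prescriptions:

```sql
-- 1. Tablespace on the slow drive.
CREATE TABLESPACE slow_hdd LOCATION '/mnt/hdd/pgdata';

-- 2. Tell the planner these disks are slower than the SSD default
--    (seq_page_cost defaults to 1, random_page_cost to 4).
ALTER TABLESPACE slow_hdd SET (seq_page_cost = 2, random_page_cost = 8);

-- 3. Partition the time series table by time range (PostgreSQL 11+).
CREATE TABLE metrics (
    ts     timestamptz NOT NULL,
    device int         NOT NULL,
    value  double precision
) PARTITION BY RANGE (ts);

CREATE TABLE metrics_2019_q1 PARTITION OF metrics
    FOR VALUES FROM ('2019-01-01') TO ('2019-04-01');

-- 4. Move an old partition to the slow tablespace; new queries against
--    "metrics" still see its rows, the planner just costs them higher.
ALTER TABLE metrics_2019_q1 SET TABLESPACE slow_hdd;
```

Note that ALTER TABLE ... SET TABLESPACE physically rewrites the partition and takes an exclusive lock on it while the files are copied, so move partitions during a quiet period.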

Use PostgreSQL v11 or later for partitioning.

Method 2

PM from Timescale here. Our next release is scheduled to introduce compression, so that will help with things. Tablespaces are a good idea, and we are planning on eventually allowing you to move older chunks to different tablespaces. Please feel free to reach out to us for further info on that functionality, since it’s still being developed.

PS: make sure to use drop_chunks when you delete data, so you don't create lots of dead tuples that then have to be vacuumed.
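For illustration, dropping old chunks looks roughly like this (the hypertable name is an example, and the drop_chunks argument order shown is the TimescaleDB 1.x form; it changed in 2.0, so check the docs for your version):

```sql
-- Remove all chunks whose data is entirely older than three months.
-- This drops whole chunk tables instead of deleting rows, so it leaves
-- no dead tuples behind for VACUUM to clean up.
SELECT drop_chunks(interval '3 months', 'metrics');
```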


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
