Speed up Postgres pg_restore

All we need is an easy explanation of the problem, so here it is.

Hi, I have to update my database monthly using a big dump (100 GB).
I don’t have control over how it’s produced; I only receive a .zip containing a lot of .gz files.
I want to speed up the restore, since I only really need one table.
My disk is an NVMe drive, but it’s only being written at around 200 MB/s, even though it’s capable of a lot more.
I think this is because the .gz decompression is CPU-bound, and I can’t find a way to parallelize the decompression.
The Postgres version is 10.20.
How can I check where the table is stored, i.e. in which file?

How to solve:

We know you are tired of this problem, so we are here to help you! Take a deep breath and look at the explanation of your problem. We have several solutions to this problem, but we recommend the first method because it is a tried and tested approach.

Method 1

It is in a file whose base name is given by the first number in the relevant line of the pg_restore -l output.

So for example:

4292; 0 17010 TABLE DATA public pgbench_branches jjanes

The data for pgbench_branches is in dumpdir/4292.dat.gz.
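For example, with the dump extracted into a directory named dumpdir (a placeholder), a sketch of how to find that line and the matching file:

pg_restore -l dumpdir | grep 'TABLE DATA public pgbench_branches'
ls -lh dumpdir/4292.dat.gz

The first command filters the table-of-contents listing down to the data entry for pgbench_branches; the number at the start of the printed line (4292 here) is the base name of the .dat.gz file inside the dump directory.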

But note that gz decompression does not parallelize well. The dictionary is built up as it goes, so you need to have decompressed the previous tokens to know how to decompress the current one.
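Since only one table is needed, one way to put the TOC lookup to use is to let pg_restore restore just that table, so only its .gz stream has to be decompressed at all. A minimal sketch, assuming a placeholder target database mydb and that the table already exists there:

pg_restore -d mydb --data-only --table=pgbench_branches dumpdir

Even then, that single gzip stream is still decompressed by one CPU core, for the reason described above, so the per-stream decompression speed itself will not improve; you just avoid decompressing the tables you don't need.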

Note: Use and implement Method 1, because this method has been fully tested on our system.
Thank you 🙂

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
