All we need is an easy explanation of the problem, so here it is.
I’ve PostgreSQL database table named worker_activity following structure
Column | Type | Collation | Nullable |
---------------+--------------------------------+-----------+----------+
id | bigint | | not null |
workerName | character varying(255) | | not null |
workerId | integer | | not null |
action | text | | not null |
created_at | timestamp(0) without time zone | | |
updated_at | timestamp(0) without time zone | | |
This table has almost 30 millions of rows and I have the query that creates weekly reports
select * from "worker_activity"
where "created_at" between $1 and $2
and ("action" = $3 or "action" = $4) order by "id" asc
So, that "plain" query executes nearly 3 minutes, it’s rather slow, isn’t it?
How can I speed it up using all features PostgreSQL provides?
Are window functions or some other aggregation method applicable in my case?
Additional info as members pointed out:
- Real row from the table
id| algorithmName|algorithmId |action |created_at|updated_at
3 | LiquidityMaker| 1 | {"step_1":{"action":"openOrderSell","orderSum":0.032,"orderPrice":2.3049,"counter":1},"step_2":{"action":"workTime","time":0.03714489936828613}}|2021-07-06 06:49:26|2021-07-06 06:49:26
- psql (PostgreSQL) 10.19 (Ubuntu 10.19-0ubuntu0.18.04.1)
- Hardware KVM VPS, 32 Gb of RAM, 8 Core CPU, SSD Drive.
Here is the execution plan:
explain (analyze, buffers)
select * from "worker_activity"
where "created_at" between '2021-07-06 06:49:25' and '2021-07-07 00:00:00'
and ("action" = 'Start Work' or "action" = 'End Work')
order by "id" asc;
Sort (cost=4138424.27..4138424.29 rows=9 width=903) (actual time=235525.506..235571.749 rows=0 loops=1)
Sort Key: worker_activity_liquidity_maker_1.id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=120287 read=3750619
-> Gather (cost=1000.00..4138424.12 rows=9 width=903) (actual time=235525.494..235571.733 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=120287 read=3750619
-> Append (cost=0.00..4137423.22 rows=9 width=903) (actual time=235521.443..235521.462 rows=0 loops=3)
Buffers: shared hit=120287 read=3750619
-> Parallel Seq Scan on worker_activity_liquidity_maker_1 (cost=0.00..577947.97 rows=1 width=883) (actual time=30871.374..30871.375 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1532803
Buffers: shared hit=493 read=539153
-> Parallel Seq Scan on worker_activity_liquidity_maker_2 (cost=0.00..505967.24 rows=1 width=935) (actual time=29927.739..29927.740 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1273284
Buffers: shared hit=290 read=473816
-> Parallel Seq Scan on worker_activity_liquidity_maker_3 (cost=0.00..409953.82 rows=1 width=978) (actual time=24560.669..24560.670 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 986666
Buffers: shared hit=464 read=384922
-> Parallel Seq Scan on worker_activity_liquidity_maker_4 (cost=0.00..517571.53 rows=1 width=912) (actual time=30845.683..30845.684 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1328712
Buffers: shared hit=602 read=483822
-> Parallel Seq Scan on worker_activity_liquidity_maker_5 (cost=0.00..412771.36 rows=1 width=969) (actual time=24726.327..24726.328 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1003528
Buffers: shared hit=397 read=387242
-> Parallel Seq Scan on worker_activity_liquidity_maker_6 (cost=0.00..505008.02 rows=1 width=950) (actual time=30233.641..30233.642 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1236177
Buffers: shared hit=444 read=473533
-> Parallel Seq Scan on worker_activity_liquidity_maker_7 (cost=0.00..452042.03 rows=1 width=871) (actual time=26881.948..26881.949 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1215541
Buffers: shared hit=516 read=421139
-> Parallel Seq Scan on worker_activity_liquidity_maker_8 (cost=0.00..126317.74 rows=1 width=757) (actual time=199.055..199.056 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 389744
Buffers: shared hit=116649
-> Parallel Seq Scan on worker_activity_liquidity_maker_9 (cost=0.00..629843.52 rows=1 width=873) (actual time=37274.984..37274.985 rows=0 loops=3)
Filter: ((created_at >= '2021-07-06 06:49:25'::timestamp without time zone) AND (created_at <= '2021-07-07 00:00:00'::timestamp without time zone) AND ((action = 'Start Work'::text) OR (action = 'End Work'::text)))
Rows Removed by Filter: 1696199
Buffers: shared hit=432 read=586992
Planning time: 0.919 ms
Execution time: 235571.830 ms
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
Method 1
That query could be made faster with the following index:
CREATE INDEX ON worker_activity (created_at, action);
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0