Optimize filter on SQL query on PostgreSQL


I have a query with a high cost on filtering the results, and I think I should add an index to optimize the plan, but the indexes I have tried so far had no impact. Can I optimize the query by adding a composite index on the filtered columns?
This is the plan:

Limit  (cost=3069.33..14926.59 rows=4 width=509) (actual time=258424.190..258424.197 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=1320.215..1320.535 rows=2045 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=10.687..1314.519 rows=2045 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Incremental Sort  (cost=2986.35..18414332.62 rows=6211 width=509) (actual time=258424.188..258424.189 rows=4 loops=1)
        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid
        Presorted Key: booking_data.last_segment_arrival_at
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
        ->  Index Scan Backward using idx_booking_data_last_segment_arrival_at on booking_data  (cost=0.44..18414054.67 rows=6211 width=509) (actual time=48419.376..258424.093 rows=5 loops=1)
              Index Cond: (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))) AND (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) OR (bid = ANY ($0))))
              Rows Removed by Filter: 2315888
Planning Time: 2.132 ms
Execution Time: 258424.387 ms

And here is the query:

explain analyze
    SELECT *
      FROM booking_data
      WHERE booking_data.bid >= 1100000
        AND booking_data.is_deleted IS false
        AND booking_data.last_segment_arrival_at < '2022-06-13 13:36'
        AND (booking_data.user_id = 'NJ8QigsGcQCDOttoGsD3iS'
                 OR booking_data.bid = ANY (CAST(array((
                     SELECT DISTINCT booking_passengers.bid AS anon_2
                     FROM booking_passengers
                     WHERE booking_passengers.user_id = 'NJ8QigsGcQCDOttoGsD3iS')) AS BIGINT[]))
            )
        AND (booking_data.confirmation_sent IS true
                 AND booking_data.final_status != 'refunded'
                 OR booking_data.final_status = 'confirmed'
                 OR booking_data.confirmation_sent IS false
                        AND booking_data.final_status IN ('closed')
            )
      ORDER BY booking_data.last_segment_arrival_at DESC, booking_data.bid ASC
      LIMIT 4 OFFSET 0

current index on booking_data table:

create index idx_booking_data_final_status on booking_data (final_status);
create index idx_booking_data_user_id on booking_data (user_id);
create index idx_booking_data_last_segment_arrival_at on booking_data (last_segment_arrival_at);
create index idx_booking_data_first_segment_arrival_at on booking_data (first_segment_arrival_at);
create index idx_booking_data_confirmed_at on booking_data (confirmed_at);
create index idx_booking_data_booked_email on booking_data (booked, email);
create index idx_booking_data_first_last_segment_bid_user_id on booking_data (first_segment_arrival_at, last_segment_arrival_at, bid, user_id);

I’ve added the index:

CREATE index CONCURRENTLY idx_booking_data_user_id_last_segment_arrival_at on booking_data (user_id, last_segment_arrival_at);

which produces this plan on the staging database (a weaker instance with production data):

Limit  (cost=13432.55..13432.56 rows=4 width=509) (actual time=11958.229..11958.235 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=2741.877..2742.215 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=18.064..2734.284 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Sort  (cost=13349.57..13365.09 rows=6210 width=509) (actual time=11958.227..11958.230 rows=4 loops=1)
        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid
        Sort Method: top-N heapsort  Memory: 28kB
        ->  Bitmap Heap Scan on booking_data  (cost=195.64..13256.42 rows=6210 width=509) (actual time=3771.506..11952.815 rows=854 loops=1)
              Recheck Cond: ((((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)) OR ((bid = ANY ($0)) AND (bid >= 1100000)))
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))))
              Rows Removed by Filter: 10202
              Heap Blocks: exact=10935
              ->  BitmapOr  (cost=195.64..195.64 rows=12634 width=0) (actual time=3718.959..3718.961 rows=0 loops=1)
                    ->  Bitmap Index Scan on idx_booking_data_user_id_last_segment_arrival_at  (cost=0.00..176.81 rows=12625 width=0) (actual time=17.294..17.294 rows=11025 loops=1)
                          Index Cond: (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone))
                    ->  Bitmap Index Scan on booking_data_pkey  (cost=0.00..15.72 rows=10 width=0) (actual time=3701.663..3701.663 rows=2062 loops=1)
                          Index Cond: ((bid = ANY ($0)) AND (bid >= 1100000))
Planning Time: 2.263 ms
Execution Time: 11958.434 ms

After the first run, the query's execution time is much quicker (the data is now cached):

Limit  (cost=13432.55..13432.56 rows=4 width=509) (actual time=29.641..29.647 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=2.507..2.761 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=0.021..1.664 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Sort  (cost=13349.57..13365.09 rows=6210 width=509) (actual time=29.640..29.643 rows=4 loops=1)
        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid
        Sort Method: top-N heapsort  Memory: 28kB
        ->  Bitmap Heap Scan on booking_data  (cost=195.64..13256.42 rows=6210 width=509) (actual time=11.942..28.832 rows=854 loops=1)
              Recheck Cond: ((((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)) OR ((bid = ANY ($0)) AND (bid >= 1100000)))
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))))
              Rows Removed by Filter: 10202
              Heap Blocks: exact=10935
              ->  BitmapOr  (cost=195.64..195.64 rows=12634 width=0) (actual time=10.139..10.140 rows=0 loops=1)
                    ->  Bitmap Index Scan on idx_booking_data_user_id_last_segment_arrival_at  (cost=0.00..176.81 rows=12625 width=0) (actual time=2.024..2.024 rows=11025 loops=1)
                          Index Cond: (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone))
                    ->  Bitmap Index Scan on booking_data_pkey  (cost=0.00..15.72 rows=10 width=0) (actual time=8.113..8.113 rows=2062 loops=1)
                          Index Cond: ((bid = ANY ($0)) AND (bid >= 1100000))
Planning Time: 0.404 ms
Execution Time: 29.765 ms

On the production instance, all runs of the query are slow, even though it is a stronger instance (the index idx_booking_data_user_id_last_segment_arrival_at is not used):

Limit  (cost=523.03..2268.86 rows=4 width=509) (actual time=28549.479..28549.482 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=155.070..155.307 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=0.414..153.733 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Incremental Sort  (cost=440.05..2710839.81 rows=6210 width=509) (actual time=28549.478..28549.479 rows=4 loops=1)
        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid
        Presorted Key: booking_data.last_segment_arrival_at
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
        ->  Index Scan Backward using idx_booking_data_last_segment_arrival_at on booking_data  (cost=0.44..2710561.90 rows=6210 width=509) (actual time=2034.195..28549.417 rows=5 loops=1)
              Index Cond: (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))) AND (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) OR (bid = ANY ($0))))
              Rows Removed by Filter: 2323153
Planning Time: 1.845 ms
Execution Time: 28549.694 ms

I checked when the tables were last analyzed:

SELECT schemaname, relname, last_analyze FROM pg_stat_all_tables WHERE relname = 'booking_passengers';
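A broader version of this check covers both tables and also the autovacuum daemon's analyze runs (a sketch; the columns come from the `pg_stat_all_tables` view):

```sql
-- Check both manual and automatic analyze timestamps for the two tables
SELECT schemaname, relname, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname IN ('booking_data', 'booking_passengers');
```

If last_autoanalyze is recent, statistics are probably not stale, and the misestimate has another cause.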

and ran ANALYZE on both relevant tables:

ANALYZE VERBOSE public.booking_data;
ANALYZE VERBOSE public.booking_passengers;

The index on production is still not used.

Your WHERE has 5 ANDed together blocks. How many rows does each one individually return, in the absence of the LIMIT?

select count(*) FROM booking_data WHERE bid >= 1100000;  -- 28208008
select count(*) FROM booking_data WHERE is_deleted IS false;  -- 29249188
select count(*) FROM booking_data WHERE last_segment_arrival_at < '2022-06-13 13:36';  -- 23594003

select count(*)
FROM booking_data
WHERE (booking_data.user_id = 'NJ8QigsGcQCDOttoGsD3iS'
    OR booking_data.bid = ANY (CAST(array((
        SELECT DISTINCT booking_passengers.bid AS anon_2
        FROM booking_passengers
        WHERE booking_passengers.user_id = 'NJ8QigsGcQCDOttoGsD3iS')) AS BIGINT[]))
          )

11079

select count(*)
FROM booking_data
WHERE (booking_data.confirmation_sent IS true
                 AND booking_data.final_status != 'refunded'
                 OR booking_data.final_status = 'confirmed'
                 OR booking_data.confirmation_sent IS false
                        AND booking_data.final_status IN ('closed')
            )

17294003

I ran ANALYZE with higher statistics_target as suggested:

show default_statistics_target;
set default_statistics_target to 1000;
ANALYZE VERBOSE public.booking_data;
ANALYZE VERBOSE public.booking_passengers;

but the index on (user_id, last_segment_arrival_at) is still not used.

How to solve:


Method 1

In the first plan, by following the index backwards PostgreSQL can produce rows already in the requested order, and can stop early once it has found the LIMIT of 4 rows that pass the query conditions (plus enough more to break any ties on "last_segment_arrival_at"). From that plan we know it expects 6211 rows to pass the conditions, but not how many actually do. From the second plan we know it actually found only 854 rows passing those same conditions. Collecting your limit from a pool of 854 rows is a lot more work than collecting it from a pool of 6211, so it is not surprising that the cost of this plan is misestimated. There are other possible reasons for the misestimate, but since we already have evidence for this one, why speculate beyond what we can see?

What leads to the incorrect row estimate? We don't know. It could be that one or more of the five top-level ANDed parts of the WHERE are misestimated, or it could be that they are estimated correctly in isolation but misestimated in combination because they are not statistically independent of each other. It could be that simply increasing the statistics sample size would fix the problem. This is easy to test: increase default_statistics_target locally in your session, ANALYZE the two tables, and see if it makes a difference. If that works, you would then need to decide whether to increase default_statistics_target permanently system-wide, or just override it for those tables.
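If the larger sample does fix the estimate, it can be persisted per column rather than system-wide. A sketch, assuming user_id and last_segment_arrival_at are the columns whose statistics matter most here (a judgment call based on the filter):

```sql
-- Session-local test first (does not affect other connections):
SET default_statistics_target = 1000;
ANALYZE public.booking_data;
ANALYZE public.booking_passengers;

-- If the plan improves, persist the larger sample only where needed:
ALTER TABLE booking_data ALTER COLUMN user_id SET STATISTICS 1000;
ALTER TABLE booking_data ALTER COLUMN last_segment_arrival_at SET STATISTICS 1000;
ANALYZE public.booking_data;  -- re-analyze to pick up the new target
```

The per-column target overrides default_statistics_target and survives across sessions, so the rest of the system keeps the cheaper default.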

So why does this planning error skunk the better plan on prod but not on staging? If you look at the first two plans, their cost estimates are nearly equal even though the run times are quite different. Since they are so close to begin with, even small differences could cause them to swap order, so there doesn't have to be a profound answer; it could just be dumb luck that one comes out on top on one system and the other on the other system. But it could also be a real systematic difference. The first and fourth plans have nearly identical row estimates but quite different cost estimates, and that is probably a difference in settings. Most likely either effective_cache_size is higher on prod, or random_page_cost is lower. Either of those might affect the two plans differently and so change which plan is chosen. Fortunately both settings can be overridden locally in your session, so it is easy to test whether changing them on prod to match staging causes it to pick the other plan.
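The comparison can be sketched like this; the SET values shown are hypothetical placeholders, to be replaced with whatever staging actually reports:

```sql
-- Run on both prod and staging and compare:
SHOW effective_cache_size;
SHOW random_page_cost;

-- Then, in a prod session only, mimic staging's values, e.g.:
SET effective_cache_size = '4GB';  -- hypothetical staging value
SET random_page_cost = 4.0;        -- hypothetical staging value
-- ...and re-run EXPLAIN (ANALYZE, BUFFERS) on the original query
-- to see whether the planner now picks the bitmap plan.
```

SET only affects the current session, so this is a safe experiment on a live production instance.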

If none of these things works, or if you just want to force the other plan on prod so you can at least measure it, you could probably do that by locally setting enable_incremental_sort to off, by dropping the index idx_booking_data_last_segment_arrival_at, or by rewriting the ORDER BY to include some dummy arithmetic like:

ORDER BY booking_data.last_segment_arrival_at + interval '0' DESC, booking_data.bid ASC
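Of those options, the planner-flag route is the least invasive, since it is session-local and easily reverted. A sketch:

```sql
-- Session-local only; disables the incremental-sort strategy for this session
SET enable_incremental_sort = off;

-- Re-run EXPLAIN (ANALYZE, BUFFERS) on the original query here;
-- the planner should fall back to the bitmap-scan + top-N sort plan.

RESET enable_incremental_sort;  -- restore the default afterwards
```

The dummy-arithmetic ORDER BY works for the same underlying reason: last_segment_arrival_at + interval '0' no longer matches the indexed expression, so the backward index scan can no longer supply the requested sort order.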


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, or CC BY-SA 4.0.
