A query containing multiple joins becomes too slow when adding one more condition after the joins

All we need is an easy explanation of the problem, so here it is.

I have this query that finishes in less than a second on average:

SELECT
    a.trea_code,
    a.fin_year,
    a.bill_no AS tran_id,
    ((('Transfer FROM ' :: text || (a.ddo_code) :: text) || 'ON Bill No : ' :: text) || btrim(((a.bill_no) :: character(10)) :: text)) AS party_name,
    a.voucher_no,
    a.ddo_code,
    a.accounting_date AS final_date,
    fb.audit_batch_date AS prov_date,
    'D' :: text AS status,
    substr((b.rscheme_code) :: text, 1, 4) AS major_head,
    substr((b.rscheme_code) :: text, 5, 2) AS submajor_head,
    substr((b.rscheme_code) :: text, 7, 3) AS minor_head,
    substr((b.rscheme_code) :: text, 10, 2) AS sub_head,
    substr((b.rscheme_code) :: text, 12, 2) AS subsub_head,
    b.rscheme_code AS scheme_code,
    b.amount,
    'FTC' :: text AS ttype
FROM
    final_bill_master a
    LEFT JOIN final_bill_track fb ON
     a.fin_year = fb.fin_year
    AND a.trea_code = fb.trea_code
    AND a.bill_no = fb.bill_no
    LEFT JOIN final_bill_deduction b
     ON a.trea_code = b.trea_code
    AND a.fin_year = b.fin_year
    AND a.bill_no = b.bill_no
    AND (b.flag = 'N' :: bpchar)
WHERE
    a.ddo_code IN (
        select
        ddo_code
    from
        trea.field_ddo_map
    where
        field_dept_cd = :fieldDepartmentCode
    )

Here’s it’s explain statement result:

QUERY PLAN
Gather  (cost=1015.19..276482.46 rows=300306 width=315)
  Workers Planned: 2
  ->  Nested Loop Left Join  (cost=15.19..245451.86 rows=125128 width=315)
        ->  Nested Loop Left Join  (cost=14.64..157248.09 rows=125128 width=37)
              ->  Hash Join  (cost=14.21..82705.38 rows=125128 width=33)
                    Hash Cond: (a.ddo_code = field_ddo_map.ddo_code)
                    ->  Parallel Seq Scan on final_bill_master a  (cost=0.00..78451.21 rows=1084921 width=33)
                    ->  Hash  (cost=14.08..14.08 rows=10 width=8)
                          ->  HashAggregate  (cost=13.98..14.08 rows=10 width=8)
                                Group Key: field_ddo_map.ddo_code
                                ->  Bitmap Heap Scan on field_ddo_map  (cost=4.36..13.96 rows=10 width=8)
                                      Recheck Cond: (field_dept_cd = 'TAX'::bpchar)
                                      ->  Bitmap Index Scan on field_ddo_map_pk  (cost=0.00..4.35 rows=10 width=0)
                                            Index Cond: (field_dept_cd = 'TAX'::bpchar)
              ->  Index Scan using final_bill_track_pkey on final_bill_track fb  (cost=0.43..0.60 rows=1 width=21)
                    Index Cond: ((a.fin_year = fin_year) AND (a.trea_code = trea_code) AND (a.bill_no = bill_no))
        ->  Index Scan using "final_bill_deductionPK" on final_bill_deduction b  (cost=0.55..0.65 rows=1 width=39)
              Index Cond: ((a.fin_year = fin_year) AND (a.trea_code = trea_code) AND (a.bill_no = bill_no) AND (flag = 'N'::bpchar))

But If I add one more condition at the very end it becomes excruciatingly slow (upto 10 MINUTES at worst).

SELECT
    a.trea_code,
    a.fin_year,
    a.bill_no AS tran_id,
    ((('Transfer FROM ' :: text || (a.ddo_code) :: text) || 'ON Bill No : ' :: text) || btrim(((a.bill_no) :: character(10)) :: text)) AS party_name,
    a.voucher_no,
    a.ddo_code,
    a.accounting_date AS final_date,
    fb.audit_batch_date AS prov_date,
    'D' :: text AS status,
    substr((b.rscheme_code) :: text, 1, 4) AS major_head,
    substr((b.rscheme_code) :: text, 5, 2) AS submajor_head,
    substr((b.rscheme_code) :: text, 7, 3) AS minor_head,
    substr((b.rscheme_code) :: text, 10, 2) AS sub_head,
    substr((b.rscheme_code) :: text, 12, 2) AS subsub_head,
    b.rscheme_code AS scheme_code,
    b.amount,
    'FTC' :: text AS ttype
FROM
    final_bill_master a
    LEFT JOIN final_bill_track fb ON
     a.fin_year = fb.fin_year
    AND a.trea_code = fb.trea_code
    AND a.bill_no = fb.bill_no
    LEFT JOIN final_bill_deduction b
     ON a.trea_code = b.trea_code
    AND a.fin_year = b.fin_year
    AND a.bill_no = b.bill_no
    AND (b.flag = 'N' :: bpchar)
WHERE
    a.ddo_code IN (
        select
        ddo_code
    from
        trea.field_ddo_map
    where
        field_dept_cd = :fieldDepartmentCode
    )
    AND a.fin_year = :financialYear

Here’s its explain statement result:

QUERY PLAN
Nested Loop Semi Join  (cost=5.77..39.25 rows=1 width=315)
  Join Filter: (a.ddo_code = field_ddo_map.ddo_code)
  ->  Nested Loop Left Join  (cost=1.41..25.12 rows=1 width=59)
        Join Filter: ((a.fin_year = b.fin_year) AND (a.trea_code = b.trea_code) AND (a.bill_no = b.bill_no))
        ->  Nested Loop Left Join  (cost=0.86..16.91 rows=1 width=37)
              Join Filter: ((a.fin_year = fb.fin_year) AND (a.trea_code = fb.trea_code) AND (a.bill_no = fb.bill_no))
              ->  Index Scan using final_bill_master_pkey on final_bill_master a  (cost=0.43..8.45 rows=1 width=33)
                    Index Cond: (fin_year = '2022-2023'::bpchar)
              ->  Index Scan using final_bill_track_pkey on final_bill_track fb  (cost=0.43..8.45 rows=1 width=21)
                    Index Cond: (fin_year = '2022-2023'::bpchar)
        ->  Index Scan using "final_bill_deductionPK" on final_bill_deduction b  (cost=0.55..8.19 rows=1 width=39)
              Index Cond: ((fin_year = '2022-2023'::bpchar) AND (flag = 'N'::bpchar))
  ->  Bitmap Heap Scan on field_ddo_map  (cost=4.36..13.96 rows=10 width=8)
        Recheck Cond: (field_dept_cd = 'TAX'::bpchar)
        ->  Bitmap Index Scan on field_ddo_map_pk  (cost=0.00..4.35 rows=10 width=0)
              Index Cond: (field_dept_cd = 'TAX'::bpchar)

Somebody please tell me what is causing this and how can I remedy it?

EDIT: Added EXPLAIN(ANALYZE, BUFFERS) statement results
Without the additional condition:

QUERY PLAN
Gather  (cost=1015.19..276486.57 rows=300311 width=315) (actual time=2.242..667.881 rows=28992 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=231819 read=66389
  ->  Nested Loop Left Join  (cost=15.19..245455.47 rows=125130 width=315) (actual time=0.692..641.730 rows=9664 loops=3)
        Buffers: shared hit=231819 read=66389
        ->  Nested Loop Left Join  (cost=14.64..157250.43 rows=125130 width=37) (actual time=0.612..510.599 rows=9377 loops=3)
              Buffers: shared hit=113925 read=66389
              ->  Hash Join  (cost=14.21..82706.60 rows=125130 width=33) (actual time=0.545..377.093 rows=9377 loops=3)
                    Hash Cond: (a.ddo_code = field_ddo_map.ddo_code)
                    Buffers: shared hit=1269 read=66389
                    ->  Parallel Seq Scan on final_bill_master a  (cost=0.00..78452.37 rows=1084937 width=33) (actual time=0.050..200.565 rows=867000 loops=3)
                          Buffers: shared hit=1214 read=66389
                    ->  Hash  (cost=14.08..14.08 rows=10 width=8) (actual time=0.134..0.136 rows=6 loops=3)
                          Buckets: 1024  Batches: 1  Memory Usage: 9kB
                          Buffers: shared hit=23
                          ->  HashAggregate  (cost=13.98..14.08 rows=10 width=8) (actual time=0.125..0.128 rows=6 loops=3)
                                Group Key: field_ddo_map.ddo_code
                                Buffers: shared hit=23
                                ->  Bitmap Heap Scan on field_ddo_map  (cost=4.36..13.96 rows=10 width=8) (actual time=0.078..0.113 rows=10 loops=3)
                                      Recheck Cond: (field_dept_cd = 'TAX'::bpchar)
                                      Heap Blocks: exact=5
                                      Buffers: shared hit=23
                                      ->  Bitmap Index Scan on field_ddo_map_pk  (cost=0.00..4.35 rows=10 width=0) (actual time=0.057..0.057 rows=10 loops=3)
                                            Index Cond: (field_dept_cd = 'TAX'::bpchar)
                                            Buffers: shared hit=8
              ->  Index Scan using final_bill_track_pkey on final_bill_track fb  (cost=0.43..0.60 rows=1 width=21) (actual time=0.012..0.012 rows=1 loops=28131)
                    Index Cond: ((a.fin_year = fin_year) AND (a.trea_code = trea_code) AND (a.bill_no = bill_no))
                    Buffers: shared hit=112656
        ->  Index Scan using "final_bill_deductionPK" on final_bill_deduction b  (cost=0.55..0.65 rows=1 width=39) (actual time=0.011..0.011 rows=0 loops=28131)
              Index Cond: ((a.fin_year = fin_year) AND (a.trea_code = trea_code) AND (a.bill_no = bill_no) AND (flag = 'N'::bpchar))
              Buffers: shared hit=117894
Planning Time: 2.794 ms
Execution Time: 670.567 ms

With the additional condition:

QUERY PLAN
Nested Loop Semi Join  (cost=5.77..39.25 rows=1 width=315) (actual time=4411.875..167336.582 rows=178 loops=1)
  Join Filter: (a.ddo_code = field_ddo_map.ddo_code)
  Rows Removed by Join Filter: 191568
  Buffers: shared hit=153665416 read=59 dirtied=12
  ->  Nested Loop Left Join  (cost=1.41..25.12 rows=1 width=59) (actual time=15.189..167037.964 rows=19277 loops=1)
        Join Filter: ((a.fin_year = b.fin_year) AND (a.trea_code = b.trea_code) AND (a.bill_no = b.bill_no))
        Rows Removed by Join Filter: 150227132
        Buffers: shared hit=153531047 read=59 dirtied=12
        ->  Nested Loop Left Join  (cost=0.86..16.91 rows=1 width=37) (actual time=0.111..73048.167 rows=17117 loops=1)
              Join Filter: ((a.fin_year = fb.fin_year) AND (a.trea_code = fb.trea_code) AND (a.bill_no = fb.bill_no))
              Rows Removed by Join Filter: 146487286
              Buffers: shared hit=85028813 read=59 dirtied=12
              ->  Index Scan using final_bill_master_pkey on final_bill_master a  (cost=0.43..8.45 rows=1 width=33) (actual time=0.057..107.777 rows=17117 loops=1)
                    Index Cond: (fin_year = '2022-2023'::bpchar)
                    Buffers: shared hit=9404 read=36 dirtied=5
              ->  Index Scan using final_bill_track_pkey on final_bill_track fb  (cost=0.43..8.45 rows=1 width=21) (actual time=0.009..2.673 rows=8559 loops=17117)
                    Index Cond: (fin_year = '2022-2023'::bpchar)
                    Buffers: shared hit=85019392 read=23 dirtied=2
        ->  Index Scan using "final_bill_deductionPK" on final_bill_deduction b  (cost=0.55..8.19 rows=1 width=39) (actual time=0.013..3.973 rows=8777 loops=17117)
              Index Cond: ((fin_year = '2022-2023'::bpchar) AND (flag = 'N'::bpchar))
              Buffers: shared hit=68502234
  ->  Bitmap Heap Scan on field_ddo_map  (cost=4.36..13.96 rows=10 width=8) (actual time=0.008..0.011 rows=10 loops=19277)
        Recheck Cond: (field_dept_cd = 'TAX'::bpchar)
        Heap Blocks: exact=95815
        Buffers: shared hit=134369
        ->  Bitmap Index Scan on field_ddo_map_pk  (cost=0.00..4.35 rows=10 width=0) (actual time=0.005..0.005 rows=10 loops=19277)
              Index Cond: (field_dept_cd = 'TAX'::bpchar)
              Buffers: shared hit=38554
Planning Time: 2.582 ms
Execution Time: 167336.873 ms

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

The problem seems to be a mis-estimate here:

->  Index Scan using "final_bill_deductionPK" on final_bill_deduction b  (... rows=1 ...) (actual ... rows=8777 ...)
      Index Cond: ((fin_year = '2022-2023'::bpchar) AND (flag = 'N'::bpchar))
      Buffers: shared hit=68502234

If a plain ANALYZE on that table does not help, you could try extended statistics:

CREATE STATISTICS year_flag (dependencies)
   ON fin_year, flag FROM final_bill_deduction;

ANALYZE final_bill_deduction;

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply