Potentially Improper Row Count Read in Parallel Bitmap Hash Match Plan

All we need is an easy explanation of the problem, so here it is.

I am trying to determine whether or not this is an execution plan reporting bug or an actual feature that is working as intended.

In my execution plan, when using MAXDOP 1, I am seeing the entire table being scanned (https://www.brentozar.com/pastetheplan/?id=HkPFquLdc – Very bottom of plan Object 11 shows ~2.5 million rows read which is the entire table).

However, when I let the engine choose it’s own plan with no hints, it goes parallel (https://www.brentozar.com/pastetheplan/?id=r12p5OUdc) and does a bitmap/hash match and that same Object11 only shows ~534k rows read despite doing an index scan and having a bunch of predicates on other columns not in the clustered index.

I would expect that SQL has to read every single row in the table to evaluate each predicate, but perhaps the PROBE IN (you can’t see this in paste the plan) on Object 11 in the parallel plan is able to "Filter out pages" since the probe is on the PK/CX.

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

It is a feature working as intended. In parallel row mode plans, SQL Server can introduce a bitmap build on the build input of a hash join to keep a rough track of the join keys it encounters. When the hash join transitions from building its hash table to probing the second input for matches, the bitmap is transferred to the probe side and pushed down that branch of the plan as far as possible.

This technique is known as early semi-join reduction. The approximate bitmap is used to filter out rows that cannot possibly join. In the best case, the bitmap can be pushed all the way down the probe branch as far as the data access method, and performed INROW.

When the INROW optimization is applied, a reduced size bitmap is applied by the storage engine while pages are being read. Any rows that pass this filter are surfaced to the query processor, which applies the full size bitmap (potentially eliminating even more rows) before any residual predicates on the scan or seek are applied. Only rows that pass both bitmap tests and the residual predicates are seen leaving the seek or scan operator.

More details in my articles:

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply