Why does a clustered Key Lookup on a primary key have a high estimated rows per execution?

We have a query with a Key Lookup which is estimating thousands of rows per execution. As I understand it, there should only ever be one row per execution. I understand statistics can be misleading but doesn’t the optimizer understand that a primary key would be unique?

The table involved in this query has a clustered primary key of this form:

/****** Object:  Index [PK_Table_Name]    Script Date: 6/16/2021 9:52:12 AM ******/
    [Table_Name_ID] ASC

As I understand it, a primary key provides a unique reference to a single row in the table.

So for every row it finds in the non-clustered index, it should be able to use that index’s reference to the clustered index to retrieve the single row it needs to satisfy the rest of the query filtering for the row it’s acting on. (This is a Nested Loops join operator.)

So why does it estimate almost 4000 rows will be returned as part of the Key Lookup? (Not "for All Executions", which at around 36,000,000 is the product of that 4000 and the 9000 rows it expects from the non-clustered index seek.)
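The plan shape described above can be sketched roughly as follows. This is a minimal illustration, not the actual schema or query: the columns `Status_Code` and `Notes`, the index `IX_Table_Name_Status_Code`, and the predicates are all hypothetical, and the index hint is only there to encourage a seek plus Key Lookup plan. The optimizer may still choose a different plan depending on data and statistics.

```sql
-- Hypothetical table: only PK_Table_Name and Table_Name_ID come from the question.
CREATE TABLE dbo.Table_Name
(
    Table_Name_ID int IDENTITY(1,1) NOT NULL,
    Status_Code   int NOT NULL,           -- assumed column, for illustration
    Notes         varchar(100) NULL,      -- assumed column, for illustration
    CONSTRAINT PK_Table_Name PRIMARY KEY CLUSTERED (Table_Name_ID ASC)
);

-- Hypothetical non-clustered index that does not cover the Notes column.
CREATE NONCLUSTERED INDEX IX_Table_Name_Status_Code
    ON dbo.Table_Name (Status_Code);

-- The seek on Status_Code finds candidate rows; Notes is not in the index,
-- so each candidate row needs a Key Lookup into the clustered index,
-- where the remaining filter on Notes can be applied as a residual predicate.
SELECT t.Table_Name_ID, t.Notes
FROM dbo.Table_Name AS t WITH (INDEX (IX_Table_Name_Status_Code))
WHERE t.Status_Code = 1
  AND t.Notes LIKE 'A%';
```

In a plan like this, each execution of the Key Lookup seeks on the unique clustered key, so it can return at most one row; the residual predicate can only reduce that to zero.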

The runtime statistics show 2851 rows and 2851 executions for that clustered index seek, which is what I would have expected.

In case it helps, this is in Azure SQL Database, with @@version:

Microsoft SQL Azure (RTM) - 12.0.2000.8 
    Apr 29 2021 13:52:20 
    Copyright (C) 2019 Microsoft Corporation

How to solve:

Method 1

Check out Paul White’s blog post Cardinality Estimation Bug with Lookups for the answer to your question:

Estimated row counts on Key or RID Lookups where a filtering predicate is applied can be wrong in SSMS execution plans.

This error does not affect the optimizer’s ultimate plan selection, but it does look odd.

The blog post goes on to say that a fix was planned back in August of 2020 to address this under the new 160 compatibility level:

There is a fix for this planned for the release after SQL Server 2019 (compatibility level 160)
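To see which compatibility level the current database is running under, a query along these lines should work on both SQL Server and Azure SQL Database. It only reports the setting; whether the fix is active at a given level is as stated in the blog post, not something this query verifies.

```sql
-- Report the compatibility level of the database the session is connected to.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();
```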

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
