Date index SEEK is chosen when using OPTION RECOMPILE but not with OPTION OPTIMIZE FOR

All we need is an easy explanation of the problem, so here it is.

I have a table containing 10 years worth of ‘package scans’. Somebody scans a package and it records the date and username. Let’s pretend for now that retaining 10 years of data actually has a purpose.

I have a page to show a summary for the past week, so clearly I only want to read 1 week’s worth of data.

Here’s the query, to be run in SSMS twice, once with a hardcoded recent date and again with an old date in 2013. It’s originally a parameterized query, but in SSMS I’m replacing @p0 with the date:

SELECT [t0].[VerifyDate], [t0].[PackageId], [t0].[Username]
FROM [dbo].[PackageVerification] AS [t0]
INNER JOIN [dbo].[Package] AS [t1] ON [t1].[PackageId] = [t0].[PackageId]

WHERE ([t1].[PackageStatus] <> 99) AND ([t0].[VerifyDate] > @p0)   
ORDER BY [t0].[VerifyDate] DESC

Before I execute it, I’d like to introduce my date index.

Now my date index is not on my PackageVerification table, but instead is on a ‘helper view’ which performs the same join seen above. The query above is able to magically use this indexed view because I have SCHEMABINDING enabled.

CREATE NONCLUSTERED INDEX [IX_Helper_PackageVerification_USER_SCAN_HISTORY] ON [dbo].[Helper_PackageVerification]
(
    [VerifyDate] DESC,
    [PackageStatus] ASC
)
INCLUDE (
    [VerifyDateDate],
    [Username]
) 

When I run the query in SSMS with an old and new date it uses scan or seek as expected. The threshold seems to be somewhere around 2015. So anything remotely recent should definitely be using a seek. Here’s the results of that:

Date index SEEK is chosen when using OPTION RECOMPILE but not with OPTION OPTIMIZE FOR

When I run it as a parameterized query from my application I always get a full scan, which for some reason uses a parallelized plan.

At least it’s using my helper index.

Date index SEEK is chosen when using OPTION RECOMPILE but not with OPTION OPTIMIZE FOR

I’m actually not sure why I don’t get parameter sniffing for this. I always pass a very recent date so I would have thought it may have preferred a scan but I’m fine with it choosing the above plan given the circumstances. There’s a million+ rows and it takes about 150ms.

Incidentally this is a SQL Azure database with 2vCores. Parameter sniffing is enabled and parameterization is set to simple.

If I change the query and run my application using OPTION (RECOMPILE) I do get the desired SEEK and a very good performance of just a few ms. The recompile time seems to be negligible and frankly this is perfectly fine performance I can use.

When I look in query store I can verify OPTION RECOMPILE uses the seek for a recent date, and scan for an old date! Awesome.

However, and I’ve never tried this before – I thought how about improving it even further with OPTION (OPTIMIZE FOR @p0 = '4/1/2021').

I was expecting this to also use the seek, but without the need for recompilation every time. I’d just periodically change the date passed to OPTIMIZE FOR – maybe to the beginning of the previous month.

However, this is the query in the query store.

Date index SEEK is chosen when using OPTION RECOMPILE but not with OPTION OPTIMIZE FOR

And it goes and does a full scan of all 1+ million rows when setting the date parameter to 4/7/21!

So now I’m lost. I’ve tried to read about everything I can on the subject but haven’t come across this issue. RECOMPILE works, but OPTIMIZE FOR doesn’t seem to do anything when I’m expecting it to effectively simulate running the query in SSMS with hardcoded values.

Query plans

This first plan is the only unexpected plan – it’s a scan and I want a seek.

OPTIMIZE FOR @p1 = ‘2021/4/1’ – https://www.brentozar.com/pastetheplan/?id=H1JB43AUu
OPTIMIZE FOR BOTH PARAMS – https://www.brentozar.com/pastetheplan/?id=rkV9U3AUu
OPTION RECOMPILE – https://www.brentozar.com/pastetheplan/?id=SJ5cS3CUd

These are to prove that the optimizer knows that recent dates should be a seek!

HARDCODED 2013 – SCAN – https://www.brentozar.com/pastetheplan/?id=BkeA42RLu
HARDCODED 2015 – SEEK – https://www.brentozar.com/pastetheplan/?id=S1c8r3R8O

I’m starting to wonder if this version doesn’t support OPTIMIZE FOR, even though I can’t find anything saying it wouldn’t


Edit: (After Paul’s answer)

I tried a few additional things. First here is the VIEW definition I didn’t include before. This does a JOIN and since it uses SCHEMABINDING the optimizer is able to substitute for it:

CREATE VIEW [dbo].[Helper_PackageVerification] WITH SCHEMABINDING AS

SELECT

— Package Verification columns
[t0].PackageVerificationId,
[t0].Verfied, — spelling mistake from long ago!
[t0].VerifyDate, — this is non nullable in [t0] btw
[t0].Username,

— Package columns
[t1].PackageId,
[t1].PackageStatus,
[t1].PackedOnDate

FROM [dbo].[PackageVerification] AS [t0] INNER JOIN [dbo].[Package] AS [t1] ON [t1].[PackageId] = [t0].[PackageId]

WHERE (Verfied = 1 AND VerifyDate IS NOT NULL AND PackageStatus <> 99)
GO

The CLUSTERED index is on PackageVerificationId and the main NON CLUSTERED index is shown above. I actually created half a dozen convering indexes to see which it would pick.

  1. I hardcoded PackageStatus <> 99. It was originally a parameter.

  2. I tried adding NOT NULL to the filter on the view to see what would happen. That did actually give me a SEEK, but a useless one since the SEEK predicate was actually on VerifyDate IS NOT NULL.

https://www.brentozar.com/pastetheplan/?id=r1HlgF1Dd

You can’t add a filtered index to an indexed view, so even though the view filters out NOT NULL dates it probably can’t be matched against. So that could be the ultimate reason I was unable to get my date to be used for the SEEK predicate?

  1. I didn’t in this case try using the helper index directly in the query, but I’d pretty much expect that to work with NOEXPAND as I’m doing that elsewhere.

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

Using OPTIMIZE FOR is not the same as OPTION (RECOMPILE). The former uses supplied parameter values in cardinality estimation for a plan that might be reused with other parameter values. The recompile option embeds the runtime parameter value, and produces a disposable plan that will never be reused.

The OPTIMIZE FOR plan therefore needs to ensure correct operation for all possible values. The recompile plan can use additional optimizations that are only valid for the present value. It can also use optimizations that only work with literal values e.g. pushing a filter past a window function.

This matters in your case because when the OPTIMIZE FOR plan matches the indexed view, it adds additional IS NOT NULL residual predicates on VerifyDate and PackageStatus:

Date index SEEK is chosen when using OPTION RECOMPILE but not with OPTION OPTIMIZE FOR

The recompile plan can remove this logic because the supplied values are known to be not null. The presence of these extra implied predicates are enough to prevent index matching for a seek. It’s usually best to ensure source columns are constrained to be not null, or explicitly rejected in the indexed view definition to minimize this sort of thing.

Now, the optimizer has a wide variety of plan choices for your queries. One indication of that is the number of statistics objects loaded – 17. Small differences in the path taken through the optimizer can produce different outcomes.

Automatic indexed view matching is a neat feature, technically, but it does have limitations. SQL Server needs to add some things and apply particular rewrites to achieve matching, which can have unexpected side-effects (note the reversed @p1 predicate above). The post-match plan is also not always completely cleaned up to match what a query written against the view would produce. These are not bugs, just implementation details.

I normally advise people to write queries against the view directly and specify a NOEXPAND hint, where this is practical. You might well find that testing your queries written that way would produce the outcomes you are looking for.

Related articles I have written:

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply