UDF and built-in function behavior

All we need is an easy explanation of the problem, so here it is.

Take the following queries:

DECLARE @X VARCHAR(200) = '1,2,3,4'

SELECT
    *,
    dbo.aUserDefinedScalarFunction(4) AS ScalarValue
FROM
   MyTable T
   INNER JOIN dbo.aUserDefineTableFunction(@X) A ON T.SomeID=A.SomeID
WHERE 
  (T.ID1 IS NULL OR T.ID1 IN (SELECT [value] FROM STRING_SPLIT(@X,',')))
  AND
  (T.ID2 IS NULL OR T.ID2 IN (SELECT Value FROM dbo.MySplitterFunction(@X))

I normally create indexed #tempTables for the WHERE conditions above, I have found that to perform better on large datasets. However, I still can’t find definitive answers to the following questions:

  1. Will the query analyzer optimize aUserDefinedScalarFunction(4) As ScalarValue or does it get evaluated for each record?

  2. Will the INNER JOIN dbo.aUserDefineTableFunction(@X) get materialized into a temp table once or will it be executed for each record? The function returns table (not a table variable).

  3. Does the result of SELECT [value] FROM STRING_SPLIT(@X,’,’) get optimized or is it evaluated for each comparison?

  4. Does the result of SELECT Value FROM dbo.MySplitterFunction(@X) get optimized or is it evaluated during each comparison?

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

  1. This depends on the function. For a deterministic function specified with schemabinding, the optimizer will usually be able to evaluate its result just once, caching the answer.

    For details, see:

  2. Inline table-valued functions are expanded into the calling query text before optimization, so there is no special opportunity for early evaluation and caching. That does not mean the execution plan won’t feature any cache-and-reuse optimizations. Where applicable, the optimizer will consider those (e.g. a spool) just as it would for a query without an inline function.

  3. Generally, the optimizer will be able to tell that SPLIT_STRING on a constant input returns deterministic results. The plan is likely to evaluate the split just once, but it does depend on plan shape.

  4. Assuming the splitter function is a multi-statement table-valued function, the result is very likely to be evaluated once. For details, and how you can tell if caching occurs from the execution plan, see:

All that said, you are right to be cautious about all this. The optimizer will frequently be able to evaluate expressions once and reuse the result, but this is not guaranteed, and often requires expert analysis to determine.

Forcing materialization using temporary tables is frequently the most robust solution since it guarantees a particular order and number of evaluations.

Method 2

Every row will evaluate the functions in the where clause.
Research SARGABLE query.
https://www.sqlshack.com/how-to-use-sargable-expressions-in-t-sql-queries-performance-advantages-and-examples/

DECLARE @X VARCHAR(200) = '1,2,3,4'

DECLARE @StringElements TABLE (Snippet VARCHAR(200))
INSERT INTO @StringElements (Snippet)
(SELECT [value] FROM STRING_SPLIT(@X,','))

DECLARE @XBits TABLE (Snippet VARCHAR(200))
INSERT INTO @XBits 
SELECT Value FROM dbo.MySplitterFunction(@X)

DECLARE @ScalarValue INT = dbo.aUserDefinedScalarFunction(4) --AS ScalarValue


SELECT
    *,
    @ScalarValue AS ScalarValue
FROM
   MyTable T
   INNER JOIN dbo.aUserDefineTableFunction(@X) A ON T.SomeID=A.SomeID
WHERE 
  (T.ID1 IS NULL OR T.ID1 IN (SELECT snippet [value] FROM @StringElements))
  AND
  (T.ID2 IS NULL OR T.ID2 IN (SELECT Snippet [Value] FROM @Xbits))

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply