How to make large “in” queries in C# for an indexed field in MSSQL?

In MSSQL, how can I run this query efficiently:

select id from table where field in (gigantic list of ordered longs)

Table schema:

table:
{
  id primary key
  field long
} index findex (field)

I think this should be really fast (since it’s basically just stepping through two ordered lists of longs). Is there a way to continuously stream the IDs in?

I can do it from a console or using Linqpad/SSMS.

Current (slow) solution: a console app that constructs many long queries, each with 3k fieldId arguments. The limit on the number of arguments is the problem; how can I get around it?

How to solve:

Method 1

select id from table where field in (gigantic list of ordered longs)

Queries with long IN lists are slow because the whole statement, including every literal in the list, is parsed and compiled on each call.

Either pass the list as a JSON array:

select id from table where field in (select value from openjson(@values))

Or use a table-valued parameter (TVP), or bulk load a temp table with the values. Either way, make sure the table holding the values has an index or primary key. The OPENJSON method will likely be slower, but the upside is that it works with all client drivers. TVPs and bulk insert are not available in every client driver, although the .NET and Java drivers support them.
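
As a rough illustration of the two options above, here is a minimal T-SQL sketch. The names dbo.MyTable, dbo.BigIntList and dbo.GetIdsByField are hypothetical placeholders, not objects from the question, and the typed OPENJSON form assumes SQL Server 2016 or later with database compatibility level 130 or higher.

```sql
-- Option A: JSON array parameter, with each element read as BIGINT.
-- @values would normally arrive as a query parameter from the client.
DECLARE @values NVARCHAR(MAX) = N'[10, 20, 30]';

SELECT t.id
FROM dbo.MyTable AS t              -- hypothetical table name
WHERE t.field IN (
    SELECT j.v
    FROM OPENJSON(@values) WITH (v BIGINT '$') AS j
);
GO  -- batch separator understood by SSMS and sqlcmd

-- Option B: table-valued parameter.
-- The primary key gives the optimizer an ordered, unique list to join on.
CREATE TYPE dbo.BigIntList AS TABLE (Value BIGINT NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.GetIdsByField
    @Fields dbo.BigIntList READONLY   -- TVPs must be declared READONLY
AS
BEGIN
    SET NOCOUNT ON;

    SELECT t.id
    FROM dbo.MyTable AS t
    INNER JOIN @Fields AS f
        ON t.field = f.Value;
END
GO
```

From C#, a parameter like @Fields is typically sent as a SqlParameter with SqlDbType.Structured and TypeName set to the table type's name, with a DataTable or a sequence of SqlDataRecord as its value.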

Method 2

A query like that is effectively evaluated as
WHERE field = x1 OR field = x2 OR field = x3 ...
which, as you're seeing, can be very expensive. (Sometimes the optimizer can use an index, if one is available, and will rewrite the query.)

To rewrite the query, dump the values you want into a temp table and then join against it. If your version of SQL Server supports STRING_SPLIT (SQL Server 2016 or later), you can use something like the procedure below, which takes the long list as a single comma-separated parameter, loads it into a temp table, and then joins that temp table to your original table.

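CREATE PROC MyProc @LongList VARCHAR(MAX)
AS

-- Temp table for the distinct values to look up; BIGINT matches the "long" column in the question
DROP TABLE IF EXISTS #X
CREATE TABLE #X (Field BIGINT PRIMARY KEY CLUSTERED)

-- Split the comma-separated list and convert each item to BIGINT
INSERT INTO #X (Field)
SELECT DISTINCT CAST(Value AS BIGINT) AS Field FROM string_split(@LongList,',')

-- Join the original table to the temp table instead of using a long IN list
SELECT
T.*
FROM
MyTable T
INNER JOIN #X X
ON T.Field = X.Field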
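
For illustration, calling the procedure above might look like the following. The values are made up, and in practice the list would be supplied by the client as a single string parameter rather than a literal.

```sql
-- Pass the whole list as one comma-separated string (illustrative values)
EXEC MyProc @LongList = '1001,1002,1003';
```

Because the list travels as one VARCHAR(MAX) parameter, it is not subject to the limit on the number of query parameters that the question runs into.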


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
