Select distinct column values from multiple databases


I have 100 databases from customers who all have the same schema but different content.

Now I want to do some analysis, starting with a DISTINCT on one column across all databases. My instance contains these databases and more.

I think it is close to

EXEC sp_MSforeachdb 'Use ? select distinct [ColumnName] from [TableName]'

However, that returns a separate result set per database instead of one combined result; wrapping the call in an outer SELECT does not work either.

How to solve:


Method 1

There were some issues with an earlier version of another answer, so I’m going to put this here, as a slightly different version of the same thing:

  • String aggregation via a variable should be avoided; use STRING_AGG (SQL Server 2017 and later) or FOR XML instead.

  • You must use QUOTENAME to quote database names, in case there are characters that need quoting.

    For example, consider what happens if there is a database called My]Database, or My]..SomeTable;DROP DATABASE OtherDatabase; --.
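To make the risk concrete, here is a small sketch using the hostile name from the example above. The table name dbo.TableName is only a placeholder, and the two SELECTs just build and display the strings; nothing is executed.

```sql
-- Hypothetical hostile database name, used only for illustration.
DECLARE @name sysname = N'My]..SomeTable;DROP DATABASE OtherDatabase; --';

-- Unsafe: the first ] closes the identifier early, so everything after it
-- would be parsed as SQL if this string were executed.
SELECT N'SELECT * FROM [' + @name + N'].dbo.TableName' AS UnsafeSql;

-- Safe: QUOTENAME wraps the name in brackets and doubles every ] inside it,
-- so the whole string remains a single identifier.
SELECT N'SELECT * FROM ' + QUOTENAME(@name) + N'.dbo.TableName' AS SafeSql;

-- The simpler case from the text:
SELECT QUOTENAME(N'My]Database');  -- returns [My]]Database]
```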

Solution:

-- Build one SELECT per database and join them with UNION,
-- which also removes duplicate values across databases.
DECLARE @sql nvarchar(max) =
(
    SELECT STRING_AGG(CAST(N'
SELECT [ColumnName]
FROM ' + QUOTENAME(D.[name]) + N'.SchemaName.TableName
'
        AS nvarchar(max)), N'
UNION
')
    FROM sys.databases AS D
);

PRINT @sql;  -- for testing

EXEC sp_executesql @sql;
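On versions without STRING_AGG (before SQL Server 2017), the FOR XML route mentioned above could look roughly like the sketch below. SchemaName, TableName and ColumnName are the same placeholders as in Method 1. The filter on database_id is my own assumption: it skips the four system databases, which would not contain the table.

```sql
-- Sketch only: FOR XML PATH concatenation instead of STRING_AGG.
DECLARE @sql nvarchar(max);

SELECT @sql = STUFF(
    (
        SELECT N' UNION SELECT [ColumnName] FROM '
               + QUOTENAME(D.[name]) + N'.SchemaName.TableName'
        FROM sys.databases AS D
        WHERE D.database_id > 4   -- assumption: skip master, tempdb, model, msdb
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'),  -- read the XML back as plain text
    1, 7, N'');                     -- strip the leading ' UNION ' (7 characters)

PRINT @sql;  -- for testing

EXEC sp_executesql @sql;
```

Going through TYPE and .value() keeps characters such as & or < in database names from being returned as XML entities.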

Method 2

What you’re looking for is the UNION operator, though I don’t think you can use this with the sp_MSforeachdb procedure. The UNION operator automatically removes duplicates for you.
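As a quick illustration of that deduplication, the sketch below uses two inline VALUES lists as stand-ins for two databases (db1 and db2 are made-up aliases, not real objects):

```sql
-- UNION removes duplicates across the combined result.
SELECT v FROM (VALUES (N'A'), (N'B')) AS db1(v)
UNION
SELECT v FROM (VALUES (N'B'), (N'C')) AS db2(v);
-- Returns three rows: A, B, C (row order is not guaranteed).

-- UNION ALL keeps every row, including duplicates.
SELECT v FROM (VALUES (N'A'), (N'B')) AS db1(v)
UNION ALL
SELECT v FROM (VALUES (N'B'), (N'C')) AS db2(v);
-- Returns four rows; B appears twice.
```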

Easiest solution might be to pre-create a #TempTable and leverage that in your call to sp_MSforeachdb like so:

CREATE TABLE #TempTableName (ColumnName DataType);  -- replace DataType with the column's actual type
EXEC sp_MSforeachdb 'USE [?]; INSERT INTO #TempTableName SELECT [ColumnName] FROM [TableName]';

SELECT DISTINCT ColumnName
FROM #TempTableName;

Method 3

This is the code I ended up using:

IF OBJECT_ID(N'tempdb..#TempTableName') IS NOT NULL
    BEGIN
        DROP TABLE #TempTableName
    END
GO

CREATE TABLE #TempTableName (C1 nvarchar(2))

-- Skip the system databases; they do not contain the table.
EXEC sp_MSforeachdb  'IF ''?''  NOT IN (''tempdb'',''model'',''msdb'',''master'')
BEGIN
 INSERT INTO #TempTableName SELECT DISTINCT [SOMEFIELD] FROM [?].[dbo].[MYTABLE]
END
'

-- Each database inserts a value at most once, so the count is the
-- number of databases in which that value occurs.
SELECT C1, COUNT(C1) AS DatabaseCount FROM #TempTableName
GROUP BY C1
ORDER BY COUNT(C1)


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
