All we need is an easy explanation of the problem, so here it is.
I just wanted to ask a clarification about something on the MSDN from the “Optimizing MERGE Statement Performance” page.
I am working with a data warehouse that takes in records from many different databases and stores the data. All of the tables in my warehouse database basically follow this same pattern:
CREATE TABLE Foo ( database_guid UNIQUEIDENTIFIER ,FooPk BIGINT ,Bar NVARCHAR(20) ,Qix NCHAR(10) ,CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED ( database_guid ASC ,FooPk ASC ) ) GO CREATE PROCEDURE [iv].[LoadSomeTable] @databaseGUID UNIQUEIDENTIFIER AS BEGIN SET NOCOUNT ON MERGE Foo USING #FooStaging AS Source ON Foo.FooPk = Source.FooPk AND Foo.database_guid = @databaseGUID WHEN MATCHED THEN UPDATE SET Bar = Source.Bar ,Qix = Source.Qix WHEN NOT MATCHED THEN INSERT (database_guid, FooPk, Bar, Qix) VALUES (@databaseGUID, FooPk, Bar, Qix); END GO CREATE TABLE #FooStaging ( FooPk BIGINT ,Bar NVARCHAR(20) ,Qix NCHAR(10) ) --Data gets loaded in to #FooStaging from a C# call to SqlBulkCopy then calls iv.LoadSomeTable
The thing I am now concerned about is I just read this statement from that MSDN page
Specify only search conditions in the ON
clause that determine the criteria for matching data in the source and
target tables. That is, specify only columns from the target table
that are compared to the corresponding columns of the source table. Do
not include comparisons to other values such as a constant.
After reading that I am thinking I did my query wrong and my merge statement should be
MERGE Foo USING #FooStaging AS Source ON Foo.FooPk = Source.FooPk WHEN MATCHED AND Foo.database_guid = @databaseGUID THEN UPDATE SET Bar = Source.Bar ,Qix = Source.Qix WHEN NOT MATCHED THEN INSERT (database_guid, FooPk, Bar, Qix) VALUES (@databaseGUID, FooPk, Bar, Qix);
But that does not “feel” right to me because the
database_guid field is part of the primary key so shouldn’t it be included with the
on? If I had it in the
WHEN MATCHED and I upload one database with a
1 then I upload a second database with a
FooPk and a different
@databaseGUID I am not sure if
NOT MATCHED will fire or not (Just tested, it does not).
Which way is the correct way to use MERGE?
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
I would venture that your best approach is to use separate statements for each of the potential actions, and put them in a serializable transaction. You get to use tried and true statements with no funny semantics or “best practices” violations, and you get to avoid all of the problems I outline in this post, including wrong results bugs and potential index corruption:
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂