All we need is an easy explanation of the problem, so here it is.
I work with SQL Server and face an issue that a query is very slow.
INSERT INTO [ExtractReports].[dbo].[TradeCodesDelete] (PartID,
Code,
CodeTypeID,
SourceTypeID,
RevisionID,
ZPLID,
PartLevel,
CreatedDate,
FlagDelete)
----select count(1) from [ExtractReports].[dbo].[TradeCodesDelete]
---update [ExtractReports].[dbo].[TradeCodesDelete] set FlagDelete=2 where FlagDelete=1
SELECT DISTINCT
FT.PartId,
TN.Code,
FT.CodeTypeID,
FT.SourceTypeID,
FT.RevisionID,
fm.[Value],
FT.PartLevel,
GETDATE(),
1
FROM Parts.Nop_Part pt WITH (NOLOCK)
INNER JOIN Parts.Nop_PartsFamilyAttribute fm WITH (NOLOCK) ON pt.PartsFamilyID = fm.PartFamilyID
AND fm.[Key] = 20281007
INNER JOIN ExtractReports.dbo.TPls pl WITH (NOLOCK) ON pl.ZPLID = fm.Value
INNER JOIN Parts.TradeCodes FT WITH (NOLOCK) ON pt.PartID = FT.PartID
AND FT.PartLevel = 0
INNER JOIN ExtractReports.dbo.TPLNewData TN WITH (NOLOCK) ON TN.PartID = FT.PartID
AND TN.CodeTypeID = FT.CodeTypeID
LEFT JOIN [ExtractReports].[dbo].[TradeCodesDelete] d ON d.partid = FT.partid
AND d.codetypeid = FT.codetypeid
AND d.partlevel = 0
WHERE TN.Code <> FT.Code
AND FT.MappingDoneFlag = 0
AND d.partid IS NULL;
I check the execution plan for a reason of slow query performance.
This is my execution plan: https://www.brentozar.com/pastetheplan/?id=SJCzRrmht
- Index seek is high at 57% – how to reduce it?
- Hash match inner join is high at 40% – how to reduce it?
- Compute scalar is at 1% – how to reduce it?
Can anyone help me?
Table script as below :
CREATE TABLE [Parts].[Nop_PartsFamilyAttribute]
(
[PartFamilyAttributeID] [int] IDENTITY(1,1)
NOT FOR REPLICATION NOT NULL,
[PartFamilyID] [int] NOT NULL,
[Key] [int] NOT NULL,
[Value] [nvarchar](2200) NULL,
[CreatedDate] [datetime] NULL,
[CreatedBy] [int] NULL,
[ModifiedDate] [datetime] NULL,
[Modifiedby] [int] NULL,
[DeletedDate] [datetime] NULL,
[DeletedBy] [int] NULL,
CONSTRAINT [PK_Nop_PartsFamilyAttribute30]
PRIMARY KEY CLUSTERED ([PartFamilyAttributeID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [Customer],
CONSTRAINT [UK_PartFamilyID_Key30]
UNIQUE NONCLUSTERED ([PartFamilyID] ASC, [Key] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [Customer]
) ON [Customer]
what I try :
create index pf_idx on Parts.Nop_PartsFamilyAttribute([Key]) include ([Value])
but nothing changed
still index seek 57
and hash match 40
and compute scalar cost 1
so what I do to solve issue of Execution plan ?
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
Method 1
anti
Your query has a common anti-pattern in it, using a left join to find rows that don’t exist.
Your query currently looks like this:
<snip>
LEFT JOIN [ExtractReports].[dbo].[TradeCodesDelete] d ON d.partid = FT.partid
AND d.codetypeid = FT.codetypeid
AND d.partlevel = 0
WHERE
<snip>
AND d.partid IS NULL;
You’re not selecting any rows from d.
, which is sensible because assuming d.partid
is a non-nullable/key column, all the resulting columns would be NULL
.
This results in a "late filter" in the query plan, meaning all rows are fully joined together and then the predicate is applied later:
A useful rewrite is to use the NOT EXISTS
syntax for its intended purpose.
SELECT DISTINCT
FT.PartId,
TN.Code,
FT.CodeTypeID,
FT.SourceTypeID,
FT.RevisionID,
fm.[Value],
FT.PartLevel,
GETDATE(),
1
FROM Parts.Nop_Part pt WITH (NOLOCK)
INNER JOIN Parts.Nop_PartsFamilyAttribute fm WITH (NOLOCK) ON pt.PartsFamilyID = fm.PartFamilyID
AND fm.[Key] = 20281007
INNER JOIN ExtractReports.dbo.TPls pl WITH (NOLOCK) ON pl.ZPLID = fm.Value
INNER JOIN Parts.TradeCodes FT WITH (NOLOCK) ON pt.PartID = FT.PartID
AND FT.PartLevel = 0
INNER JOIN ExtractReports.dbo.TPLNewData TN WITH (NOLOCK) ON TN.PartID = FT.PartID
AND TN.CodeTypeID = FT.CodeTypeID
WHERE TN.Code <> FT.Code
AND FT.MappingDoneFlag = 0
AND NOT EXISTS
(
SELECT
1/0
FROM [ExtractReports].[dbo].[TradeCodesDelete] d
WHERE d.partid = FT.partid
AND d.codetypeid = FT.codetypeid
AND d.partlevel = 0
);
aside
I see you talking a lot about costs, and reducing them. Please don’t pay attention to cost. It’s a unit-less, estimated metric that has nothing to do with which parts of your query are slow and inefficient.
Note that when you get an actual execution plan, there are no "actual" costs. They remain estimates, even after SQL Server measures and reports on many other runtime (actual) metrics.
- Why You Should Stop Looking At Query Cost
- Is It Time To Remove Costs From Query Plans?
- Signs Your Execution Plan Is Lying To You: Operator Costs
The optimizer uses costs to choose a good enough plan quickly. Reducing them won’t necessarily get you a faster query. You can have expensive queries that run quickly, and cheap queries that run slowly.
Since you have SQL Server 2017 tagged here, I would suggest looking at operator times to figure out which parts are most inefficient.
Method 2
Create the following index:
create index pf_idx
on Parts.Nop_PartsFamilyAttribute([Key],PartFamilyID)
include ([Value])
The problem currently is that even though it is "seeking", it’s not doing so efficiently, plus it is estimating +4MM rows returned from those seeks into the hash-join, which basically is where all of the time is going. It appears that the Seek is only matching the [Key] value and then relying on the hash-join to do the PartFamilyID filtering, but at 4MM rows it’s a lot of overhead. The index that I suggested above should alleviate much of that by enabling it to more efficiently seek (or maybe even scan) the rows that it needs, matching both [Key] and PartFamilyID.
—-
As @MartinSmith pointed out, you have a much bigger problem with your second join which I missed:
INNER JOIN ExtractReports.dbo.TPls pl WITH (NOLOCK) ON pl.ZPLID = fm.Value
The problem here is that while pl.ZIPID
is an INT, fm.Value
is an NVARCHAR which makes the whole JOIN non-Sargeable.
The solution to this is as follows:
- Assuming that you cannot just change the
Nop_PartsFamilyAttribute.Value
column to INT, then add a persisted calculated column:
->
`ALTER Parts.Nop_PartsFamilyAttribute
ADD IntValue AS (Try_Parse([Value] AS INT)) PERSISTED;`
- Create the following index (instead of the one I proposed above).
->
create index pf_idx2
on Parts.Nop_PartsFamilyAttribute([Key],IntValue,PartFamilyID)
-
Change your second join to:
INNER JOIN ExtractReports.dbo.TPls pl WITH (NOLOCK) ON pl.ZPLID = fm.IntValue
-
Finally, change the column list of your query to this:
->
SELECT DISTINCT
FT.PartId,
TN.Code,
FT.CodeTypeID,
FT.SourceTypeID,
FT.RevisionID,
CAST(fm.IntValue AS Nvarchar(2000)) AS [Value],
FT.PartLevel,
GETDATE(),
1
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0