We are getting ready to upgrade our SQL Server 2014 systems to SQL Server 2019. As part of our due diligence we created a workload which we are testing against both systems. What we have observed is the following:
- The query performance on average is about 50% faster on SQL Server 2019 – good!
- Reads are about 40% lower on SQL Server 2019 – good!
- The CPU utilization is about 30% higher on average – not good.
This last point is what concerns us. Does this mean that we have to plan to increase our CPU capacity as part of our migration to SQL Server 2019?
To describe what we are seeing from a slightly different angle: when we attempt to ramp up our workload by pushing a higher number of queries/sec, we see lower throughput on SQL Server 2019 because we max out CPU earlier and start seeing errors as a result.
I hope this makes sense, and I wonder whether others have had a similar experience.
How to solve:
Firstly: is CPU use really higher for the same amount of work? If "about 50% faster" means work completes at 1.5 times the previous rate, then 50% more work is done in any given period, and the same work finishes in roughly 33% less time. Using 30% more CPU over that shorter period therefore works out to about the same amount of CPU for the same amount of work (slightly less, in fact: 1.3 / 1.5 ≈ 0.87), just compressed into a shorter period. The fewer page accesses could explain this: the CPU spends less time waiting for IO to complete between bits of work.
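As a rough check of that arithmetic, here is a minimal Python sketch. The function name and the two readings of "50% faster" are illustrative assumptions, not measurements from the workload in question.

```python
def relative_cpu_per_work(speedup: float, utilisation_ratio: float) -> float:
    """CPU consumed per unit of work on the new system, relative to the old one.

    speedup: how many times faster the same work completes (1.5 = "50% faster").
    utilisation_ratio: average CPU utilisation, new / old (1.3 = "30% higher").
    A result below 1.0 means less CPU is spent per unit of work than before.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    # The same work now occupies this fraction of the original elapsed time.
    elapsed_ratio = 1.0 / speedup
    return utilisation_ratio * elapsed_ratio


# Assumption A: "50% faster" means 1.5x the rate of work.
ratio_rate = relative_cpu_per_work(speedup=1.5, utilisation_ratio=1.3)
# Assumption B: "50% faster" means half the elapsed time, i.e. 2x the rate.
ratio_half = relative_cpu_per_work(speedup=2.0, utilisation_ratio=1.3)

print(f"CPU per unit of work, reading A: {ratio_rate:.2f}x")
print(f"CPU per unit of work, reading B: {ratio_half:.2f}x")
```

Under either reading the ratio comes out below 1.0, which is why higher average utilisation on its own does not show that the newer version needs more CPU for the same work.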
Of course this depends on how you are measuring CPU utilisation – remember that for modern CPUs some readings can be significantly inaccurate (see https://aaron-margosis.medium.com/task-managers-cpu-numbers-are-all-but-meaningless-2d165b421e43 amongst other similar articles). We need more details about how you are making measurements to give truly helpful answers.
> we are seeing lower throughput on SQL 2019 because we max out CPU earlier
Again, we need to know how you are measuring throughput. That said, as noted above, you may be maxing out the CPU earlier simply because less IO is needed.
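One way to make the two systems comparable is to normalise CPU by the work actually completed in the same test window, rather than comparing utilisation percentages directly. The sketch below assumes you can collect average utilisation, logical core count, and completed queries for a fixed interval. The function name and all the figures are made up for illustration.

```python
def cpu_seconds_per_query(avg_utilisation_pct: float,
                          logical_cores: int,
                          interval_seconds: float,
                          queries_completed: int) -> float:
    """Approximate CPU-seconds consumed per completed query over one interval."""
    if queries_completed <= 0:
        raise ValueError("queries_completed must be positive")
    # Total CPU-seconds burned across all cores during the interval.
    cpu_seconds = (avg_utilisation_pct / 100.0) * logical_cores * interval_seconds
    return cpu_seconds / queries_completed


# Hypothetical one-minute samples from each server (not real measurements).
old_cost = cpu_seconds_per_query(50.0, 16, 60.0, 12_000)
new_cost = cpu_seconds_per_query(65.0, 16, 60.0, 18_000)

print(f"old: {old_cost:.4f} CPU-s/query, new: {new_cost:.4f} CPU-s/query")
```

If the per-query figure is flat or lower on the newer version, the higher utilisation reflects more work being done per second, not more expensive queries. If it is clearly higher, plan regressions become the more likely suspect.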
Also: have you compared the query plans used between the runs on the different versions? It is possible that some of the queries being submitted are getting less optimal plans under the updated engine – changes in things like cardinality estimation are usually beneficial, but because the estimates are only guesstimates they can backfire.
> and start seeing errors as a result
Never simply report "I got an error". What errors are you seeing?