We have a machine with 256 GB RAM.
Max server memory for the SQL Server is set to 180 GB.
Out of 180 GB, SQL Server typically uses:
- Database Cache Memory – ~140–150 GB
- Plan Cache – ~10 GB
- Stolen Server Memory – ~30 GB
- Free Memory – ~9 GB
- Granted Workspace Memory – usually quite low; peaks of 0.5–1 GB
Buffer Cache hit ratio – hovers above 99.9% all the time.
Page Life Expectancy – a consistently high number.
Total Size of Database Data Files – 650 GB.
Rate of data growth is about 500-1500 MB daily (but! older data is deleted every 6-8 months, so basically data files grow much slower than that).
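For reference, figures like the ones above map to standard performance counters exposed through `sys.dm_os_performance_counters`; a query along these lines (counter names as shipped with SQL Server) should return the same numbers, so the baseline can be captured before and after any change:

```sql
-- Memory breakdown (Memory Manager counters, reported in KB)
SELECT counter_name, cntr_value / 1024 AS value_mb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN ('Database Cache Memory (KB)',
                       'Stolen Server Memory (KB)',
                       'Free Memory (KB)',
                       'Granted Workspace Memory (KB)');

-- Page Life Expectancy (seconds) and Buffer cache hit ratio
-- (hit ratio % = 'Buffer cache hit ratio' / 'Buffer cache hit ratio base' * 100)
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Buffer cache hit ratio',
                       'Buffer cache hit ratio base');
```

Logging these on a schedule gives a before/after record rather than a one-off eyeball check.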
There is a requirement to migrate SQL Server to another machine. Targeting SQL Server 2022 when it comes out.
It is mixed OLTP and OLAP-type workload, same databases used by many applications; most of RAM is used by cached database pages, meaning SQL Server does not have to read it from disk all the time.
It feels like 128 GB for the new machine will be pretty much enough, with Max Server Memory set to ~110 GB, leaving ~18 GB for the OS.
The target server will be an Azure VM, and there you can't scale vCPU and RAM independently of each other. A machine with 256 GB RAM will have twice as many cores as a 128 GB machine, resulting in a pretty big cost difference. Reducing the cost compared with what was projected initially may benefit me in the long run, so I think it is worth exploring. In Azure you can scale up anytime if needed.
If you were me, how would you prove scientifically to your manager that cutting memory in half will not kill SQL Server performance, blow up the Buffer Cache hit ratio, or anything like that?
I know that for a DBA, it may look safe to downsize this machine to 128 GB given that workload is not going to change.
But how would you convince a manager based on your experience?
How to solve:
This is a hard question to answer/prove without knowing a lot more about your workload than provided.
But, I think an easy way to check before committing to a smaller server would be to just change the Max Memory setting on your current production instance and see what the performance looks like.
Nothing you have described seems like a show-stopper. It could perform just fine with lower memory (just with more disk activity), or it could crash and burn. (Or crash and burn only when the monthly reports are run; hard to say.)
If I were faced with this project, I would get the business to buy into the plan, then change the Max Memory setting and see what happens. Reduce it by set amounts and wait before moving on. Watch it like a hawk, ready to move the number back up if needed, but it's an online operation and quick to change.
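The step-down itself can be scripted: `max server memory` is an advanced option changed via `sp_configure`, and it takes effect online without a restart. A sketch, assuming a hypothetical first step from 180 GB down to 160 GB (163840 MB); pick your own step size:

```sql
-- Enable advanced options so 'max server memory (MB)' is visible
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- First step: 180 GB -> 160 GB (value is in MB); repeat with smaller
-- values once the counters have been stable for long enough to cover
-- your heaviest workload (e.g. the monthly reports)
EXEC sp_configure 'max server memory (MB)', 163840;
RECONFIGURE;
```

If Page Life Expectancy or the hit ratio deteriorates at any step, setting the value back up is the same online operation, so the experiment is cheap to reverse.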