Test to see when a table is a candidate for compression (row or page)


Does anyone know a tool like this page https://columnscore.com/ where one can determine if a table is a good candidate for row or page compression?

Also I am trying to understand the benefits of compression. I have been reading about it, but still have these questions.

  1. Will compression make more 8KB pages fit in the buffer cache? When does data become uncompressed, e.g. in a SELECT?
  2. Will backups become faster?
  3. Is the reason that compression adds CPU overhead because e.g. inserts need to be compressed before the insert?
  4. Is there any good reason to compress tables without investigating how the tables are being used (i.e. static vs. volatile)?

A colleague has started to compress many tables (page). I don't know the analysis behind this decision. I know we sometimes have RESOURCE_SEMAPHORE waits during some peak hours. I actually don't think he knows what he is doing. I can see that our SOS_SCHEDULER_YIELD waits have gone slightly up.

I think the compression is doing more harm than good, and maintenance tasks seem to overrun more now (e.g. backups, CHECKDB…).

How to solve :

Take a deep breath and look at the explanation of your problem. There are two answers below; we recommend starting with Method 1, as it is the most thorough.

Method 1

Does anyone know a tool like this page https://columnscore.com/ where one can determine if a table is a good candidate for row or page compression?

Tool recommendations are off-topic here, primarily because there can be any number of answers, and they tend to go out-of-date quickly. That said, SQL Server does provide a built-in procedure to estimate space savings from compression, sp_estimate_data_compression_savings. It does not attempt to predict whether compression will be a good option for the workload as a whole, just whether space savings can be expected or not. It is essentially a wrapper around TABLESAMPLE, creating a new object and extrapolating full size from there.
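As a quick illustration, the procedure is called per table; the schema and table names below are placeholders to replace with your own:

    -- Estimate PAGE compression savings for one table.
    -- dbo.SomeTable is a placeholder; substitute your own schema and table.
    EXEC sys.sp_estimate_data_compression_savings
        @schema_name      = N'dbo',
        @object_name      = N'SomeTable',
        @index_id         = NULL,          -- NULL = all indexes on the table
        @partition_number = NULL,          -- NULL = all partitions
        @data_compression = N'PAGE';       -- use N'ROW' to estimate row compression instead

The output reports, per index and partition, the current size and the estimated size with the requested compression setting.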

Will compression make more 8KB pages fit in the buffer cache?

If compression is effective, then yes. Pages remain compressed in the buffer pool.
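One way to see this in practice is to compare a table's buffer pool footprint before and after compressing it. The query below is a common pattern for counting buffered pages per table in the current database (it assumes VIEW SERVER STATE permission, and is a sketch rather than anything specific to compression):

    -- Pages currently held in the buffer pool, grouped by table, for the current database.
    SELECT  OBJECT_NAME(p.object_id)  AS table_name,
            COUNT_BIG(*)              AS buffered_pages,
            COUNT_BIG(*) * 8 / 1024   AS buffered_mb
    FROM    sys.dm_os_buffer_descriptors AS bd
    JOIN    sys.allocation_units         AS au
            ON  au.allocation_unit_id = bd.allocation_unit_id
    JOIN    sys.partitions               AS p
            ON  (au.type IN (1, 3) AND au.container_id = p.hobt_id)
            OR  (au.type = 2       AND au.container_id = p.partition_id)
    WHERE   bd.database_id = DB_ID()
    GROUP BY p.object_id
    ORDER BY buffered_pages DESC;

If ROW or PAGE compression is effective, the same data should occupy noticeably fewer buffered pages after the rebuild.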

When does data become uncompressed, e.g. in a SELECT?

Data is decompressed when a SQL Server component above the level of the storage engine needs to work with the data. For example, an Index Scan will present decompressed data to a Filter operator in an execution plan.


The stack trace below shows one example of decompression before presenting data to the query processor (storage engine code highlighted):

(Stack trace image from the original answer, not reproduced here.)

You can read more about this in Compression Strategies by Sunil Agarwal (part of the SQL Server engineering team at the time).

Will backups become faster?

Many people have reported that they do. The intuition is that compressed data on persistent storage requires fewer I/Os to read for backup. If your instance is typically CPU-bound, you may see slower backups as the extra processor work needed for decompression outweighs the I/O speedup.

Is the reason that compression adds CPU overhead because e.g. inserts need to be compressed before the insert?

Certainly compression adds CPU overhead, but this is typically manageable because the data compression algorithms were chosen for efficiency rather than ultimate size. Note that data compression and backup compression are very different. Backup compression is based on a variant of gzip compression, and is intended to achieve very good size reduction at the expense of more CPU usage.
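If backup size or duration is the real concern, backup compression can be enabled per backup, independently of ROW or PAGE data compression. A minimal sketch, with a placeholder database name and path:

    -- Backup compression is separate from ROW/PAGE data compression.
    -- YourDatabase and the disk path are placeholders.
    BACKUP DATABASE YourDatabase
    TO DISK = N'X:\Backups\YourDatabase.bak'
    WITH COMPRESSION, CHECKSUM, STATS = 10;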

Every insert and update will incur at least some data compression overhead. This is more noticeable with PAGE compression. SQL Server may also attempt to recompress an entire page after a certain number of modifications have been made to that page.
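To gauge how much of that recompression work is actually happening, sys.dm_db_index_operational_stats exposes page-compression attempt and success counters. A rough sketch for the current database:

    -- PAGE compression attempt/success counters per index and partition.
    SELECT  OBJECT_NAME(ios.object_id)          AS table_name,
            ios.index_id,
            ios.partition_number,
            ios.page_compression_attempt_count,
            ios.page_compression_success_count
    FROM    sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
    WHERE   ios.page_compression_attempt_count > 0
    ORDER BY ios.page_compression_attempt_count DESC;

A low success ratio suggests pages that pay the CPU cost of a compression attempt without ending up smaller.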

Is there any good reason to compress tables without investigating how the tables are being used (i.e. static vs. volatile)?

This is subjective, and depends on your local priorities. In general, one might choose to compress without detailed testing where it is clear substantial (and necessary) space savings will be made, and the system has plenty of spare processor time available.
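Where you do decide to compress, it is applied per table or index with a rebuild. A minimal sketch (the object names are placeholders, and the rebuild itself costs time, space, and transaction log):

    -- Apply PAGE compression to a table (all partitions) and ROW compression to one index.
    -- dbo.SomeTable and IX_SomeTable_SomeColumn are placeholder names.
    ALTER TABLE dbo.SomeTable
        REBUILD PARTITION = ALL
        WITH (DATA_COMPRESSION = PAGE);

    ALTER INDEX IX_SomeTable_SomeColumn ON dbo.SomeTable
        REBUILD
        WITH (DATA_COMPRESSION = ROW);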

A colleague has started to compress many tables (page). I don’t know the analysis behind this decision.

You should discuss the rationale with your colleague.
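Before that conversation, it may help to inventory what has already been compressed. A sketch for the current database, listing every partition whose compression setting is not NONE:

    -- Current compression setting per table, index, and partition in this database.
    SELECT  s.name                  AS schema_name,
            o.name                  AS table_name,
            i.name                  AS index_name,
            p.partition_number,
            p.data_compression_desc
    FROM    sys.partitions AS p
    JOIN    sys.objects    AS o ON o.object_id = p.object_id
    JOIN    sys.schemas    AS s ON s.schema_id = o.schema_id
    LEFT JOIN sys.indexes  AS i ON i.object_id = p.object_id AND i.index_id = p.index_id
    WHERE   p.data_compression_desc <> 'NONE'
    AND     o.is_ms_shipped = 0
    ORDER BY s.name, o.name, p.index_id, p.partition_number;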

Method 2

For your first question there is a system stored procedure you can use:

sp_estimate_data_compression_savings

https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-estimate-data-compression-savings-transact-sql?view=sql-server-ver15

For your other questions:

  1. Because the data is compressed, you can fit more information into the same amount of memory; the data is decompressed when it is exchanged with the application. See https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/data-compression?view=sql-server-ver15
  2. No, probably not.
  3. Yes.
  4. That is a leading question: how many things can you think of that you should do without investigating?

You should probably talk to your colleague if you want to find out why they have done something; there could be many reasons why some tasks are overrunning.

Note: Use and implement Method 1, as it is the more thorough and well-tested answer. Thank you 🙂

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
