Risks associated with automatically shrinking TempDB as a SQL agent job?

TempDB on my server must not exceed 350 GB in size. I’m using 10 percent autogrowth on TempDB, as I’ve read that this is supposed to be best practice for databases exceeding 500 MB.

When I used

 dbcc checkdb with estimateonly 

on every database, I’ve noticed that the estimated requirement on TempDB is much lower than I currently have. Summing the values returned by dbcc checkdb with estimateonly gives a value of about 75 GB.
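
For reference, a minimal sketch of how that estimate could be collected for every online database. The variable and cursor names are illustrative, and the login is assumed to have permission to run DBCC CHECKDB. The estimates are reported per database and still have to be added up by hand. Note that ESTIMATEONLY reports the tempdb space DBCC CHECKDB itself would need, which is not necessarily what the regular workload needs.

```sql
-- Sketch: run DBCC CHECKDB ... WITH ESTIMATEONLY against every online database.
-- ESTIMATEONLY performs no actual checks; it only reports the estimated tempdb space.
DECLARE @db  sysname;
DECLARE @sql nvarchar(max);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.databases
    WHERE state_desc = N'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME with a single-quote delimiter turns the name into a safe string literal.
    SET @sql = N'DBCC CHECKDB (' + QUOTENAME(@db, '''') + N') WITH ESTIMATEONLY;';
    PRINT @sql;
    EXEC sys.sp_executesql @sql;

    FETCH NEXT FROM db_cursor INTO @db;
END;

CLOSE db_cursor;
DEALLOCATE db_cursor;
```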

TempDB’s size is currently 300 GB. I can’t increase the permitted size of TempDB any further.

I’ve heard that our company previously had a SQL Agent job that shrank TempDB automatically if it exceeded some value, but it is not used in our new environment. I am concerned, however, that being too liberal with shrinking TempDB may cause issues.

Using the information from this post, are there any major risks associated with automatically shrinking TempDB to, say, 200 GB whenever its current size exceeds 200 GB or so? dbcc checkdb with estimateonly returned an estimated minimum size of 75 GB, so a 200 GB target would still leave a rather large buffer.

I’ve read that shrinking TempDB is not good practice, but might it be more “legitimate” in this instance, given the estimated minimum size of TempDB?

EDIT

I’ve written the following query, which shows, for instance, internal object space (MB), deallocated internal object space (MB), the statement text (the query), total elapsed time, and is_user_process:

WITH task_space_usage AS (
    SELECT dmv_tsu.session_id,
           dmv_tsu.request_id,
           SUM(dmv_tsu.internal_objects_alloc_page_count) AS alloc_pages,
           SUM(dmv_tsu.internal_objects_dealloc_page_count) AS dealloc_pages,
           dmv_es.login_name,
           dmv_es.host_name,
           dmv_es.program_name,
           dmv_es.is_user_process,
           dmv_es.total_elapsed_time,
           er.start_time,
           er.estimated_completion_time,
           er.command
    FROM sys.dm_db_task_space_usage dmv_tsu WITH (NOLOCK)
    INNER JOIN sys.dm_exec_sessions dmv_es
        ON dmv_tsu.session_id = dmv_es.session_id
    INNER JOIN sys.dm_exec_requests er
        ON dmv_tsu.session_id = er.session_id
       AND dmv_tsu.request_id = er.request_id
    WHERE dmv_tsu.session_id <> @@SPID
    GROUP BY dmv_tsu.session_id, dmv_tsu.request_id,
             dmv_es.login_name, dmv_es.host_name, dmv_es.program_name,
             dmv_es.is_user_process, dmv_es.total_elapsed_time,
             er.start_time, er.estimated_completion_time, er.command
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128 AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       EST.text,
       ISNULL(
           NULLIF(
               SUBSTRING(
                 EST.text, 
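                 ERQ.statement_start_offset / 2 + 1, -- offsets are zero-based bytes; SUBSTRING is one-based characters
                 CASE WHEN ERQ.statement_end_offset < ERQ.statement_start_offset 
                  THEN 0 -- end offset of -1 means "end of batch"; fall back to the full text below
                 ELSE ( ERQ.statement_end_offset - ERQ.statement_start_offset ) / 2 + 1 END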
               ), ''
           ), EST.text
       ) AS [statement text],
       EQP.query_plan,
       TSU.host_name,
       TSU.login_name,
       TSU.program_name,
       TSU.is_user_process,
       TSU.total_elapsed_time,
       TSU.start_time,
       TSU.estimated_completion_time,
       TSU.command
FROM task_space_usage AS TSU
INNER JOIN sys.dm_exec_requests ERQ WITH (NOLOCK)
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
OUTER APPLY sys.dm_exec_query_plan(ERQ.plan_handle) AS EQP
WHERE EST.text IS NOT NULL OR EQP.query_plan IS NOT NULL
ORDER BY 10 DESC;

My question is:

Would it be more beneficial to combine a query like this with a SQL Agent job that does something like the following?

If TempDB > maxSize then run the query posted above and terminate all queries that are user processes

I know this is not code, but I mean something along the lines of this concept.
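
A rough sketch of what such a job step might look like in T-SQL, written as a dry run that only lists candidate sessions and the KILL command for each, without executing anything. The 200 GB threshold and all variable names are hypothetical. Keep in mind that killing a session frees space inside the TempDB files but does not make the files themselves smaller.

```sql
-- Sketch of the job step described above (dry run: builds KILL commands, executes none).
DECLARE @max_size_mb decimal(18, 2) = 204800;  -- hypothetical threshold: 200 GB in MB
DECLARE @current_mb  decimal(18, 2);

-- Current allocated size of all tempdb files (size is stored in 8 KB pages).
SELECT @current_mb = SUM(size) * 8.0 / 1024
FROM tempdb.sys.database_files;

IF @current_mb > @max_size_mb
BEGIN
    -- User sessions with running tasks, largest internal-object allocation first.
    SELECT tsu.session_id,
           SUM(tsu.internal_objects_alloc_page_count) / 128.0 AS internal_alloc_mb,
           N'KILL ' + CAST(tsu.session_id AS nvarchar(10)) + N';' AS kill_command
    FROM sys.dm_db_task_space_usage AS tsu
    INNER JOIN sys.dm_exec_sessions AS es
        ON es.session_id = tsu.session_id
    WHERE es.is_user_process = 1
      AND tsu.session_id <> @@SPID
    GROUP BY tsu.session_id
    HAVING SUM(tsu.internal_objects_alloc_page_count) > 0
    ORDER BY internal_alloc_mb DESC;
END;
```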

How to solve:

Method 1

There are a lot of common misconceptions when it comes to tempdb and shrinking databases.

For starters, if tempdb grows beyond 350 GB, it’s because the workload needs that space. With that kind of growth, I’d say there’s an ETL job or some manual batches that aren’t behaving well. You could fix that in a number of ways, the best of which, in my opinion, are:

  • Fix the queries that are eating up your tempdb: look for huge Hash Match or Sort operations that lack proper indexes, or for a large row version store if you’re using Read Committed Snapshot Isolation.
  • Add more disk space. In most environments, this is just a matter of giving your SAN admin a call.
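
To see which of those consumers is actually filling tempdb, something like the following sketch can be run while the space is in use. It relies on the sys.dm_db_file_space_usage DMV; the column aliases are illustrative.

```sql
-- Sketch: break down current tempdb usage by consumer (pages are 8 KB, so / 128 gives MB).
SELECT SUM(user_object_reserved_page_count)     / 128.0 AS user_objects_mb,     -- temp tables, table variables
       SUM(internal_object_reserved_page_count) / 128.0 AS internal_objects_mb, -- sorts, hash joins, spools
       SUM(version_store_reserved_page_count)   / 128.0 AS version_store_mb,    -- row versioning (e.g. RCSI)
       SUM(unallocated_extent_page_count)       / 128.0 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;
```

A large internal_objects_mb value points toward sort or hash operations, while a large version_store_mb value points toward long-running transactions under row versioning.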

Normally, I would recommend putting tempdb on its own disk(s) and sizing the database from the start, so you avoid autogrowth entirely. Autogrowing by 10 percent is often really bad practice, for two reasons:

  • When the database is small, autogrowth can be very frequent, and the growth increments are not aligned with your physical storage sectors.
  • With larger databases, each growth step is very large: 10 percent of a 300 GB file is 30 GB in a single step.
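
As an illustration, percentage growth could be replaced with a fixed increment and a hard cap along these lines. This is a sketch, assuming a single data file with the default logical names tempdev and templog; the 1 GB increment and the 350 GB cap are example values taken from the numbers in the question.

```sql
-- Sketch: inspect the current settings first (size is in 8 KB pages, so / 128 gives MB).
SELECT name, type_desc, size / 128 AS size_mb, growth, is_percent_growth
FROM tempdb.sys.database_files;

-- Switch the data file from percentage growth to a fixed increment and cap its size.
-- With several data files, MAXSIZE applies per file, so the cap has to be divided up.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, FILEGROWTH = 1GB, MAXSIZE = 350GB);

-- Same idea for the log file (example increment).
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, FILEGROWTH = 1GB);

-- To pre-size a file, SIZE = <n>GB can be added to MODIFY FILE;
-- the new size has to be larger than the file's current size.
```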

Another issue with shrinking tempdb is that SQL Server keeps a lot of internal data in there, which prevents the files from shrinking. You’ll find that even if you try to shrink tempdb, it just won’t, unless you pretty much restart the server (either literally or by clearing all sorts of buffers, both of which can have a dramatic impact on your production environment).
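
For illustration, this is roughly what the shrink step of such a job would run, with a size check before and after so you can see whether anything happened. It is a sketch that assumes the default logical file name tempdev and the 200 GB target mentioned in the question.

```sql
USE tempdb;

-- Size before the attempt (size is in 8 KB pages, so / 128 gives MB).
SELECT name, size / 128 AS size_mb
FROM sys.database_files;

-- The target size is given in MB: 200 GB = 204800 MB.
DBCC SHRINKFILE (N'tempdev', 204800);

-- Size after the attempt; compare with the first result.
SELECT name, size / 128 AS size_mb
FROM sys.database_files;
```

If the two results are identical on a busy server, that is the behaviour described above.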

In summary

  • Don’t autoshrink; make sure there’s enough space from the start.
  • Consider sizing tempdb from the start so it won’t have to autogrow.
  • See if you can tune those queries that are filling your tempdb. They’ll probably also be the ones that take the most time to run.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.