Debugging a hanging session/lock


We have a recurring situation where one query hangs, other queries wait for it, and our DB grinds to a standstill. I am trying to debug this and am looking for help. The DB is Postgres 10.6 running on RDS.

I use DBeaver to monitor and administer this DB, and here are screenshots from Administer -> Lock Manager in that application. In this case, I resolved the issue by killing the one session the others were waiting on, but I would like to learn how to debug this more deeply to figure out what that session is waiting on, and why.

[Screenshot: DBeaver Lock Manager]
There are multiple copies of the above query because it is generated by a very frequent cron job, so that at least is expected.

[Screenshot: RDS Performance Insights]
According to RDS -> Performance Insights, this query and many others are spending a lot of time at this point waiting on a relation lock.

I expect I haven't given enough information here. Let me know what I can add.

How to solve:


Method 1

To debug this problem, first find out what the session that blocks the others is doing:

SELECT state, wait_event_type, wait_event
FROM pg_stat_activity
WHERE pid = 29303;  -- replace with the PID of the session that blocks the others
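If the PID of the blocking session is not yet known, something like the following may help to find it. This is a minimal sketch that relies on pg_blocking_pids(), which is available from PostgreSQL 9.6 on; the column aliases are illustrative only.

```sql
-- List every waiting session next to the session(s) blocking it.
-- Sessions that are not blocked produce no rows, because
-- pg_blocking_pids() returns an empty array for them.
SELECT waiting.pid             AS waiting_pid,
       waiting.query           AS waiting_query,
       blocker.pid             AS blocking_pid,
       blocker.state           AS blocking_state,
       blocker.wait_event_type AS blocking_wait_event_type,
       blocker.wait_event      AS blocking_wait_event,
       blocker.xact_start      AS blocking_xact_start,
       blocker.query           AS blocking_query
FROM pg_stat_activity AS waiting
CROSS JOIN LATERAL unnest(pg_blocking_pids(waiting.pid)) AS b(pid)
JOIN pg_stat_activity AS blocker ON blocker.pid = b.pid
ORDER BY blocker.xact_start;
```

A blocking_pid that never shows up as a waiting_pid is at the head of the queue, and that is the session to examine first.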

If the session is in state active, it is running a long query. Use EXPLAIN (ANALYZE, BUFFERS) on that query to find out where it spends its time and how to improve it. (If wait_event_type is Lock, the session is itself waiting for a lock held by another session, so repeat the check for that one.)

If the state is idle in transaction, the application has a bug: it failed to close a transaction. Since all locks are held until the end of the transaction, this blocks other transactions even though the session is doing nothing.
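As a rough illustration of that step, the sketch below runs EXPLAIN (ANALYZE, BUFFERS) on a data-modifying statement. The table job_queue and its columns are hypothetical stand-ins for whatever the slow query really is. Because ANALYZE actually executes the statement, the example wraps it in a transaction that is rolled back.

```sql
-- job_queue, status and id are hypothetical names used for illustration.
BEGIN;

EXPLAIN (ANALYZE, BUFFERS)
UPDATE job_queue
SET    status = 'done'
WHERE  id = 42;

-- Undo the changes made by the statement that EXPLAIN ANALYZE executed.
ROLLBACK;
```

Keep in mind that the statement takes its usual locks while it runs, so on a busy system it is safer to try this at a quiet time or on a copy of the data.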
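To see which sessions are sitting in an open transaction and for how long, a query along these lines can be used. It is only a sketch; for an idle session the query column shows the last statement that session ran, which often points at the offending code path.

```sql
-- Sessions that hold a transaction open without doing any work.
SELECT pid,
       usename,
       application_name,
       client_addr,
       now() - xact_start   AS transaction_age,
       now() - state_change AS idle_for,
       query                AS last_statement
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;

-- Optional safety net (PostgreSQL 9.6 and later): end sessions that
-- stay idle in a transaction for too long. cron_user is a hypothetical
-- role name, and the 5 minute limit is an arbitrary example value.
ALTER ROLE cron_user SET idle_in_transaction_session_timeout = '5min';
```

The timeout only limits the damage; the application bug that leaves the transaction open still needs to be fixed.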

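The short-term solution is to kill the blocking session. The pg_cancel_backend() function cancels a query that is currently running; a session that is idle in transaction has no running query to cancel, so use pg_terminate_backend() for it, which ends the session and rolls back its transaction.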
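For reference, a sketch of the two calls, using the example PID 29303 from above. Both functions return true if the signal was sent.

```sql
-- Cancel the statement the backend is currently running.
-- The session stays connected.
SELECT pg_cancel_backend(29303);

-- End the whole session. Its open transaction is rolled back and all
-- its locks are released. This is what is needed for a session that
-- is idle in transaction, since it has no running statement to cancel.
SELECT pg_terminate_backend(29303);
```

Signalling another role's backend requires sufficient privileges, for example membership in pg_signal_backend; on RDS, the rds_superuser role should be able to do this.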

The long-term solution is to fix bugs that keep transactions open and to improve concurrency by using fewer and less heavy locks:

The EXCLUSIVE lock that the transactions in your images place on a table is heavy-handed. In my experience, explicit table locks are hardly ever necessary or indicated. Usually they are a sign that the programmer doesn’t understand database transactions well.
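What a lighter alternative looks like depends on what the cron job does, which the question does not show. Purely as an illustration, and assuming a hypothetical job_queue table from which each run claims one pending row, row-level locks could replace the table lock roughly like this:

```sql
-- Hypothetical schema: job_queue(id, status). Not taken from the question.
BEGIN;

-- Give up after 5 seconds instead of queueing indefinitely behind a lock.
SET LOCAL lock_timeout = '5s';

-- Lock only the row being claimed. SKIP LOCKED (PostgreSQL 9.5 and
-- later) lets concurrent runs pick different rows instead of waiting.
WITH next_job AS (
    SELECT id
    FROM job_queue
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE job_queue AS j
SET    status = 'running'
FROM   next_job
WHERE  j.id = next_job.id
RETURNING j.id;

COMMIT;
```

With this pattern, overlapping cron runs no longer serialize on a single table lock, and a run that cannot get its lock fails quickly instead of piling up.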


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
