In a database I am working on, there is a table called `persons` with about one million rows, and 60 foreign key (FK) constraints from other tables (some with many millions of rows) pointing to it.
If I delete one row from `persons`, it takes many minutes, which would not be a problem, but it also keeps the table locked, thus blocking the database for all processes. In the past, this has caused users to report that the system is down.
Of course the situation would greatly improve if I added supporting indexes for all the FKs (currently, only 20 of the 60 referencing tables have one). But many of these FKs are on columns like `modified_by`, so the indexes would serve no other purpose and would degrade the system's performance in daily operations, only to gain an improvement in an exceptional case.
Before running the DELETE, I already make sure that all the referencing rows have been deleted or updated. I do it manually, since I am strongly against the use of CASCADE.
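For illustration, the manual cleanup looks roughly like the sketch below. The referencing table and column names here are hypothetical (the real schema has 60 referencing tables), and wrapping everything in one transaction is just one way to keep the cleanup and the final delete consistent:

```sql
BEGIN TRANSACTION;

DECLARE @person_id int = 12345;  -- hypothetical key of the row to remove

-- Detach or delete the referencing rows first, table by table.
UPDATE audit_log SET modified_by = NULL WHERE modified_by = @person_id;
DELETE FROM orders WHERE customer_id = @person_id;
-- ... and so on for the remaining referencing tables ...

-- Only now is the parent row free of references.
DELETE FROM persons WHERE person_id = @person_id;

COMMIT TRANSACTION;
```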
I am not considering soft-delete, because otherwise I would have to change all the software that reads the `persons` table to skip deleted rows.
Is there a way to (possibly temporarily) change the locking mechanism of the `persons` table, so that even if the DELETE takes an hour, it does not impact concurrent processes?
Disabling the FKs might be a possibility. The risk is that someone else creates an inconsistency while I am deleting the row, and then I cannot re-enable the FKs.
Rows to delete: typically one at a time. Either manual operation or scheduled.
Interesting point: I did not check the execution plan immediately, but apparently almost all operations cost 0%, except for 7 cases of "Clustered Index Scan (Clustered)" on the PK of referencing tables; one of them costs 57%, the others between 1% and 16%. I still don't understand why it should scan a clustered index.
How to solve it:
> the situation would greatly improve if I added supporting indexes for all the FKs (currently, only 20 of the 60 tables have one)
You should always index fields used as keys, foreign or otherwise.
This gives the database far more options when analysing how to execute the query and adding them may actually increase the overall performance of your application.
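As a concrete sketch (the table and column names are hypothetical), a supporting index for an FK column is just a nonclustered index on the referencing column:

```sql
-- Hypothetical example: orders.customer_id references persons.person_id.
-- With this index, the engine can verify "no orders reference this person"
-- with an index seek instead of a full table scan.
CREATE NONCLUSTERED INDEX IX_orders_customer_id
    ON orders (customer_id);
```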
> many of these FKs are for columns like `modified_by`, so all the indexes would have no other purpose
As with [pretty much] everything on Planet Database, it’s a compromise.
Here, you’re balancing data integrity – knowing with certainty who changed something – against speed – not having those extra indexes that slow down inserts.
Does it really matter if you don’t know (with certainty) the user that changed these records? If not, you’re OK as you are. If it does matter – and in lots of cases, it really, really does – then you’re stuck with having the overhead of the indexes.
> Before running the DELETE, I already make sure that all the referencing records have been deleted …
With the FKs in place, you would not be able to delete the row anyway while referencing records remained, so the value of this activity is dubious.
The question really should be: why is the delete taking an hour?
If you're only deleting one record out of a million-row table, and there are no rows in the tables with FKs back to this table, then what's taking the time? It should be practically instantaneous.
The likely answer lies in the FK checks themselves: for every referencing table whose FK column is not indexed, the engine must scan the entire table just to prove that no rows reference the deleted person. That is exactly what the "Clustered Index Scan" operators in your execution plan are doing.
Slow queries, even deletes, are often caused by missing indexes – are you performing the delete based on the record’s Primary Key field? Or some other combination of [not-indexed] fields?
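To find out which of the 60 referencing tables are the culprits, a rough diagnostic like the one below lists FK columns that have no index with that column as its leading key. This is a sketch for SQL Server (which the "Clustered Index Scan" wording suggests), assuming the table lives in the dbo schema and the FKs are single-column:

```sql
-- List FKs referencing dbo.persons whose referencing column is not the
-- leading key of any index; that is the case that forces a full scan
-- during FK validation when a persons row is deleted.
SELECT
    fk.name                           AS foreign_key,
    OBJECT_NAME(fkc.parent_object_id) AS referencing_table,
    COL_NAME(fkc.parent_object_id,
             fkc.parent_column_id)    AS referencing_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
      ON fkc.constraint_object_id = fk.object_id
WHERE fk.referenced_object_id = OBJECT_ID('dbo.persons')
  AND NOT EXISTS (
        SELECT 1
        FROM sys.index_columns AS ic
        WHERE ic.object_id   = fkc.parent_object_id
          AND ic.column_id   = fkc.parent_column_id
          AND ic.key_ordinal = 1  -- leading key column
  );
```

Every table this query returns must be scanned in full each time a row is deleted from `persons`.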