A little background before getting to the problem itself.
TABLE transaction (
    t_id: BIGINT AUTO_INCREMENT,
    t_execute_on_date: DATE,
    timestamp: DATETIME,
    ... other columns
)
transaction has 300M (three hundred million) rows and is growing by about 1M rows daily, and we need to drop old data based on the column t_execute_on_date. I plan to drop data older than 2 years (approx. 40M rows) by creating monthly RANGE partitions on t_execute_on_date. I have tested this on a test server using the procedure suggested in Rick James's blog post "PARTITION Maintenance in MySQL": create a new partitioned table with indexes optimised for the queries, reinsert the records, then drop old partitions. This approach seems fine to me; any suggestions on it are also welcome.
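For concreteness, here is a minimal sketch of that rebuild, assuming monthly RANGE partitions on t_execute_on_date; the column list, partition names, and boundaries are illustrative, not taken from the real schema:

```sql
-- Illustrative only: columns, partition names, and boundaries are assumptions.
CREATE TABLE transaction_new (
    t_id BIGINT NOT NULL AUTO_INCREMENT,
    t_execute_on_date DATE NOT NULL,
    `timestamp` DATETIME,
    -- ... other columns ...
    PRIMARY KEY (t_id, t_execute_on_date)  -- partition column must be in every unique key
)
PARTITION BY RANGE (TO_DAYS(t_execute_on_date)) (
    PARTITION p201705 VALUES LESS THAN (TO_DAYS('2017-06-01')),
    PARTITION p201706 VALUES LESS THAN (TO_DAYS('2017-07-01')),
    -- ... one partition per month ...
    PARTITION pfuture VALUES LESS THAN MAXVALUE
);

-- Reinsert only the rows worth keeping (in production, copy in chunks):
INSERT INTO transaction_new
    SELECT * FROM transaction
    WHERE t_execute_on_date >= CURDATE() - INTERVAL 2 YEAR;

RENAME TABLE transaction TO transaction_old, transaction_new TO transaction;
```

Note the composite PRIMARY KEY: MySQL requires the partitioning column to appear in every unique key of a partitioned table.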
I have done this under master-master replication: take one master offline, reorganise the data, let replication lag recover, then drop the old master and move traffic to the new one if everything is fine.
TABLE transaction_history (
    th_id: BIGINT AUTO_INCREMENT,
    t_id: FOREIGN KEY (t_id of TABLE transaction),
    timestamp: DATETIME,
    ... other columns
)
This table has 1.5B+ (one and a half billion) rows, growing by 15M+ daily. It holds the history related to transaction and contains multiple rows corresponding to each t_id. Once old data is dropped from table transaction, I would like to drop the matching rows from transaction_history as well.
This table does not have any column like t_execute_on_date on which I could partition, and with 1.5B+ rows I think adding such a column is just not possible.
Following the approach above, would I have to manually pull data from transaction_history based on the t_id values remaining in the reorganised transaction table (old data dropped) and reinsert only those records into a new table? The PROBLEM I face is how to purge old data from table transaction_history.
Using MySQL 5.7
[2.8T HDD, Amazon Linux AMI 2017.03, 16 cores, 63GB RAM]
How to solve:
FOREIGN KEYS are not allowed in partitioned tables. Simply drop the FKs but keep the indexes that they generated. Presumably, the FK checks are redundant now that you have debugged your app.
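A sketch of that step; the constraint and index names here are hypothetical, so check the real ones with SHOW CREATE TABLE first:

```sql
-- Hypothetical names: look up the actual constraint name via SHOW CREATE TABLE.
ALTER TABLE transaction_history DROP FOREIGN KEY fk_th_t_id;
-- MySQL normally leaves the index it created for the FK in place; if not,
-- recreate it so the purge/join queries stay fast:
-- ALTER TABLE transaction_history ADD INDEX idx_t_id (t_id);
```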
Regarding "adding an additional column t_execute_on_date is just not possible":

PARTITIONs are the optimal way to delete "old" data from the history table, but there are other approaches: http://mysql.rjweb.org/doc.php/deletebig
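The payoff of partitioning is that purging a month of old data becomes a near-instant metadata operation instead of a massive DELETE. A minimal sketch (the partition name is illustrative):

```sql
-- Drops an entire month of rows as a metadata operation; no row-by-row DELETE,
-- minimal undo/redo logging. Partition name is an assumption.
ALTER TABLE transaction DROP PARTITION p201505;
```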
The one that comes to mind is to have a continually-running job that keeps looking at the "next 1000" rows to see if any of them should be deleted. (It must walk through the table using the PRIMARY KEY.) It might do a LEFT JOIN to see which parent rows are gone, as hinted at in this pseudo-code:

    $a = 0;
    Loop...
        $z = SELECT th_id FROM transaction_history
                 WHERE th_id >= $a ORDER BY th_id LIMIT 1000, 1;
        exit loop if $z IS NULL
        DELETE th FROM transaction_history AS th
            LEFT JOIN transaction AS t ON th.t_id = t.t_id
            WHERE t.t_id IS NULL
              AND th.th_id >= $a AND th.th_id < $z;  -- a single multi-table DELETE
        $a = $z;

When you exit the Loop, simply start over.
The hope is that this approach to deleting "old stuff" will be fast enough to keep up, though there will be some lag.
Still, I would hope that gh-ost can turn a non-partitioned table into a partitioned table with minimal downtime, regardless of table size. (No FKs, though.)
Caution: The optimal indexes for a partitioned table are usually different from those of the equivalent non-partitioned table. Toss any unused indexes when adding partitioning; there is some overhead when building a table with the secondary indexes already in place.