MySQL InnoDB migrations custom implementation: How to deal with DML statements which trigger a commit in the background?

After some discussion on this topic, I have to assume a quite frustrating fact about MySQL InnoDB: it does not support (atomic) transactions when it comes to DML.

If you have a database migration that only changes data, there is a fairly easy way to make it either fail completely or finish successfully:

START TRANSACTION;

INSERT INTO orders(orderNumber,orderDate) VALUES (1,'2020-05-31');
INSERT INTO orders(orderNumber,orderDate) VALUES (1,'2020-05-31');

COMMIT;

A transaction is an atomic unit of database operations against the data in one or more databases.

Unfortunately this is not true for the following:

START TRANSACTION;

CREATE TABLE Persons ( PersonID int, LastName varchar(255),FirstName varchar(255));
CREATE TABLE Ducks ( DuckID int, DuckName varchar(255));
CREATE INDEX duckname_index ON Ducks (DuckName);

COMMIT;

Each of these statements triggers an implicit commit, so if the migration fails partway through, your MySQL database is left broken and half migrated.

From the docs:

Some statements cannot be rolled back. In general, these include data
definition language (DDL) statements, such as those that create or
drop databases, those that create, drop, or alter tables or stored
routines. You should design your transactions not to include such
statements. If you issue a statement early in a transaction that
cannot be rolled back, and then another statement later fails, the
full effect of the transaction cannot be rolled back in such cases by
issuing a ROLLBACK statement.

As we have to implement a custom migration system for a certain piece of software, we are now wondering how this could be solved. How does e.g. Symfony (https://symfony.com/) with Doctrine (https://www.doctrine-project.org/) solve that internally?

Ideas:

  1. Solve it on CI/CD level and restore the old database if some error occurs?
    Cons: Sounds really clumsy.

  2. Only allow migrations with exactly one DML statement and strictly separate DML and DDL migrations.
    Cons: You will have tens or maybe hundreds of migration files per production deployment.
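
Idea 2 could be enforced with a small check that runs before a migration is accepted. The following is only a sketch: the function names and the keyword lists are assumptions made for illustration, and classifying by the first keyword is a rough heuristic, not a SQL parser.

```python
from typing import List

# Leading keywords treated as DDL or DML in this sketch.
# Both sets are illustrative and not exhaustive.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME"}
DML_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


def classify(statement: str) -> str:
    """Return 'ddl', 'dml' or 'other' based on the first keyword."""
    words = statement.strip().split(None, 1)
    if not words:
        return "other"
    first = words[0].upper()
    if first in DDL_KEYWORDS:
        return "ddl"
    if first in DML_KEYWORDS:
        return "dml"
    return "other"


def mixes_ddl_and_dml(statements: List[str]) -> bool:
    """True if one migration contains both DDL and DML statements."""
    kinds = {classify(s) for s in statements}
    return "ddl" in kinds and "dml" in kinds
```

A migration file would be split into statements first and rejected if `mixes_ddl_and_dml` returns True.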

Still I hope there is a better way? What is the best practical solution to that problem – if any?

How to solve:

Method 1

I think you mean DDL. DML statements are ones like SELECT/INSERT/UPDATE/DELETE, and I can’t think of any DML statement that causes an implicit commit.

If you’re concerned about a crash partway through such a migration, then apply each DDL statement in its own migration step. Typically migration frameworks assign a "version" id to each migration, so they know which migrations have yet to be applied. So if the series of migrations is interrupted, then just re-run the migration tool, and it will figure out where it left off and run the subsequent migrations.
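
As a rough illustration of that resume behaviour, here is a minimal sketch of a runner that applies one DDL statement per versioned step. Everything in it is hypothetical (`run_pending`, the `execute` and `record` callbacks, the tuple layout); it is not how any particular framework is implemented. The callbacks stand in for whatever database driver and version table you actually use.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Each migration is (version, single DDL statement). Hypothetical layout.
Migration = Tuple[int, str]


def run_pending(
    migrations: Iterable[Migration],
    applied: Set[int],
    execute: Callable[[str], None],
    record: Callable[[int], None],
) -> List[int]:
    """Apply migrations whose version is not in `applied`, in version order.

    `execute` runs one statement against the database; `record` persists
    the version (for example, a row in a version table). Both are supplied
    by the caller, so this sketch is not tied to any driver.
    """
    done: List[int] = []
    for version, statement in sorted(migrations):
        if version in applied:
            continue  # already applied in an earlier run
        execute(statement)  # may raise; later steps are then not attempted
        record(version)     # only reached if the statement succeeded
        done.append(version)
    return done
```

If a statement raises, the versions recorded so far stay recorded, so a second call skips them and continues with the failed step. Note that a crash between `execute` and `record` would still leave a step applied but unrecorded, which is the manual-repair case.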

But honestly, most people don’t bother. You should test your DDL statements carefully before committing them to your repo. So the chances of a syntax error or something causing a failure should be near zero.

A logical failure (e.g. can’t add a table because a table of that name already exists) may occur if anyone operates on that database instance without going through the migration system. No automated system can account for all possible chaos introduced by humans. Just make sure your teammates are able to cooperate with the automation.

A crash that interrupts a multi-statement migration is rare. In those cases, you may have to do some manual repair to run the subsequent migrations. That’s inconvenient, but it’s not the end of the world.

Finally, another method of migrations is to describe declaratively what the tables should be at the current state of the system, and let the tool figure out which migrations to apply. This is the strategy of a tool like Skeema, for example.
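
To make the declarative idea concrete, here is a toy sketch that compares a desired schema with the current one and plans the DDL needed to close the gap. The `Schema` shape and the `plan_ddl` function are assumptions for illustration only, and this is not a description of how Skeema works internally. It handles only additive changes.

```python
from typing import Dict, List

# A schema is modelled here as {table name: {column name: column type}}.
Schema = Dict[str, Dict[str, str]]


def plan_ddl(current: Schema, desired: Schema) -> List[str]:
    """Return DDL that adds what `desired` has and `current` lacks.

    Only new tables and new columns are handled; drops, type changes and
    indexes are deliberately left out of this sketch.
    """
    statements: List[str] = []
    for table, columns in desired.items():
        if table not in current:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols})")
            continue
        for name, ctype in columns.items():
            if name not in current[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype}")
    return statements
```

Because the plan is recomputed from the live state each time, a run that was interrupted halfway simply produces a shorter plan on the next attempt.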

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.