All we need is an easy explanation of the problem, so here it is.
As part of dealing with the UK’s GDPA ‘right to be forgotten’ in a small database I want to update the (autoinc) PK of a row in table ‘member’ to an obfuscated value and let it cascade down through the FKs in related tables, such as the dates they held certain positions.
If I then set the names of the member in table ‘member’ to empty strings I can still use my related tables, for example to determine that there was someone holding a certain position at a certain time but I will no longer be able to ascertain either the name or the id of the person who held it, which is what I want.
I have written a simple a UDF obfuscate_id() that takes a positive integer ID and returns a negative ID calculated by combining the id passed with a random number.
FUNCTION obfuscate_id(id INT) RETURNS INT COMMENT 'returns the negative version of the id passed, obfuscated by a random number' BEGIN DECLARE result INT; SET result = (FLOOR(RAND(id)*10000) + id ) * -1; IF result > 0 THEN -- ensures result is -ve even if called with a -ve parameter SET result = result * -1; END IF; RETURN result; END;
And I am using the sql
UPDATE member SET member_id = obfuscate_id(member_id) WHERE member_id = 1234;
However, this might generate a negative id that already exists from a previous update of another member_id. If so I want to set it to something else, maybe by obfuscating the id again as in
SET member_id = obfuscate_id(obfuscate_id(member_id));
Is there a simple way, like ON DUPLICATE KEY … to check if the value I am about to update to already exists and if so update to something else?
I would add that given the size of table ‘member’ and the particular application, even the first update is going to be a very rare occurrence, so the chance of having any obfuscated IDs is extremely slim, let alone duplicate ones, but I’d like to cover the eventuality if I can.
How to solve :
I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.
Wouldn’t it be simpler / easier to just choose a random number who’s range is outside your current range of values in the
member table? E.g. something like
(FLOOR(RAND(id)) * -1) + (SELECT MIN(id) FROM member) (semi-pseudocode)?
Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂