Basic high availability
2 replicas (1 primary, 1 secondary).
DB01 => initial primary.
DB02 => initial secondary
Synchronous commit on both replicas
Both are in the synchronized state
No listener is configured
Cluster type: NONE
We stop the SQL Server service on DB01 (the initial and current primary) using services.msc, simulating a friendly server crash, and then initiate a forced failover on DB02 (the initial and current secondary) using:
ALTER AVAILABILITY GROUP [TestHA] FORCE_FAILOVER_ALLOW_DATA_LOSS;
The secondary database comes online, which is what we want.
However, when the SQL Server service on DB01 is started again using services.msc, DB01 assumes the primary role again.
So there are now two instances that are both readable/writable and out of sync. We expected the initial primary to detect that the secondary had taken over the primary role and either assume the secondary role or at least become inaccessible, so that applications cannot work on old data.
The same procedure with the deprecated database mirroring feature does behave this way.
How to solve:
Since this is a clusterless (read-scale) availability group, there is nothing automatically coordinating which role each node is in – that process is completely manual.
This is why the former primary comes back up as the primary – nothing has told it to change its role.
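To confirm the split state, a query along these lines can be run on each instance separately. It is only a sketch: it assumes the availability group is named TestHA as in the question, and it relies on the standard HADR catalog views and DMV. A secondary only reports its own row, so the query has to be run on both DB01 and DB02.

```sql
-- Run on DB01 and on DB02; if both report PRIMARY for their local row,
-- the two replicas have diverged.
SELECT ag.name                  AS ag_name,
       ar.replica_server_name,
       ars.is_local,
       ars.role_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
    ON ar.group_id = ag.group_id
JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.replica_id = ar.replica_id
WHERE ag.name = N'TestHA';
```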
You’ll want to follow the instructions outlined here:
…if the original primary replica recovers after failover, it will assume the primary role. To avoid having each replica be in a different state, remove the original primary from the availability group after a forced failover with data loss. Once the original primary comes back online, remove the availability group from it entirely.
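Applied to the setup in the question, those two steps might look like the sketch below. It assumes the replica is registered under the server name DB01 (check replica_server_name in sys.availability_replicas for the exact value) and that the group is still named TestHA.

```sql
-- Step 1: on DB02 (the new primary), right after the forced failover,
-- remove the original primary from the availability group.
ALTER AVAILABILITY GROUP [TestHA] REMOVE REPLICA ON N'DB01';

-- Step 2: on DB01, once its SQL Server service is back online,
-- remove the availability group from that instance entirely.
DROP AVAILABILITY GROUP [TestHA];
```

After step 2, the copy of the database on DB01 is no longer part of any availability group. It should be treated as stale, and applications should not be pointed at it.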
In the end, you can add that former primary back as a secondary manually:
- (Optional) If desired, you can now add N1 back as a new secondary replica to the availability group AGRScale.
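For this setup (DB01 and TestHA rather than N1 and AGRScale), re-adding the former primary could be sketched as follows. The endpoint URL is a hypothetical placeholder, and the choice of automatic seeding is an assumption: it requires that the old copy of the database no longer exists on DB01. The alternative is to restore backups from DB02 WITH NORECOVERY and join the database manually.

```sql
-- On DB02 (current primary): add DB01 back as a secondary replica.
-- The endpoint URL below is a placeholder; use DB01's real mirroring endpoint.
ALTER AVAILABILITY GROUP [TestHA]
ADD REPLICA ON N'DB01'
WITH (
    ENDPOINT_URL      = N'TCP://DB01.example.local:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE     = MANUAL,     -- required when the cluster type is NONE
    SEEDING_MODE      = AUTOMATIC
);

-- On DB01: join the group and allow automatic seeding to create the database.
ALTER AVAILABILITY GROUP [TestHA] JOIN WITH (CLUSTER_TYPE = NONE);
ALTER AVAILABILITY GROUP [TestHA] GRANT CREATE ANY DATABASE;
```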