SQL Server Always On AG – Is a Passive Secondary Required for Automatic Failover?

All we need is an easy explanation of the problem, so here it is.

It is my understanding that for automatic failover to occur in an Always On AG environment, the replicas just need to be configured with "automatic failover," "synchronous commit" and the environment must have proper quorum (i.e. odd number of voting members).

However, according to the "SQL Server 2019 Always On" textbook by Peter A. Carter, for automatic failover to work, the secondary cannot be active and must be passive. Passive meaning the Readable Secondary setting is set to ‘No’ so it is unavailable for direct connections or read access, and obviously no backups can be done on them either.

Here is the exact statement from the textbook:

"Although we can change the value of Readable Secondary through the
GUI while at the same time configuring a replica for automatic
failover without error, this is simply a quirk of the wizard. In fact,
the replica is not accessible since active secondaries are not
supported when configured for automatic failover."

Source: Pro SQL Server 2019 Administration: A Guide for the Modern DBA

Here’s a screenshot of the "quirk"

SQL Server Always On AG – Is a Passive Secondary Required for Automatic Failover?

Note: According to the textbox, the Readonly Secondary should be set to ‘No’ if you want automatic failover to work. Thinking back, I don’t recall a successful automatic failover occurring with this set up and I believe when an issue occurred, the AG was in a RESOLVING state. Note, we have third server for proper quorum.

I’ve searched through Microsoft’s documentation to confirm whether or not the author’s statement is true but have yet to find anything that explicitly states that the secondary replica must be passive for automatic failover to work. This Microsoft article (Failover and Failover Modes (Always On Availability Groups)) states the conditions required for automatic failover but does not mention needing a passive secondary replica.

So is the author correct and Microsoft has failed to advertise this OR is the author incorrect and automatic failover does work when the secondary replica is active/readable? My objective is to have a highly available environment with automatic failover but do we need to sacrifice offloading read operations in order to accomplish this? Adding another server is not an option.

How to solve :

I know you bored from this bug, So we are here to help you! Take a deep breath and look at the explanation of your problem. We have many solutions to this problem, But we recommend you to use the first method because it is tested & true method that will 100% work for you.

Method 1

You can fail over to a readable secondary. The author is either wrong, or just using some confusing terminology. Reading the exact quote you included, I’d lean towards the former explanation. But I don’t know all the context around that statement.

Here I have an AG – all replicas are configured to be readable when they are in the secondary role, and NODE2 is set up for automatic failover:

SQL Server Always On AG – Is a Passive Secondary Required for Automatic Failover?

Here’s the current state of things on the AG dashboard:

SQL Server Always On AG – Is a Passive Secondary Required for Automatic Failover?

I’ll go ahead and connect directly to NODE2 (sync secondary) and start some queries running:

FROM dbo.A;

WAITFOR DELAY '00:00:01';

GO 100

And then do something to cause an automatic failover to NODE2. In this case, I will kill the SQL Server service on the current primary (NODE1).

Now I’ll check the AG dashboard, and can see that NODE2 automatically became the new primary.

SQL Server Always On AG – Is a Passive Secondary Required for Automatic Failover?

Note that during the failover, my SSMS window running queries on NODE2 got this message:

Msg 596, Level 21, State 1, Line 0
Cannot continue the execution because the session is in the kill state.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.

Which is normal (clients being disconnected while the database undergoes recovery as part of becoming the primary). I retried the query and it continued running. I wouldn’t have to do that if I were connecting to the listener via read-only intent.

Note: Use and implement method 1 because this method fully tested our system.
Thank you 🙂

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply