Failover Cluster Manager > Role > SQL Server (Node1 is owner) > Resources (SQL Server & SQL Agent)


I created two separate Windows failover clusters, one for the database (SQLCluster) and one for the application (APPCluster). Each cluster has two nodes plus a disk witness for quorum. I can easily fail over by rebooting one of the nodes, and the cluster manager handles the failover perfectly.
I have an IP for each node, an IP for each cluster, and an IP for each resource/role.
I am trying to automate the failover. When I turn off the application service in APPCluster, the other node fails, saying it cannot connect to the database server. It hangs.

My question is: what goes in the connection string for the client application, the IP for the SQLCluster or the IP for the SQL resource?

Also, when I manually fail over from Node1 to Node2 using the Roles pane, the owner node changes and the services and shared disks move over to Node2. But at the Cluster level, the current Host Server is still Node1.

Both clusters are in a VM environment.

v/r;
Jawid

How to solve:

Take a deep breath and look at the explanation of your problem below. There is more than one way to solve it; we recommend Method 1 because it is the tried and tested approach.

Method 1

My question is: what goes in the connection string for the client
application, the IP for the SQLCluster or the IP for the SQL resource?

The SQL resource should have an associated Virtual Network Name (also called the Client Access Name), and that network name has an associated IP address that moves between nodes during failover. This is what your connection strings should point to, because the name/IP always follows whichever node currently owns the SQL instance resource.
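As a minimal sketch, assuming the SQL Network Name resource is SQLVNN (a hypothetical name; substitute the name returned by the query further down) and Windows authentication, a client connection made from Windows PowerShell could look like this:

# Minimal sketch: connect using the SQL network name rather than a node or cluster IP.
# "SQLVNN" is a hypothetical name; replace it with your SQL role's Client Access Name.
$connectionString = "Server=SQLVNN;Database=master;Integrated Security=True"
$connection = New-Object System.Data.SqlClient.SqlConnection $connectionString
$connection.Open()    # succeeds against whichever node currently owns the SQL role
$connection.Close()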

Use the following PowerShell to list the network names present in a cluster. The one in the "Cluster Group" cluster group is the cluster network name; the one in your SQL cluster group is the SQL network name, and that is the name to use in your connection strings.

Get-ClusterResource | Where-Object { $_.ResourceType -eq "Network Name" }
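If the names alone are ambiguous, a slightly longer sketch (assuming default group names) also shows which group owns each one, so you can tell the cluster name apart from the SQL name:

# Sketch: list each network name together with the group that owns it.
# The entry owned by "Cluster Group" is the cluster name; the one owned by
# your SQL role group is the name to use in connection strings.
Get-ClusterResource |
    Where-Object { $_.ResourceType -eq "Network Name" } |
    Select-Object Name, OwnerGroup, OwnerNode, State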

Also, when I manually fail over from Node1 to Node2 using the
Roles pane, the owner node changes and the services and shared disks
move over to Node2. But at the Cluster level, the current Host
Server is still Node1.

This is because you’re failing over the clustered SQL instance, while the core cluster resources (the cluster name/IP and the quorum resource) remain owned by the original node. It is possible to fail these over as well, but it generally isn’t required when you’re only testing failover of the SQL resource.
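If you do want the cluster’s Host Server to follow the SQL instance, a one-line sketch (assuming the destination node is named Node2) moves the core group as well:

# Sketch: move the core cluster resources ("Cluster Group") to Node2.
# This is optional; client connections do not depend on where this group lives.
Move-ClusterGroup -Name "Cluster Group" -Node "Node2"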

Use the PowerShell cmdlet Get-ClusterGroup to get a handle on the different collections of resources in your clusters (see the sketch after the list below). When you run that cmdlet against a local cluster you’ll see three or more cluster groups returned:

  1. Available Storage – the collection of disk resources not yet assigned to any clustered role but available for clustering
  2. Cluster Group – the core cluster resources, including the cluster IP address, cluster name and quorum resource
  3. <Clustered Resource> – this cluster group (and any additional ones) contains the resources for a clustered service you have deployed, such as SQL Server or your application.
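As a quick sketch, listing the groups with their current owners makes the split obvious:

# Sketch: show each cluster group, the node that currently owns it, and its state.
Get-ClusterGroup | Select-Object Name, OwnerNode, State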

When you perform a manual failover, you’re only moving the cluster group associated with those resources to the new node. An automatic failover may also move the Cluster Group if the owner node itself fails, but if the automatic failover was SQL-related, you may only see the SQL cluster group move to the new node.
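For comparison, a manual failover of just the SQL role from PowerShell (assuming the role group is named "SQL Server (MSSQLSERVER)"; substitute your own role name) would look like the sketch below, and the follow-up listing shows that only that group changed owner:

# Sketch: fail over only the SQL role group; the group name here is an assumption.
Move-ClusterGroup -Name "SQL Server (MSSQLSERVER)" -Node "Node2"
Get-ClusterGroup | Select-Object Name, OwnerNode   # "Cluster Group" stays on its original node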

Method 2

My question is: what goes in the connection string for the client
application, the IP for the SQLCluster or the IP for the SQL resource?

You should always refer to the Client Access Name or the IP addresses you see in the SQL Server role properties.
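As a sketch, assuming the SQL role group is named "SQL Server (MSSQLSERVER)", you can read the Client Access Name and its IP addresses from PowerShell instead of the role properties dialog:

# Sketch: list the Network Name and IP Address resources that belong to the SQL role.
Get-ClusterGroup -Name "SQL Server (MSSQLSERVER)" | Get-ClusterResource |
    Where-Object { $_.ResourceType -eq "Network Name" -or $_.ResourceType -eq "IP Address" }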

Also, when I manually fail over from Node1 to Node2 using the
Roles pane, the owner node changes and the services and shared disks
move over to Node2. But at the Cluster level, the current Host
Server is still Node1.

That happens because one is the management view for the cluster and the other is the management view for the role. A role is tied to the service that runs on one node of the cluster and is largely independent of the cluster’s own core resources, so at any given time the owner of the role and the owner of the cluster may not be the same node.

Sometimes you will even see the quorum disk owned by a node that is not currently the owner of the other disks.
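A small sketch (assuming a disk witness is configured and the FailoverClusters module is loaded) shows which node currently owns the witness:

# Sketch: check the quorum configuration and which node owns the witness resource.
$quorum = Get-ClusterQuorum
$quorum.QuorumResource | Select-Object Name, OwnerNode, State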

Note: Method 1 is the tested and recommended approach.
Thank you 🙂

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
