How do I change the Cassandra topology + snitch in a multi AZ cluster?


I have a 9-node Cassandra v3.11 cluster in one availability zone using SimpleSnitch.
I want to change the snitch to GossipingPropertyFileSnitch and make the topology multi-AZ, like this:
3 nodes in AZ-A,
3 nodes in AZ-B,
3 nodes in AZ-C.

How should I do the migration?
Move the nodes first and then change the snitch, or
change the snitch first and then move the nodes?

As a bonus question, should I use
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=ip_address"

or just remove the nodes and add them again?

How to solve :


Method 1

Overview

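If it were me, I would first switch the existing nodes to GossipingPropertyFileSnitch (GPFS). This makes the cluster much easier to work with, because with GPFS each node declares its own DC and rack in a local file and the rest of the cluster learns it through gossip.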

Then I would add the new nodes in the new AZs to the cluster. Let’s assume all 9 existing nodes are in zone A. Add 3 nodes in zone B, then 3 nodes in zone C so you temporarily end up with 15 nodes.

Finally, decommission any 6 of the 9 original nodes in zone A, one at a time, so you are back to 9 nodes as planned. The reason I would decommission as the last step is to make sure there is enough capacity in the cluster throughout the migration, which minimises the risk of service interruptions.
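The decommission step could be scripted along these lines. This is only a sketch: the host names are hypothetical, it assumes SSH access to each node, and it defaults to a dry run that prints the commands instead of executing them. `nodetool decommission` is run on the node that is leaving, and `nodetool status` is run from a node that stays in the cluster to confirm the ring before moving on.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = a node that stays in the cluster (used only to check the ring)
# remaining args = zone A nodes to retire, processed one at a time
decommission_nodes() {
  check_host=$1
  shift
  for host in "$@"; do
    # Streams this node's data to the remaining replicas, then leaves the ring.
    run ssh "$host" nodetool decommission
    # Confirm the node is gone and the rest are Up/Normal before continuing.
    run ssh "$check_host" nodetool status
  done
}
```

For example, `decommission_nodes keep-a1 old-a4 old-a5` (hypothetical names) prints the four commands it would run, in order.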

Seeds

As a general rule, we recommend at least two seed nodes per DC, preferably in different racks/AZs. In your case, where you only have one DC, pick one node from each AZ to use as a seed node.

Once all new nodes are added to the cluster and the old nodes are decommissioned, update cassandra.yaml on all nodes so they all have the same seeds list. Note that it is NOT necessary to restart immediately, since the nodes periodically reload the seeds from the yaml.
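As an illustration, the seeds stanza for cassandra.yaml could be generated from one IP per AZ like this. It is a minimal sketch: the function name and the IP addresses are made up, and it assumes the default SimpleSeedProvider.

```shell
# Render the seed_provider stanza from a list of seed IPs (one per AZ).
render_seed_provider() {
  local IFS=,
  # "$*" joins the arguments with the first character of IFS (a comma).
  local joined="$*"
  cat <<EOF
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "$joined"
EOF
}
```

Calling `render_seed_provider 10.0.1.10 10.0.2.10 10.0.3.10` prints a stanza with the three addresses comma-separated in a single quoted string, which is the format cassandra.yaml expects. Every node should end up with the identical list.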

Switching to GPFS

Changing the snitch from SimpleSnitch to GossipingPropertyFileSnitch is the easy part. The high-level steps are:

  1. Update cassandra-rackdc.properties on all nodes.
  2. Update endpoint_snitch in cassandra.yaml on all nodes.
  3. Perform a rolling restart, one node at a time.
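The first two steps above can be sketched as a pair of shell helpers. The function names and the config directory argument are assumptions for illustration. One detail to keep in mind: SimpleSnitch reports every node as being in DC `datacenter1` and rack `rack1`, so the existing nodes should keep exactly those values in cassandra-rackdc.properties, otherwise they would appear to change location. New nodes in other AZs would get their own rack name (for example `AZ-B`, a hypothetical label) with the same `dc`.

```shell
# $1 = config directory, $2 = dc name, $3 = rack name
write_rackdc() {
  printf 'dc=%s\nrack=%s\n' "$2" "$3" > "$1/cassandra-rackdc.properties"
}

# $1 = config directory; rewrites the endpoint_snitch line in cassandra.yaml
# and keeps the previous file as cassandra.yaml.bak
set_snitch() {
  sed -i.bak 's/^endpoint_snitch:.*/endpoint_snitch: GossipingPropertyFileSnitch/' \
    "$1/cassandra.yaml"
}
```

On an existing node this would be run as `write_rackdc /etc/cassandra datacenter1 rack1` followed by `set_snitch /etc/cassandra` (the path depends on your install), and then the node is restarted before moving on to the next one.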

For details, see Switching snitches in Cassandra.

Decommissioning nodes

For details, see Removing a Cassandra node. Cheers!


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 or CC BY-SA 4.0.
