Cluster Repartition

Repartitioning a cluster changes its replication factor and partitioning factor without adding or removing nodes. For example, a 12-node cluster with a partitioning factor of 4 and a replication factor of 3 (called a 4 x 3 cluster) can be repartitioned to 6 x 2.

You can also change the replication factor when you expand or shrink a cluster. See Expand a Cluster and Shrink a Cluster.

Cluster repartitioning requires several minutes of cluster downtime. The exact amount of downtime varies depending on the size of your cluster.

1. Before you begin

Requirements
  • Ensure that no loading jobs, queries, or REST requests are running on the cluster.

Advice and Precautions
  • Record a few key measures of the state of your data before the repartition, such as vertex counts, edge counts, or the results of specific queries. You can use them to verify data integrity after the repartition completes.

  • Perform a full backup of your existing system before performing the repartition.

2. Procedure

2.1. Calculate new replication factor

Start by calculating the new replication factor for your cluster: divide the total number of nodes in the cluster by the number of partitions you’d like the cluster to have. For example, a 12-node cluster with 6 partitions has a replication factor of 12 / 6 = 2.
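The division described above can be sketched in Python. The function name replication_factor is hypothetical and not part of any TigerGraph tool, and rejecting uneven divisions is an assumption made here so that layouts which would leave nodes idle are flagged early.

```python
def replication_factor(total_nodes: int, partitions: int) -> int:
    """Return the replication factor for a desired number of partitions.

    Hypothetical helper: it only does the arithmetic described in the text.
    """
    if total_nodes <= 0 or partitions <= 0:
        raise ValueError("total_nodes and partitions must be positive")
    if total_nodes % partitions != 0:
        # Assumption: treat an uneven split as an error instead of rounding.
        raise ValueError(
            f"{total_nodes} nodes cannot be split evenly into {partitions} partitions"
        )
    return total_nodes // partitions


# A 12-node cluster repartitioned to 6 partitions: 12 / 6 = 2.
print(replication_factor(12, 6))  # prints 2
```

The result is the value you would pass to the --ha option.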

2.2. Repartition the cluster

To repartition a cluster, run the gadmin cluster expand command with the --ha option set to the new replication factor. No nodes are added; only the layout of the cluster changes. For example, the command below changes the replication factor to 2:

$ gadmin cluster expand --ha 2

The partitioning factor of your cluster will change automatically based on your specified replication factor. Its updated value will be the total number of nodes divided by the replication factor.

If the total number of nodes is not evenly divisible by the replication factor, the remaining nodes are left idle.

For example, assume you begin with a 5-node cluster with a replication factor of 1 and a partitioning factor of 5. Changing the replication factor to 2 without adding new nodes changes the layout of your cluster to 2 x 2 and leaves one node idle. To avoid idle nodes, pick a replication factor that evenly divides the total number of nodes in the cluster.
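As a rough illustration of the rule above, the following Python sketch computes the resulting layout and the number of idle nodes for a given replication factor, and lists the replication factors that leave no node idle. Both function names are hypothetical; the code only reproduces the arithmetic and does not query a cluster.

```python
def layout_after_repartition(total_nodes: int, replication_factor: int):
    """Return (partitioning_factor, replication_factor, idle_nodes)."""
    if replication_factor <= 0 or replication_factor > total_nodes:
        raise ValueError("replication_factor must be between 1 and total_nodes")
    # Partitioning factor is the integer quotient; the remainder is idle.
    partitions, idle = divmod(total_nodes, replication_factor)
    return partitions, replication_factor, idle


def even_replication_factors(total_nodes: int):
    """Return every replication factor that divides total_nodes evenly."""
    return [r for r in range(1, total_nodes + 1) if total_nodes % r == 0]


# The 5-node example from the text: a 2 x 2 layout with one idle node.
print(layout_after_repartition(5, 2))  # prints (2, 2, 1)

# Replication factors that leave no node idle on a 12-node cluster.
print(even_replication_factors(12))  # prints [1, 2, 3, 4, 6, 12]
```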

2.2.1. Supply a staging location

Cluster repartition requires extra disk space. If the disk that holds your data does not have enough free space, you can supply a staging location on a different disk to hold temporary data:

$ gadmin cluster expand --stagingPath /tmp/

If you choose to supply a staging location, make sure that the TigerGraph Linux user has write permission to the path you provide. The overall amount of space required for cluster repartition on each node is:

(1 + ceiling(oldPartition / newPartition)) * dataRootSize

Here oldPartition and newPartition stand for the partitioning factors of the cluster before and after repartition, respectively, and dataRootSize stands for the size of the data root folder on the node.

For example, assume you are repartitioning a 6-node cluster from a replication factor of 2 and a partitioning factor of 3 to a replication factor of 3 and a partitioning factor of 2, and the size of the data root folder on a node is 50 GB. You would need more than (1 + ceiling(3/2)) * 50 = 150 GB of free space on the staging path.
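The space estimate can be reproduced with a few lines of Python. This is a minimal sketch of the formula as stated in this section; the function name required_space_gb and its parameter names are hypothetical.

```python
import math


def required_space_gb(old_partitions: int, new_partitions: int,
                      data_root_size_gb: float) -> float:
    """Per-node space estimate: (1 + ceiling(old / new)) * dataRootSize."""
    if old_partitions <= 0 or new_partitions <= 0:
        raise ValueError("partitioning factors must be positive")
    return (1 + math.ceil(old_partitions / new_partitions)) * data_root_size_gb


# The example from the text: 3 partitions -> 2 partitions, 50 GB data root.
print(required_space_gb(3, 2, 50))  # prints 150
```

Because the text asks for more than the computed amount, treat the result as a lower bound when choosing a staging disk.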

2.3. Verify success and delete temporary files

When the repartition completes, you should see a message confirming the completion of the cluster change. The message will also include the location of the temporary files created during the repartition.

Verify data integrity by comparing vertex/edge counts or query results. After confirming a successful repartition, delete the temporary files to free up disk space.
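One way to make the comparison less error-prone is to record the measures in a simple mapping before the repartition and compare it with the values collected afterwards. The sketch below assumes you gather the numbers yourself (for example, from your own queries); the function name diff_counts, the measure names, and the numbers are placeholders for illustration only, not real output.

```python
def diff_counts(before: dict, after: dict) -> dict:
    """Return {measure: (before, after)} for every measure whose value changed."""
    names = before.keys() | after.keys()
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(names)
        if before.get(name) != after.get(name)
    }


# Placeholder values; substitute the measures you recorded.
before = {"Person vertices": 1000, "Knows edges": 5000}
after = {"Person vertices": 1000, "Knows edges": 5000}

mismatches = diff_counts(before, after)
print("counts match" if not mismatches else mismatches)
```

A measure that is missing on one side is reported with None in its place, so a dropped vertex or edge type shows up as a mismatch.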