Cluster Scale-Out

Adding machines to a TigerGraph cluster, for distributed data and/or HA

Version 2.2 Copyright © 2018 TigerGraph. All Rights Reserved.

Introduction

Cluster expansion lets you add new machines (nodes) to an existing cluster and redistribute the data across all nodes. The entire system is offline while the expansion runs.

Prerequisites

  1. The current TigerGraph system must be installed in cluster mode, not single-node mode.

  2. The total graph data storage space for the expanded cluster should be at least 3 times the current gstore disk usage.

    1. During the expansion process, a backup copy of all the graph data files is created, and additional working space is also needed.

    2. To check your existing gstore disk space:

  3. The new nodes are available.
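
The 3x storage rule in prerequisite 2 can be checked with simple arithmetic. The following is a minimal sketch, not a TigerGraph tool: the function names are hypothetical, and the directory you pass to dir_usage_kb is whatever location holds your gstore data (the sketch does not assume a particular path). All sizes are in kilobytes.

```shell
#!/usr/bin/env bash
# Sketch: verify that planned storage is at least 3x the current gstore usage.
# Function names are illustrative assumptions, not part of TigerGraph.

# Print the disk usage of a directory in KB (uses POSIX "du -sk").
dir_usage_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Print the minimum total storage (KB) for the expanded cluster: 3x current usage.
required_kb() {
  echo $(( $1 * 3 ))
}

# Exit status 0 when the planned total storage meets the 3x requirement.
# $1 = current gstore usage in KB, $2 = planned total storage in KB.
storage_ok() {
  local current_kb=$1 planned_total_kb=$2
  [ "$planned_total_kb" -ge "$(required_kb "$current_kb")" ]
}
```

For example, with the current usage measured by dir_usage_kb on your gstore directory and the planned capacity summed across all nodes, storage_ok returns success only when the planned capacity is at least three times the current usage.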

Cluster Expansion Workflow

Configure GBAR

The GBAR utility is used for cluster expansion. If this is your first time using GBAR, you must first run gbar config. See the Backup and Restore guide. For a large system, one of the key parameters is backup_core_timeout. The default value is 5 hours. The config script gives guidance on estimating an appropriate value.

Set Up Environment in New Nodes

From the command line, switch to the <tigergraph_root_dir>/pkg_pool/syspre_pkg directory (~/tigergraph/pkg_pool/syspre_pkg by default). The utility script set_syspre.sh in this directory sets up the environment on the new nodes.

Run ./set_syspre.sh -h to see the usage:

For example, to set up the environment on a new node 192.168.1.6 with a sudo user named ubuntu and a login key ubuntu_rsa, run the following command:

When the script finishes, the environment on the new nodes, including the system prerequisites and the SSH keys for the TigerGraph system, is set up.

Add New Nodes to Cluster

To expand the cluster, run gbar expand with a list of new nodes in the following format:

For example, the following command adds two nodes to the cluster:

The command above redistributes the data across all nodes, including the new nodes m6 and m7, so that each node holds about the same amount of data.
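
To get a rough feel for what an even redistribution means for per-node storage, the arithmetic can be sketched as below. This is an illustration only: the function name is hypothetical, and the figures in the usage note (700 GB of data, 5 nodes before expansion) are assumptions, since the source does not state the original cluster size.

```shell
#!/usr/bin/env bash
# Sketch: approximate per-node data size when data is spread evenly.
# $1 = total data size (any unit), $2 = number of nodes.
# Uses ceiling division so the estimate never undercounts.
per_node_share() {
  local total=$1 nodes=$2
  echo $(( (total + nodes - 1) / nodes ))
}
```

Under those assumed figures, per_node_share 700 5 gives 140 (GB per node before expansion) and per_node_share 700 7 gives 100 (GB per node after adding two nodes). The actual distribution is only approximately even, as the text above notes.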

Error Handling

Should any errors occur, GBAR will roll back to the state before node expansion started. As a failsafe, a backup copy of the data is kept until expansion either succeeds or finishes rollback.

Advanced Expansion Mode

Advanced expansion configuration options are possible. Contact TigerGraph Support for guidance.