How to Replace a Node in a Cluster

This guide outlines the procedure for replacing a node in a cluster, regardless of whether it is a High Availability (HA) cluster. If your system uses High Availability and you do not have a replacement node, refer to the documentation on removing a failed node in Removal of Failed Nodes.

Prerequisites

  • TigerGraph is installed with HostName; refer to HostName Installation. Additionally, TigerGraph should be installed on a device that can be unmounted from the machine to be replaced and mounted to the new machine.

  • This procedure applies to TigerGraph versions 3.10.1 and later. For versions prior to TigerGraph 3.10.1, please refer to node replacement in preceding versions.

  • The procedure requires pointing the hostname to a different IP address by changing its DNS record. On AWS, this is done with Route 53. Other cloud service providers can be used as long as they offer a similar DNS web service.

Procedure

  1. Create a Linux User on New Nodes:

    Create a Linux user with the same username and password as the TigerGraph user on your existing cluster nodes.

    sudo useradd tigergraph
    sudo passwd tigergraph
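
    Because the data device will be remounted on the new machine, file ownership on it is preserved only if the new user's UID and GID match those on the live nodes, so you may want to set them explicitly. A minimal sketch, assuming the UID/GID on the live nodes is 1001 (a hypothetical value; check it with id):

    # On any live node, look up the existing user's UID and GID
    id tigergraph
    # On the new node, create the user with matching IDs (1001 is an example)
    sudo groupadd -g 1001 tigergraph
    sudo useradd -u 1001 -g 1001 tigergraph
    sudo passwd tigergraph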
  2. Prepare Files on New Nodes:

    Prepare the following files, which reside outside the TigerGraph installation directory. They can be copied from any live node, for example with scp as shown below.

    ~/.ssh/
    ~/.tg.cfg
    ~/.bashrc
    /etc/security/limits.d/98-tigergraph.conf
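
    For example, the files can be copied over with scp. The placeholder <live-node> below stands for the hostname of any healthy node in the cluster:

    # Copy SSH keys and TigerGraph configuration from a live node
    scp -r tigergraph@<live-node>:~/.ssh ~/
    scp tigergraph@<live-node>:~/.tg.cfg tigergraph@<live-node>:~/.bashrc ~/
    # Writing into /etc/security/limits.d requires root, so stage the file first
    scp tigergraph@<live-node>:/etc/security/limits.d/98-tigergraph.conf /tmp/
    sudo mv /tmp/98-tigergraph.conf /etc/security/limits.d/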
  3. Stop All Local Services:

    Stop all local services using the command:

    gadmin stop all --local

    Use gadmin stop all --local --ignore-error if the node has already failed and some services cannot be stopped cleanly.

  4. Unmount the device on which TigerGraph is installed using umount; the machine can then be shut down.
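
    A minimal sketch, assuming TigerGraph's device is mounted at /data (your mount point may differ):

    # On the machine being replaced, after all services are stopped
    sudo umount /data
    sudo shutdown -h now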

  5. Mount the device to the new machine with the same mount point.
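
    Continuing the sketch above, assuming the device shows up as /dev/xvdf on the new machine (the device name will vary by environment):

    # On the new machine, recreate the mount point and mount the device
    sudo mkdir -p /data
    sudo mount /dev/xvdf /data
    # Optionally add an /etc/fstab entry so the mount survives reboots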

  6. Point the hostname of the machine to be replaced to the IP address of the new machine by updating the record in the DNS web service provided by your cloud service provider. For instance:

    On AWS, assume the to-be-replaced node’s hostname is ip-172-31-26-200.us-west-2.compute.internal. Initially, the DNS server resolves it to its original IP address:

    nslookup ip-172-31-26-200.us-west-2.compute.internal
    ...
    Address: 172.31.26.200

    Assuming the new machine’s IP address is 172.31.23.44, the same hostname resolves to the new machine’s IP address after you update the record in AWS Route 53. (Note that you’ll need to create a private hosted zone on Route 53 and create a record before you can update it.)

    nslookup ip-172-31-26-200.us-west-2.compute.internal
    ...
    Address: 172.31.23.44
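
    If you prefer the AWS CLI to the console, the record can also be updated with aws route53 change-resource-record-sets. The hosted zone ID below is a placeholder; substitute the ID of your private hosted zone:

    # Upsert the A record so the old hostname points at the new machine
    aws route53 change-resource-record-sets \
      --hosted-zone-id Z0123456789EXAMPLE \
      --change-batch '{
        "Changes": [{
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "ip-172-31-26-200.us-west-2.compute.internal",
            "Type": "A",
            "TTL": 60,
            "ResourceRecords": [{"Value": "172.31.23.44"}]
          }
        }]
      }'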
  7. Start All Local Services on the New Machine:

    On the new machine, start all local services with:

    gadmin start all --local

    The cluster node replacement is now complete.
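
    To verify the replacement, you can check that all services report as up across the cluster:

    gadmin status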