How to Replace a Node in a Cluster

This guide outlines the procedure for replacing a node in a cluster, regardless of whether it is a High Availability (HA) cluster. If your system uses High Availability and you do not have a replacement node, refer to Removal of Failed Nodes for how to remove a failed node.

Prerequisites

  • TigerGraph is installed with HostName (refer to HostName Installation). Additionally, TigerGraph should be installed on a storage device that can be unmounted from the machine being replaced and mounted on the new machine.

  • This procedure applies to TigerGraph versions 3.10.1 and later. For versions prior to TigerGraph 3.10.1, please refer to node replacement in preceding versions.

  • The procedure requires pointing the hostname to a different IP address by changing its DNS record. On AWS, this is done with Route 53. With other cloud service providers, it is possible as long as they offer a similar DNS web service.
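
Because the procedure depends on the hostname resolving to the right address, it can help to check the resolution before starting and again after the DNS record is changed. The following is a minimal sketch, assuming a Linux host where `getent` is available. The helper names `lookup_ipv4` and `resolves_to`, the hostname `tg-node2.example.com`, and the address `10.0.0.12` are hypothetical placeholders, not part of TigerGraph.

```shell
# Sketch: check whether a hostname resolves to an expected IPv4 address.
# lookup_ipv4 and resolves_to are hypothetical helpers, not TigerGraph commands.

# Print the IPv4 addresses a hostname resolves to, one per line.
lookup_ipv4() {
  # getent ahostsv4 prints lines of the form "<ip> <socket type> [name]"
  getent ahostsv4 "$1" | awk '{print $1}' | sort -u
}

# Succeed only if the expected address is among the resolved addresses.
resolves_to() {
  lookup_ipv4 "$1" | grep -qxF "$2"
}

# Example usage (hypothetical hostname and address):
# if resolves_to tg-node2.example.com 10.0.0.12; then
#   echo "hostname points at the new node"
# else
#   echo "DNS record not updated yet, or an old answer is still cached"
# fi
```

Keeping the lookup in its own function makes it easy to swap `getent` for another resolver tool if it is not available on your system.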

Procedure

  1. Create a Linux User on New Nodes:

    Create a Linux user with the same username and password as the one used by your TigerGraph cluster.

    sudo useradd tigergraph
    sudo passwd tigergraph
  2. Prepare Files on New Nodes:

    Prepare the following files, which are not in the TigerGraph directory. They can be copied from any live node.

    ~/.ssh/
    ~/.tg.cfg
    ~/.bashrc
    /etc/security/limits.d/98-tigergraph.conf
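
Once the files have been copied, a quick check on the new node can confirm that none of them were missed. This is a minimal sketch; `missing_paths` is a hypothetical helper name, and the example assumes it is run as the TigerGraph Linux user so that `~` expands to that user's home directory.

```shell
# Sketch: report which of the given paths do not exist.
# missing_paths is a hypothetical helper, not a TigerGraph command.
missing_paths() {
  local path status=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      status=1
    fi
  done
  # Return non-zero if at least one path was missing.
  return "$status"
}

# Example usage on the new node, with the files listed in this step:
# missing_paths ~/.ssh ~/.tg.cfg ~/.bashrc \
#   /etc/security/limits.d/98-tigergraph.conf \
#   && echo "all files are in place"
```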
  3. Stop All Services:

    Stop all services using the command:

    gadmin stop all

    If the node has failed, use gadmin stop all --ignore-errors instead.

  4. Shut Down Single Node:

    Shut down the single node to be replaced.

  5. Unmount and Mount Disk:

    Unmount the disk containing the TigerGraph directory from the old node and mount it on the new machine. Ensure that the disk contains the required folders and that they are mounted on the new machine at the same mount points. The required folders can be listed with the following commands:

    gadmin config get System.AppRoot --file ~/.tg.cfg
    gadmin config get System.DataRoot --file ~/.tg.cfg
    gadmin config get System.LogRoot --file ~/.tg.cfg
    gadmin config get System.TempRoot --file ~/.tg.cfg
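
To confirm that the folders are present after mounting, the paths printed by the commands above can be checked one by one. The following is a minimal sketch, assuming each `gadmin config get` call prints a single path. `show_mount_points` is a hypothetical helper name, and the mount point is read from the last field of `df -P` output, which assumes mount points without spaces.

```shell
# Sketch: print the mount point backing each directory, or flag it as missing.
# show_mount_points is a hypothetical helper, not a TigerGraph command.
show_mount_points() {
  local dir status=0
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      # df -P prints a header line, then one line ending in the mount point.
      printf '%s -> %s\n' "$dir" "$(df -P "$dir" | awk 'NR==2 {print $NF}')"
    else
      printf '%s -> MISSING\n' "$dir"
      status=1
    fi
  done
  # Return non-zero if at least one directory was missing.
  return "$status"
}

# Example usage on the new machine, reusing the commands from this step:
# show_mount_points \
#   "$(gadmin config get System.AppRoot --file ~/.tg.cfg)" \
#   "$(gadmin config get System.DataRoot --file ~/.tg.cfg)" \
#   "$(gadmin config get System.LogRoot --file ~/.tg.cfg)" \
#   "$(gadmin config get System.TempRoot --file ~/.tg.cfg)"
```

Comparing this output with the same check run on a live node is one way to see whether the mount points match.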
  6. Get the IP Address of the New Node:

    Once you have the IP address of the new node, update the host list with it and reinitialize the cluster using the following commands:

    gadmin config entry System.HostList --file ~/.tg.cfg  # change the node IP
    gadmin init cluster --skip-stop
    gadmin init etcd  # only if an etcd node is being replaced