How to Replace a Node in a Cluster

This guide outlines the procedure for replacing a node in a non-High Availability (HA) cluster. If your system uses High Availability, refer to the documentation on removing a failed node in Removal of Failed Nodes.

Procedure

  1. Create a Linux User on New Nodes:

    Create a Linux user on the new node with the same username and password as the TigerGraph user on the existing cluster nodes.

    sudo useradd tigergraph
    sudo passwd tigergraph
  2. Prepare Files on New Nodes:

    Prepare the following files on the new node. They live outside the TigerGraph installation directory and can be copied from any live node.

    ~/.ssh/
    ~/.tg.cfg
    ~/.bashrc
    /etc/security/limits.d/98-tigergraph.conf
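    After copying, a quick existence check can confirm the files landed (a sketch; it only reports presence, it does not validate file contents):

    ```shell
    # Verify the files prepared in step 2 exist on the new node.
    for f in "$HOME/.ssh" "$HOME/.tg.cfg" "$HOME/.bashrc" \
             /etc/security/limits.d/98-tigergraph.conf; do
      if [ -e "$f" ]; then
        echo "present: $f"
      else
        echo "missing: $f"
      fi
    done
    ```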
  3. Stop All Services: Stop all services using the command:

    gadmin stop all

    If the command returns errors because the failed node is unreachable, use gadmin stop all --ignore-errors instead.

  4. Shut Down Single Node:

    Shut down the single node to be replaced.

  5. Unmount and Mount Disk:

    Unmount the disk containing the TigerGraph directories from the old node and mount it on the new machine at the same mount point. Verify that the mounted disk contains the root directories reported by the following commands:

    gadmin config get System.AppRoot --file ~/.tg.cfg
    gadmin config get System.DataRoot --file ~/.tg.cfg
    gadmin config get System.LogRoot --file ~/.tg.cfg
    gadmin config get System.TempRoot --file ~/.tg.cfg
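    Once the disk is mounted, the four root directories can be checked for presence (a sketch; the paths below are placeholders for the actual values returned by the gadmin config get commands above):

    ```shell
    # Substitute the real AppRoot/DataRoot/LogRoot/TempRoot values
    # reported by gadmin config get; these defaults are only examples.
    APP_ROOT=/home/tigergraph/tigergraph/app    # placeholder
    DATA_ROOT=/home/tigergraph/tigergraph/data  # placeholder
    LOG_ROOT=/home/tigergraph/tigergraph/log    # placeholder
    TEMP_ROOT=/home/tigergraph/tigergraph/tmp   # placeholder

    for d in "$APP_ROOT" "$DATA_ROOT" "$LOG_ROOT" "$TEMP_ROOT"; do
      [ -d "$d" ] && echo "found: $d" || echo "MISSING: $d"
    done
    ```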
  6. Update the Cluster Configuration with the New Node's IP Address:

    Replace the old node's IP address with the new node's in the cluster configuration, then re-initialize the cluster:

    gadmin config entry System.HostList --file ~/.tg.cfg  # update the entry holding the old node's IP
    gadmin init cluster --skip-stop
    gadmin init etcd  # only if an etcd node is being replaced
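    For orientation, System.HostList is a JSON array of node entries. A hedged illustration (field names as seen in typical TigerGraph 3.x configurations; the IP addresses and node IDs here are hypothetical) of the value after pointing node m2 at the replacement machine:

    ```json
    [
      {"Hostname": "10.0.0.1",  "ID": "m1", "Region": ""},
      {"Hostname": "10.0.0.99", "ID": "m2", "Region": ""}
    ]
    ```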