How to Replace a Node in a Cluster
This guide outlines the procedure for replacing a node in a cluster, whether or not it is a High Availability (HA) cluster. If your system uses High Availability and you do not have a replacement node, refer to the documentation on removing a failed node in Removal of Failed Nodes.
Prerequisites
-
TigerGraph is installed with HostName; refer to HostName Installation. Additionally, TigerGraph should be installed on a device that can be unmounted from the machine being replaced and mounted on the new machine.
-
This procedure applies to TigerGraph versions 3.10.1 and later. For versions prior to TigerGraph 3.10.1, please refer to node replacement in preceding versions.
-
The procedure requires pointing the hostname to a different IP address by changing its DNS record. On AWS, this is done with Route 53. Other cloud service providers also work, as long as they offer a similar DNS web service.
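As an illustration of the DNS change on AWS, the following sketch builds a Route 53 change batch that repoints a hostname's A record to the new node's IP address. The helper name make_change_batch, the hostname, the IP address, the TTL, and the hosted zone ID are all placeholders assumed for this example, not values from this guide.

```shell
#!/usr/bin/env bash
# Sketch: build a Route 53 change batch that repoints an A record.
# All concrete values (hostname, IP, TTL, zone ID) are hypothetical.

# Print an UPSERT change batch for the given hostname and IP address.
# Arguments: hostname, new IP address, optional TTL in seconds (default 60).
make_change_batch() {
  local host="$1" ip="$2" ttl="${3:-60}"
  cat <<EOF
{
  "Comment": "Repoint ${host} to replacement node",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "${host}",
        "Type": "A",
        "TTL": ${ttl},
        "ResourceRecords": [{ "Value": "${ip}" }]
      }
    }
  ]
}
EOF
}

# Example usage (requires the AWS CLI and suitable permissions):
#   make_change_batch node2.tg.example.com 10.0.0.42 > change.json
#   aws route53 change-resource-record-sets \
#     --hosted-zone-id ZEXAMPLE123 \
#     --change-batch file://change.json
```

A short TTL is used here on the assumption that you want the new address to propagate quickly; choose a value that suits your environment.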
Procedure
-
Create a Linux User on New Nodes:
Create a Linux user with the same username/password as your TigerGraph cluster.
sudo useradd tigergraph
sudo passwd tigergraph
-
Prepare Files on New Nodes:
Prepare the following files, which are located outside the TigerGraph directory. They can be copied from any live node.
~/.ssh/
~/.tg.cfg
~/.bashrc
/etc/security/limits.d/98-tigergraph.conf
-
Stop All Services:
Stop all services using the command:
gadmin stop all
If the node has failed, use the following command instead:
gadmin stop all --ignore-errors
-
Shut Down Single Node:
Shut down the single node to be replaced.
-
Unmount and Mount Disk:
Unmount the disk containing the TigerGraph directory from the old node and mount it on the new machine at the same mount point. Ensure that the required folders are present on the mounted disk. The following commands print their configured locations:
gadmin config get System.AppRoot --file ~/.tg.cfg
gadmin config get System.DataRoot --file ~/.tg.cfg
gadmin config get System.LogRoot --file ~/.tg.cfg
gadmin config get System.TempRoot --file ~/.tg.cfg
-
Get the IP Address of the New Node:
Retrieve the IP address of the new node, then update the host list with it using the following commands:
gadmin config entry System.HostList --file ~/.tg.cfg   # change the node IP
gadmin init cluster --skip-stop
gadmin init etcd   # only if an etcd node is being replaced
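The file and mount checks from the earlier steps can be sketched as a small helper that reports anything missing on the new node. The function name missing_paths is a hypothetical helper for this example and is not part of gadmin; the usage comments assume the gadmin config get commands shown above print a single path each.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check for the replacement node.
# missing_paths is a hypothetical helper, not a gadmin feature.

# Print every argument that does not exist as a file or directory.
# Prints nothing when all paths are present.
missing_paths() {
  local path
  for path in "$@"; do
    [ -e "$path" ] || printf '%s\n' "$path"
  done
}

# Example usage on the new node:
#   missing_paths ~/.ssh ~/.tg.cfg ~/.bashrc \
#     /etc/security/limits.d/98-tigergraph.conf
#
#   for key in System.AppRoot System.DataRoot System.LogRoot System.TempRoot; do
#     missing_paths "$(gadmin config get "$key" --file ~/.tg.cfg)"
#   done
```

Empty output suggests that the copied files and the mounted folders are in place; any printed path should be investigated before continuing.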