Installation, Cluster Configuration and Scale-out, License Activation
Installing Single-machine and Multi-machine systems
Version 2.4 - 2.5
This guide describes how to install the TigerGraph platform either as a single node or as a multi-node cluster. Please use the Table of Contents to go to the appropriate section of this guide.
If you are installing the Developer Edition, you can also install a Docker image or a virtual machine (VirtualBox) image. Your welcome email message will direct you to the appropriate resources.
This section is for New Installations. If you are updating from a previous version of the TigerGraph platform, first read the section below on Upgrading an Existing Installation .
Before you can install the TigerGraph system, you need the following:
One or more servers that meet the minimum Hardware and Software Requirements for operating system, memory, and hard disk space, with enough additional memory and storage for your graph data.
sudo or root privilege.
A license key provided by TigerGraph (not applicable to Developer Edition)
A TigerGraph system package .
If your package is a *tar.gz file, you may need to install some software prerequisites.
Use a BASH shell, otherwise there may be installation issues.
If you do not yet have a TigerGraph system package, you can request one at www.tigergraph.com/download/ .
If your package is a *tar.gz file, you also need to ensure your machine has the following software prerequisites.
Pre-install these basic Linux utilities on your server, if necessary:
tar
curl
ip
more
uuidgen
crontab
ssh/sshd
netstat
semanage
If you are installing a cluster, you also need the following:
ntpd
iptables/firewalld
If you will use the password login method (P method) instead of ssh key login method (K method) to install the TigerGraph platform, you will also need the following:
sshpass
The name of your package may vary, depending on the product edition (e.g., developer or enterprise) and the version (e.g., 2.0.1). For the examples here, we will assume the name is tigergraph-x.y.z.tar.gz. Substitute the name of your actual package file.
1. Extract the package.
2. A folder named tigergraph-<version>-offline (or tigergraph-<version>-developer) will be created. Change into this folder. To install with default settings, run the install.sh script.
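Concretely, steps 1 and 2 look like the sketch below. The package and folder names are placeholders for your actual version, and a dummy tarball plus stub installer are created here only so the commands run end to end; with a real package, skip straight to the tar xzf line.

```shell
# Create a dummy package so the sketch is self-contained (the real tarball
# comes from TigerGraph). Skip this part when you have a real package.
mkdir -p tigergraph-x.y.z-offline
printf '#!/bin/sh\necho "running TigerGraph installer"\n' > tigergraph-x.y.z-offline/install.sh
chmod +x tigergraph-x.y.z-offline/install.sh
tar czf tigergraph-x.y.z.tar.gz tigergraph-x.y.z-offline
rm -r tigergraph-x.y.z-offline

tar xzf tigergraph-x.y.z.tar.gz                    # step 1: extract the package
( cd tigergraph-x.y.z-offline && ./install.sh )    # step 2: enter the folder, run the installer
```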
The installer will ask you a few questions:
Do you agree to the License Terms and Conditions?
What is your license key? (not applicable to Developer Edition)
Do you want to use the default TigerGraph user name or select/create your own?
Do you want to use the default TigerGraph user password or create your own?
Do you want to use the default installation folder or select/create your own?
To see the default settings and learn how to customize the installation, read the Installation Options section below.
Some license keys are long – over 100 characters long. If you copy-and-paste the license key, be careful not to accidentally include an end-of-line character.
3. The installer concludes by using 'su' to switch to the tigergraph user account. To confirm correct operation:
1. Try the command gadmin status
If the system installed correctly, the command should report that zk, kafka, dict, nginx, gsql, and Visualization are up and ready. Since there is no graph data loaded yet, gse, gpe, and restpp are not initialized.
2. Try the command gsql --version
4. Basic installation is now finished! Please see Post-Installation Notes below.
The following default settings are applied if no parameters are specified:
The installer will create a user called tigergraph , with password tigergraph .
The root directory for the installation (referred to as <TigerGraph.Root.Dir>) is a folder called tigergraph located in the tigergraph user's home directory, i.e., /home/tigergraph/tigergraph .
The installation can be customized by passing command-line options to the install.sh script:
TigerGraph cluster configuration enables the graph database to be partitioned and distributed across multiple server nodes in a local network (not available in the Developer Edition). The cluster can either be a physical cluster or a network virtual cluster from a cloud service such as Amazon EC2 or Microsoft Azure.
The installation of TigerGraph 2.x has been validated on Amazon EC2, Microsoft Azure, and physical on-premises clusters. For Amazon EC2, please make sure all TCP ports are open among all cluster nodes; otherwise, services may not start.
In TigerGraph 2.x, the installation machine can be within or outside the cluster. If outside the cluster, the installation machine should be a Linux machine.
Currently, every machine in the cluster must have a sudo user with the same username and password or SSH key .
To install a high-availability cluster (with at least 3 nodes), please set HA.option to be "true" for non-interactive installation or answer "yes" to HA question for interactive installation.
Do not run the cluster installation script in sudo mode.
During cluster configuration, the user provides the following information:
The IP address for each server node, e.g., 172.30.3.2
The login credentials for the nodes.
Cluster installation begins with the user downloading the TigerGraph software package to any Linux machine in the cluster or with access to the cluster nodes (see notes above). When the user runs the installation script with the cluster option, it will either prompt for the cluster configuration information described above or, if the user requests non-interactive installation, read the configuration from a file named cluster_config.json located in the same folder as the platform package. The installer then installs the product on each of the cluster nodes and configures the cluster.
The two installation methods, interactive and non-interactive, are described below.
In interactive mode, the installer will first ask the same basic questions it asks for single-node installation. It will then ask how many machines are in your cluster. Then it will prompt for the IP addresses of the machines, assigning each machine an alias m1, m2, m3, etc. Next it will ask for sudo user name and credentials information. Last, it will ask the user if they accept some changes to the system. (See non-interactive mode installation below for details about user credentials.) A screenshot of interactive installation is shown below.
For non-interactive mode installation, the user must put all the settings into the file cluster_config.json before running the installer. This file goes in the folder with your install.sh file and other TigerGraph package files. The two key parameters to set are the following:
nodes.ip Each machine in the cluster is defined as a key:value pair, where the key is a machine alias m1, m2, m3, etc. NOTE: If you chose names other than m1, m2, etc., be sure to list them in alphanumeric order in the config file. The first machine ("m1") has a special role in some cases. Use as many key:value pairs as you need, placing the public IP addresses next to each key. The installer will auto detect the local IP addresses and use them to configure the system. If the installer detects more than one local IP address, it will ask the user to select one for configuration.
nodes.login Two login methods are supported:
SSH with password
SSH with key file
For SSH with password, you must provide the sudo/root user name and password. For SSH with key file, you may specify an AWS EC2 key file or another key file; if none is provided, the installer will use a default ssh key file such as ~/.ssh/id_rsa.
HA.option If enable.HA is set to "true", then the system will be configured for a replication factor of 2. For example, if your cluster has 6 machines, 3 will be used for one copy of the data, and 3 will be used for a replica copy of the data. More advanced configuration is possible after initial setup. See HA Cluster Configuration.
Below is a sample cluster_config.json file.
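The exact schema ships with your package; the fragment below only illustrates the parameters described above (nodes.ip, nodes.login, HA.option), and the field names inside nodes.login are hypothetical stand-ins, not confirmed keys.

```json
{
  "nodes.ip": {
    "m1": "172.30.3.2",
    "m2": "172.30.3.3",
    "m3": "172.30.3.4"
  },
  "nodes.login": {
    "method": "K",
    "user": "ubuntu",
    "key": "~/.ssh/id_rsa"
  },
  "HA.option": "true"
}
```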
The node names (e.g., m1, m2, etc.) MUST be given in alphanumeric order, because the first machine has a special role in some situations. In our documentation we will refer to this machine as m1.
Sometimes you may want further control over configuration details, such as the replication factor of individual components, security settings, and others. You may also want to install a new TigerGraph system that matches an existing TigerGraph system's setup. TigerGraph supports advanced configuration with the -a option, which can be used in either interactive or non-interactive mode. Advanced configuration overrides the default configuration and any related settings in cluster_config.json.
First, create a configuration file named adv_config.cfg. You can create this file manually, or, if you have an existing TigerGraph system, you can generate a file representing its configuration with the following command: gadmin --dump-config | grep replicas >> adv_config.cfg. If you create the file manually, make sure it is a valid YAML file.
For example, the adv_config.cfg file below sets up TigerGraph as a 3-node cluster with an HA replication factor of 3.
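A minimal sketch of such a file, assuming the replica keys named elsewhere in this guide (gpe.replicas, gse.replicas, kafka.num.replicas) are the ones to pin; confirm the exact keys against your own system's gadmin --dump-config output:

```yaml
# Hypothetical adv_config.cfg for a 3-node cluster with replication factor 3.
gpe.replicas: 3
gse.replicas: 3
kafka.num.replicas: 3
```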
Second, add the -a option to the installation command. Once the installation is done, verify that the system has the specified configuration.

Cluster Installation Commands
Do not run the cluster installation script with sudo permission.
After you have planned out your cluster configuration, you are ready to run the installer.
1. Extract the package.
2. A folder named tigergraph-<version>-offline will be created. Change into this folder. To run cluster installation in interactive mode, use the -c option. To run it in non-interactive mode, using the settings in the cluster_config.json file, use the -c and -n options (or the merged -cn).
By default, non-interactive mode installation does not set up NTP or a firewall. To direct the installation to set them up explicitly, use the SETUP_NTP and SETUP_FIREWALL environment variables.
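Putting the flags and variables above together, the invocations look like the sketch below. A stub installer stands in for the real install.sh so the sketch runs; the "true" values for the environment variables are an assumption, so check your package documentation for the accepted values.

```shell
# Stub stand-in for the packaged install.sh so the calls below can run;
# the real script comes from the extracted package.
printf '#!/bin/sh\necho "installer args: $*"\n' > install.sh
chmod +x install.sh

./install.sh -c                                        # interactive cluster installation
./install.sh -cn                                       # non-interactive, reads cluster_config.json
SETUP_NTP=true SETUP_FIREWALL=true ./install.sh -cn    # also set up NTP and the firewall
```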
3. The installer concludes by prompting the user to login to node m1 of the cluster and use 'su' to switch to the tigergraph user account. To confirm correct operation:
1. Try the command gadmin status from any machine in the cluster.
If the system installed correctly, the command should report that zk, kafka, dict, nginx, gsql, and Visualization are up and ready. Since there is no graph data loaded yet, gse, gpe, and restpp are not initialized.
2. Try the command gsql --version. The gsql command must be run on node m1 of the cluster, because the GSQL server is installed on m1 only.
4. Basic installation is now finished! Please see Post-Installation Notes below.
If you installed with the default password, we recommend that you change it now.
To perform additional customization, run gadmin --configure, followed by gadmin config-apply. In a cluster, both commands must be run on node m1, since only node m1 contains the pkg_pool resources. If you configured one or more of gpe.servers, gse.servers, restpp.servers, kafka.servers, zk.servers, dictserver.servers, gpe.replicas, or gse.replicas, you must reinstall the package by running the command gadmin pkg-install reset on node m1.
For more information, see the appropriate sections of the TigerGraph System Administrators Guide.
If you are a first-time user:
See our GSQL language tutorial for first-timer users: GSQL 101
Start designing with our visual interface; see the TigerGraph GraphStudio UI Guide.
To see more GSQL examples, see GSQL Demo Examples .
To get answers to common questions, see TigerGraph Knowledge Base and FAQs .
Developer Edition upgrade is not supported
The Developer Edition is not designed for upgrade from one version to another. It is also not possible to upgrade a Developer Edition installation to Enterprise Edition.
If your specific versions are not listed below, please upgrade by:
Download the latest version of TigerGraph to your system.
Extract the tarball.
Run the TigerGraph.bin file that was extracted from the tarball.
These steps assume that v2.1.7 is installed. To upgrade to v2.2 from a version older than v2.1.7, please upgrade to v2.1.7 first. If the tigergraph username and password have been changed, please have them ready; you will need them to update the system.
Download tigergraph-2.2.x-offline.tar.gz with user “tigergraph” and extract the tarball file.
Download the post_upgrade.sh script that is attached here.
Run tigergraph.bin under the same folder to upgrade to 2.2.x
Run the post-upgrade script that was downloaded in step 2 : post_upgrade.sh -u <sudoUser> [-P <sudoPass> | -K <sshKey> ] -p <tigergraphUserPass>
v2.0 can be upgraded to v2.1 Enterprise Edition. The data store format and GSQL language scripts in v2.0 are forward compatible to v2.1.
The data store format between 1.x and 2.x for single servers is forward compatible but not backward compatible. For a single server platform, users can upgrade from 1.x to 2.x without reloading data or recreating the graph schema. Some details of the GSQL language have changed, so some loading jobs and queries will need to be revised and reinstalled.
For a cluster configuration, direct upgrade from 1.x to 2.x is not supported at this time. Users interested in migrating from 1.x to 2.x need to export their data and metadata, install v2.x, and then reload data and metadata, with some small modifications. Please contact support@tigergraph.com for assistance.
Please consult the Release Notes for all versions between your current version and your target version (e.g., v2.1) for a summary of specification changes. Contact support@tigergraph.com for assistance.
Verify that your data store is compatible and is eligible for direct update / upgrade.
Review the specification changes and how they may affect your applications (loading jobs and queries).
Stop issuing new commands to your TigerGraph system and allow any operations to complete.
(Recommended) Backup your data, as a precaution.
Follow the procedure at the beginning of this document for installing a new system. The installer will automatically shut down your system and start it again.
Be sure to specify the same username as your current installation. Otherwise, if you use a different user name, it will be treated as a new installation, with an empty graph.
Pay attention to output messages during the installation process which may alert you to additional tasks or checks you should perform.
Run the command gsql to start the GSQL shell. The first time after an update, gsql performs two important operations:
Copies your catalog from your old installation to the new installation .
Compares the files in the backup /dev_<datetime>/gdk/gsql/src folder to the new /dev/gdk/gsql/src folder. Pay attention to any files residing in the old folder but not in the new folder. Review them and copy them to the new folder if appropriate. See the example below.
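The comparison in step 2 can also be done by hand with a loop like the sketch below. The paths are shortened to local dummy folders standing in for the old (/dev_<datetime>/gdk/gsql/src) and new (/dev/gdk/gsql/src) directories, and the file names are hypothetical.

```shell
# Dummy stand-ins for the old and new gsql/src folders; file names are
# hypothetical examples only.
mkdir -p old_src new_src
touch old_src/TokenLib.cpp old_src/common.h new_src/common.h

# List files present in the old src folder but missing from the new one.
for f in old_src/*; do
  name=$(basename "$f")
  [ -e "new_src/$name" ] || echo "only in old: $name"
done
```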
Revise and reinstall loading jobs and queries as needed.
Adding machines to a TigerGraph cluster, for distributed data and/or HA
Version 2.2 - 2.3 Copyright © 2019 TigerGraph. All Rights Reserved.
Cluster expansion allows the user to add new machine nodes to an existing cluster and to redistribute data, while the entire system is offline.
The current TigerGraph system must be installed in cluster mode, not single-node mode.
The total graph data storage space for the expanded cluster should be at least 3 times as large as the current Gstore disk usage.
During the expansion process, a backup copy of all the graph data files is created, plus additional working space is needed.
To check your existing gstore disk space:
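One way to check, assuming the default installation root from earlier in this guide and a hypothetical gstore path underneath it; substitute your own installation's location. A dummy directory is created here only so the command runs anywhere.

```shell
# Hypothetical gstore location; a dummy directory is created so the
# command below runs in any environment.
mkdir -p ~/tigergraph/gstore
du -sh ~/tigergraph/gstore    # total disk space used by gstore
```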
The new nodes are available.
The GBAR utility is used for cluster expansion. If this is your first time using GBAR, you must first run gbar config
. See the Backup and Restore guide. For a large system one of the key parameters is backup_core_timeout
. The default value is 5 hours. The config script gives guidance on estimating an appropriate value.
From the command line, switch to the <tigergraph_root_dir>/pkg_pool/syspre_pkg directory under the TigerGraph root directory (~/tigergraph/pkg_pool/syspre_pkg by default). In this directory, the utility script set_syspre.sh is used to set up the environment:
Run ./set_syspre.sh -h to see the usage:
For example, to set up the environment on a new node 192.168.1.6 with a sudo user called ubuntu and login key ubuntu_rsa, run the following command:
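As a sketch only: the real flags are listed by ./set_syspre.sh -h, and the -i/-u/-k names below are hypothetical stand-ins, not confirmed options. A stub script is created so the sketch runs.

```shell
# Stub stand-in for the packaged set_syspre.sh (the real one is in
# <tigergraph_root_dir>/pkg_pool/syspre_pkg). The -i/-u/-k flag names
# below are hypothetical; check ./set_syspre.sh -h for the actual usage.
printf '#!/bin/sh\necho "set_syspre args: $*"\n' > set_syspre.sh
chmod +x set_syspre.sh

./set_syspre.sh -i 192.168.1.6 -u ubuntu -k ubuntu_rsa
```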
Firewall check
The firewall configuration on the new nodes must be the same as on the existing nodes. Otherwise, the TigerGraph instances on the new nodes may not work properly.
For users using TigerGraph 2.2 with Ubuntu, you must comment out the following block at the beginning of .bashrc
in the tigergraph user's home directory, on every node.
When done, the environment including system-prerequisites and ssh keys for the TigerGraph system will be set up on the new nodes.
To expand the cluster, run gbar expand with a list of new nodes in the following format:
For example, the following command adds two nodes to the cluster:
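As a runnable sketch: a stub gbar stands in for the real utility (in a real cluster, just run gbar expand), and the comma-separated alias:IP list format is an assumption inferred from the m6/m7 example, not a confirmed syntax.

```shell
# Stub stand-in for gbar so this sketch runs end to end; on a real
# cluster, invoke the installed gbar utility directly.
printf '#!/bin/sh\necho "gbar $*"\n' > gbar
chmod +x gbar

# Hypothetical format: comma-separated alias:IP pairs for the new nodes.
./gbar expand m6:192.168.1.6,m7:192.168.1.7
```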
The command above will redistribute the data on all nodes including m6 and m7, so that each node has about the same amount of data.
GBAR will run the following checks for each new node:
The number of new nodes must be an integer multiple of max(gpe.replicas, gse.replicas).
Each new node alias must be a valid identifier.
Each new node's IP address must be accessible via ssh from the node where gbar expand is being run.
If the system does not have a schema or data, it will report a data integrity check error. You may ignore this warning.
Advanced expansion configuration options are possible. Contact TigerGraph Support for guidance.
Should any errors occur, GBAR will roll back to the state before node expansion started. As a failsafe, a backup copy of the data is kept, until expansion either succeeds or finishes rollback.
This guide provides step-by-step instructions for activating or renewing a TigerGraph license, by generating and installing a license key unique to that TigerGraph system. This document applies to both non-distributed and distributed systems. In this document, a cluster acting cooperatively as one TigerGraph database is considered one system.
A valid license key activates the TigerGraph system for normal operation. A license key has a built-in expiration date and is valid on only one system. Some license keys may apply other restrictions, depending on your contract. Without a valid license key, a TigerGraph system can perform certain administration functions, but database operations will not work. To activate a new license, a user first configures their TigerGraph system. The user then collects the fingerprint of the TigerGraph system (so-called license seed) using a TigerGraph-provided utility program. Then the collected materials are sent to TigerGraph or an authorized agent via email or web form. TigerGraph certifies the license based on the collected materials and sends a license key back to the user. The user then installs the license key on their system using another TigerGraph command. A new license key (e.g., one with a later expiration) can be installed on a live system that already has a valid license; the installation process does not disrupt database operations.
If your system is currently using an older string-based license key which does not use a license seed, please contact support@tigergraph.com for the procedure to upgrade to the new system-specific license type .
Note: Before beginning the license activation process, the TigerGraph package must be installed on each server, and the TigerGraph system must be configured with gadmin.
Collect the fingerprint of the whole TigerGraph system using the command tg_lic_seed, which can be executed on any machine in the system. The command tg_lic_seed packs all the collected data into a local file (named tigergraph_seed). When tg_lic_seed has completed successfully, it outputs the path of the collected data to the console.
Send the tigergraph_seed file to TigerGraph , either through our license activation web portal (preferred) or by email to license@tigergraph.com. If using email, please include the following information:
Company/Organization name
Contract number. If you do not know your contract number, please contact your sales representative or sales@tigergraph.com.
If the contract and license seed are in good order, a new license key file will be certified and sent back to you.
Copy the license key file to a directory on the TigerGraph system where the TigerGraph Linux user has read permission.
To install the license key, run the command tg_lic_install, specifying the path to the license key file.
If installation completes successfully, the message "install license successfully" is displayed in the console. Otherwise, the message "failed to install license" is displayed.
After a license key has been installed successfully on a TigerGraph system, the information of the installed license is available via the following REST API:
A TigerGraph system with High Availability (HA) is a cluster of server machines which uses replication to provide continuous service when one or more servers are not available or when some service components fail. TigerGraph HA service provides load balancing when all components are operational, as well as automatic failover in the event of a service disruption. One TigerGraph server consists of several components (e.g., GSE, GPE, RESTPP). The default HA configuration has a replication factor of 2, meaning that a fully functioning system maintains two copies of the data, stored on separate machines. In an advanced HA setup, users can set a higher replication factor.
An HA cluster needs at least 3 server machines. Machines can be physical or virtual. This is true even if the system has only one graph partition.
For a distributed system with N partitions (where N > 1), the system must have at least 2N machines.
The same version of the TigerGraph software package is installed on each machine.
HA configuration should be done immediately after system installation and before deploying the system for database use.
To convert a non-HA system to an HA system, the current version of TigerGraph requires that all the data and metadata be cleared, and all TigerGraph services be stopped. This limitation will be removed in a future release.
In the instructions below, all the commands need to be run as the tigergraph OS user, on the machine designated "m1" during the cluster installation.
Be sure you are logged in as the tigergraph OS user on machine "m1". Before setting up HA or changing HA configuration, the current TigerGraph system must be fully stopped. If the system has any graph data, clear out the data (e.g., with "gsql DROP ALL").
After the cluster installation, create an HA configuration using the following command:
This command will automatically generate a configuration for a distributed (partitioned) database with an HA system replication factor of 2. Some individual components may have a higher replication factor .
Sample output:
If the HA configuration fails, e.g., if the cluster does not satisfy the HA requirements, then the command will stop running with a warning.
In this optional additional step, advanced users can run several "gadmin --set" commands to control the replication factor and manually specify the host machine for each TigerGraph component. The table below shows the recommended settings for each component. See the later example section for different configuration cases.
Example: There is a 3-machine cluster m1, m2 and m3. Kafka, GPE, GSE and RESTPP are all on m1 and m2, with replication factor 2. This is a non-distributed graph HA setup.
Once the HA configuration is done, proceed to install the package from the first machine (named “m1” in the cluster installation configuration).
The table below shows the configurations for common setups. Note that if you are converting the system from another configuration, you must stop the old TigerGraph system first.
Starting from version 2.1, configuring an HA cluster is integrated into platform installation; please check that document for details.
Follow the instructions in the document to install the TigerGraph system in your cluster.
Component | Configuration Key | Suggested Number of Hosts | Suggested Number of Replicas |
ZooKeeper | zk.servers | 3 or 5 | - |
Dictionary Server | dictserver.servers | 3 or 5 | - |
Kafka | kafka.servers | same as GPE | - |
Kafka | kafka.num.replicas | - | 2 or 3 |
GSE | gse.servers | every host | - |
GSE | gse.replicas | - | 2 |
GPE | gpe.servers | every host | - |
GPE | gpe.replicas | - | 2 |
REST | restpp.servers | every host | - |
System Goal | Cluster Configuration (number of servers in cluster is X) | How To (A, B, C, etc. refer to the steps in the section above) |
Non-distributed graph with HA | Each server machine holds the complete graph. | |
Distributed graph without HA | Graph is partitioned among all the cluster servers. | |
Distributed graph with HA | Graph is partitioned with replication factor N. The number of partitions Y equals X/N. | |