Installation, Cluster Configuration and Scale-out, License Activation
User Privileges and Authentication, LDAP, Single Sign-on
Encryption for Data at Rest and Data in Motion
Admin Portal, gadmin utility, GBAR backup and restore
Creation and management of multiple users and roles is available in the Enterprise Edition only.
The TigerGraph platform provides a complete and robust feature set to manage and control user privileges and authentication for GSQL operations:
Creation and management of multiple TigerGraph users
Granting to each user a role on a particular graph, each role entailing a set of privileges
OAuth 2.0-style user authentication
Extensible framework, so that additional security- and user- related capabilities can be added in future releases
TigerGraph users exist only within the TigerGraph platform; they are different from operating system users. When the system is first installed, an initial user is automatically created. The default name for this initial user is tigergraph, with password tigergraph. This user has full administrative privileges and can create additional users and set their privileges (see Roles and Privileges). For simplicity, we will refer to this initial superuser as the tigergraph user.
If user authentication is enabled (see the section Enabling and Using User Authentication), the TigerGraph system will execute a requested operation only if the requester provides credentials for a user who has the privilege to perform the requested operation.
The TigerGraph system offers two options for credentials.
username-password pair
a token: a unique 32-character string which can be used for REST++ requests, with an expiration date.
When the TigerGraph platform is first installed, user authentication is disabled. The installation process creates a gsql superuser who has the name tigergraph and password tigergraph. As long as user tigergraph's password is tigergraph, gsql authentication remains disabled. This is designed for user convenience in single-user configurations or installations which do not require security, such as demo and training installations. The behavior is compatible with early TigerGraph versions which did not support multiple roles or multiple graphs.
Because there are two ways to access the TigerGraph system, either through the GSQL shell or through REST++ requests, there are two steps needed to set up a secure system with user authentication for both points of entry:
To enable user authentication for GSQL: change the password of the tigergraph user to something other than tigergraph.
To enable OAuth 2.0 authentication for REST++, use the gadmin program to configure the RESTPP.Authentication parameter. See details below.
More details about each of these two steps are below.
To enable user authentication for GSQL: change the password of the tigergraph user to something other than tigergraph. See ALTER PASSWORD below.
To run a single GSQL command or command file, the user must provide their username and password, and the graph must be specified. To specify the username on the command line, use the -u option. The user can also provide their password with the -p option. If the password is not provided on the command line, the system will prompt the user for it, so this method is only appropriate for interactive use. If -u is not used, the system assumes the request is coming from the default tigergraph user and will prompt for tigergraph's password (assuming GSQL authentication is enabled). Note that if -u is not used and authentication is disabled, the system simply responds to all requests, as it did in earlier versions (unprotected administrative mode).
Use the -g parameter to specify which graph on which to operate.
To enter the GSQL interactive shell, simply omit the <command> from the command line. The user does not need to provide credentials again inside the shell. The example below shows two users entering the shell with their passwords. The user does not need to specify a graph to enter the interactive shell.
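As a sketch (the user names victor and pandey, the password, and the graph name hc_graph below are hypothetical, chosen only for illustration):

```shell
# Run a single GSQL command as user "victor" on graph "hc_graph";
# with -p omitted, GSQL prompts for the password interactively.
gsql -u victor -g hc_graph "ls"

# Non-interactive use: supply the password with -p.
gsql -u victor -p victor_pwd -g hc_graph "ls"

# Enter the interactive shell (no graph needs to be specified):
gsql -u pandey
```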
The REST++ server implements OAuth 2.0-style authorization as follows: Each user can create one or more secrets (unique pseudorandom strings). Each secret is associated with a particular user and the user's privileges for a particular graph. Anyone who has this secret can invoke a special REST endpoint to generate authorization tokens (other pseudorandom strings). An authorization token can then be used to perform TigerGraph database operations via other REST endpoints. According to OAuth 2.0 protocol, each token will expire after a certain period of time. The TigerGraph default lifetime for a token is 1 month.
Each REST++ request should contain an authorization token in the HTTP header. The REST++ server reads the header. If the token is not valid, REST++ will refuse to run the query and instead will return an authentication error.
The token authentication of REST++ can be turned on by using the following commands:
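The exact gadmin invocation varies across TigerGraph versions; the following is only a sketch for a 2.x-style installation, using the RESTPP.Authentication parameter named earlier. Consult gadmin --help on your system for the exact syntax.

```shell
# Sketch only -- gadmin syntax differs between TigerGraph versions.
gadmin --configure RESTPP.Authentication   # prompts for the new value
gadmin config-apply                        # push the new configuration
gadmin restart                             # restart affected services
```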
A user must have a secret before they create a token. Secrets are generated in GSQL (see CREATE SECRET below). The endpoint GET /requesttoken
is used to create a token. The endpoint has two parameters:
secret (required): the user's secret
lifetime (optional): the lifetime for the token, in seconds. The default is one month, approximately 2.6 million seconds.
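For example, a token can be requested with curl as follows. The hostname, default REST++ port 9000, and the secret value are placeholders for your own deployment's values.

```shell
# Exchange a secret for a token; lifetime (in seconds) is optional.
curl -X GET "http://localhost:9000/requesttoken?secret=YOUR_SECRET&lifetime=1000000"
```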
Once REST++ authentication is enabled, a token should always be included in the HTTP header. If you are using curl to format and submit your REST++ requests, then use the following syntax:
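A plausible sketch, passing the token as a Bearer credential in the Authorization header (TOKEN_STRING, the hostname, and the endpoint are placeholders):

```shell
# TOKEN_STRING stands for a token obtained from /requesttoken.
curl -H "Authorization: Bearer TOKEN_STRING" "http://localhost:9000/echo"
```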
When you use the RUN QUERY command in the GSQL language, this triggers a curl command within the GSQL system. GSQL will automatically use (and generate, if necessary) a token in the curl request for an authorized user.
Authorization for gadmin
Currently, authorization for the gadmin program comes from Linux and is not related to GSQL authorization. In short, only the Linux TigerGraph user can run gadmin.
Details: During installation, the user selects a name and password for the TigerGraph Linux user. The default user and password are tigergraph and tigergraph, respectively. This user is a Linux user; the installer will create a Linux account if needed. Only the TigerGraph Linux user can run gadmin. This Linux user is unrelated to the TigerGraph default user mentioned in the GSQL Authentication section.
The TigerGraph system includes seven predefined roles: observer, queryreader, querywriter, designer, globaldesigner, admin, and superuser. Each role has a fixed and logical set of privileges to perform operations. These roles form a hierarchy, with superuser at the top. Broadly speaking,
An observer (formerly "public") can log on, view the schema and other catalog details for its designated graph, and change their own password.
A queryreader has all observer privileges, and can also run existing loading jobs and queries for its designated graph.
A querywriter has all queryreader privileges, and can also create queries and run data-manipulation commands on its designated graph.
A designer (formerly "architect") has all querywriter privileges, and can also modify the schema and create loading jobs for its designated graph.
A globaldesigner has all designer privileges, and can also create global schema objects as well as graphs. Additionally, this role can drop graphs created by the same user, but cannot run the CLEAR GRAPH STORE command.
An admin has all designer privileges, and can also create or drop users and grant or revoke roles for its designated graph. That is, an admin can control the existence and privileges of other users on its graph.
A superuser automatically has admin privileges on all graphs, and can also create global vertex and edge types, create multiple graphs, and clear the database.
The detailed permissions for each role are listed in the following table. Except for the superuser and globaldesigner, the scope of privilege is always limited to one's own graph. In some cases, the behavior of the operation depends on one's privilege level. More detailed descriptions of the User Management commands are given later in this document. For details about the Graph Definition, Loading, Querying, and Modification commands, see the GSQL Language Reference documents.
Commands not listed above are accessible by default to any role at or above observer.
The TigerGraph installation process creates one user called tigergraph who has the superuser role. The superuser role has full privileges to perform any action, including creating or removing other users and assigning roles to other users. A superuser can create other superusers, who would also have full privileges.
The user tigergraph is permanent. It cannot be dropped by another admin user.
Most of the commands in this section can be run only by a superuser or an admin user. The exception is SHOW USER: any user can display their own profile.
If a username contains characters other than ASCII alphanumeric characters, it is recommended that the name be enclosed in backquote characters, to ensure that the name is treated as a literal string. This applies to the CREATE/DROP USER and GRANT/REVOKE ROLE commands.
Required privilege: superuser, admin. Create a new user. GSQL will prompt for the user name and password.
Required privilege: superuser, admin. Delete the listed users.
The command takes effect with no warning and cannot be undone.
Required privilege: any. Display a user's name, role, secret, and token. Non-admin/superuser users see only their own information; admin/superuser users see information for all users.
Required privilege: superuser, admin. Grant a role (or revoke a role) for a user, which adds (or removes) privileges.
The example below grants the queryreader role to two users, revokes it from one of them (jk), and then grants the querywriter role to both users.
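The sequence just described could look like the following in GSQL (the second user name, fluffy, and the graph name Hogwarts are hypothetical):

```gsql
GRANT ROLE queryreader ON GRAPH Hogwarts TO jk, fluffy
REVOKE ROLE queryreader ON GRAPH Hogwarts FROM jk
GRANT ROLE querywriter ON GRAPH Hogwarts TO jk, fluffy
```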
Even if a user is granted the superuser role, all previously granted roles for that user are still displayed.
When user authentication is enabled, the TigerGraph system will execute a requested operation only if the requester provides credentials for a user who has the privilege to perform the requested operation.
The TigerGraph system offers two options for credentials.
a username-password pair
a token: a unique 32-character string which can be used for REST++ requests. A token expires 1 month from the date of creation by default
The following set of commands are used to create and manage passwords, authentication secrets, and authentication tokens.
Like any other GSQL commands, these require the user to supply credentials. In order to create a secret, the user must supply their password.
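For example, a secret might be created in the GSQL shell as follows (the graph name and the secret alias are placeholders; GSQL prompts for the user's password):

```gsql
USE GRAPH example_graph
CREATE SECRET my_secret_alias
```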
Required privilege: any, to change one's own password; superuser/admin, to change another user's password.
When an admin/superuser user creates a new user, the admin/superuser user sets the user's initial password. Afterward, a user can change their own password.
Moreover, an admin/superuser user can change any user's password. For example, to change hermione's password, the command is ALTER PASSWORD hermione.
The Lightweight Directory Access Protocol (LDAP) is an industry-standard protocol for accessing and maintaining directory information services across a network. Typically, LDAP servers are used to provide centralized user authentication service. The TigerGraph system supports LDAP authentication by allowing a TigerGraph user to log in using an LDAP username and credentials. During the authentication process, the GSQL server connects to the LDAP server and requests the LDAP server to authenticate the user.
GSQL LDAP authentication supports any LDAP server that follows LDAPv3 protocol. StartTLS/SSL connection is also supported.
SASL authentication is not yet supported. Some LDAP servers are configured to require a client certificate upon connection. Client certificates are not yet supported in GSQL LDAP authentication.
In order to manage the user roles and privileges, the TigerGraph GSQL server employs two concepts—proxy user and proxy group.
A proxy user is a GSQL user created to correspond to an external LDAP user. When operating within GSQL, the external LDAP user's roles and privileges are determined by the proxy user.
A proxy group is a GSQL user group that is used to manage a group of proxy users who share similar properties/attributes in LDAP.
An existing LDAP user can log in to GSQL only when the user matches at least one of the existing proxy groups' criteria. Once the criteria are satisfied, a proxy user will be created for the LDAP user. The roles and privileges of the proxy user are at least as permissive as those of the proxy group(s) they belong to. It is also possible to change the roles of a specific proxy user independently. When the roles and privileges of a proxy group change, the roles and privileges of all the proxy users belonging to that proxy group change accordingly.
To configure a TigerGraph system to use LDAP, there are two main configuration steps:
Configure the LDAP Connection.
Configure GSQL Proxy Groups and Users.
To enable and configure LDAP, run three commands:
1. Configure LDAP. The gadmin program will prompt the user for the settings for several LDAP configuration parameters.
2. Apply the configuration.
3. Restart the GSQL server.
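The three steps above might look like the following sketch for a 2.x-style gadmin; the exact command names may differ by version (the security.ldap prefix matches the configuration entries described later in this section).

```shell
# Sketch only -- gadmin syntax differs between TigerGraph versions.
gadmin --configure security.ldap   # 1. prompts for each LDAP parameter
gadmin config-apply                # 2. apply the configuration
gadmin restart gsql                # 3. restart the GSQL server
```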
An example configuration is shown below.
Below is an explanation of each configuration parameter.
Set to "true" to enable LDAP; "false" to disable LDAP.
Hostname of LDAP server.
Port of LDAP server.
Base DN (Distinguished Name), in order for GSQL to perform the LDAP search.
This specifies the LDAP attribute to search when the GSQL server looks up the usernames in the LDAP server upon login. For example, in the configuration shown above, when a user logs in with the "-u john" option, the GSQL server will search the "uid" attribute in LDAP to find "john" and check the credentials only after "john" is found.
These options are needed when the LDAP server is not publicly readable. In this case, the admin DN and corresponding password need to be specified in order for the GSQL server to connect to the LDAP server.
When set to "none", TigerGraph uses insecure LDAP connection. This can be changed to a secure connection protocol: "starttls" or "ssl".
When starttls or ssl is used, a truststore path as well as its password needs to be configured.
Currently, the TigerGraph system supports two truststore formats: pkcs12 and jks.
When specified, the GSQL server will blindly trust any LDAP server.
This section explains how to configure a GSQL proxy group in order to allow LDAP user authentication.
A GSQL proxy group is created by the CREATE GROUP command with a given proxy rule. For example, assume there is an attribute called "role" in the LDAP directory, and "engineering" is one of the "role" attribute values. We can create a proxy group with the proxy rule "role=engineering". Different roles can then be assigned to the proxy group. An example is shown below. When a user logs in, the GSQL server searches for the user's entry in the LDAP directory. If the user's LDAP entry matches the proxy rule of an existing proxy group, a proxy user is created, and the user is logged in as that proxy user.
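A sketch of such a proxy group, using the "role=engineering" rule above (the group name and graph name are hypothetical, and the exact CREATE GROUP syntax may vary by version):

```gsql
CREATE GROUP engineering_group PROXY "role=engineering"
GRANT ROLE querywriter ON GRAPH example_graph TO engineering_group
SHOW GROUP engineering_group
```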
The SHOW GROUP command will display information about a group. The DROP GROUP command deletes the definition of a group.
Only users with the admin and superuser role can create, show, or drop a group.
Nothing needs to be configured for a proxy user. As long as the proxy rule matches, the proxy user will be automatically created upon login. A proxy user is very similar to a normal user. The minor differences are that a proxy user cannot change their password in GSQL and that a proxy user comes with default roles inherited from the proxy group that they belong to.
Admin_dn is the "distinguished name" of an LDAP entry. In LDAP, "distinguished name" is often abbreviated as dn. When configuring this field, a dn entry with read permission on the LDAP directory is expected. Configuring a dn with no read permission will result in an error. Not configuring this field will likely result in an error, since the LDAP server is typically not publicly readable. Please note that only the dn field will be accepted for this entry; all other entries will result in an authentication error. The corresponding password for the configured dn should also be set correctly in the configuration entry "security.ldap.admin_password".
It depends on what type of protocol your LDAP server uses. SSL/TLS is very common in enterprise use today. When SSL is used, the port is typically 636 instead of default port 389.
You need to configure the truststore when SSL/TLS is used by the LDAP server. The truststore's path, password, and format need to be configured accordingly. We support two formats: JKS and PKCS12. JKS is the Java KeyStore format; the corresponding certificates for the LDAP server need to be imported into the JKS for successful authentication. Different truststore formats are typically interchangeable.
This might be the case if SSL/TLS is enabled from the LDAP server side but you don't have a certificate. You can set "security.ldap.secure.trust_all" to true to bypass the SSL/TLS certificate checking.
"Parameter error" means some of the LDAP configurations are not set properly. Most often it is because admin_dn, admin_password, or the login username and password are not set correctly. Unfortunately, we cannot know exactly what field is wrong because the LDAP server side does not respond back with such detail.
TigerGraph's role-based access control system naturally extends to a multiple graph system: A user is granted a role on a particular graph. The superuser role (new in TigerGraph 1.2) is defined for administration of the entire unified supergraph.
A superuser can create and manage users globally, including creating admin users for local graphs. An admin can create and manage users within their local graph.
The ON GRAPH clause is required unless the role being granted/revoked is superuser.
A user can have more than one role. For example, jk can be a queryreader on the Hogwarts graph and a querywriter on the London graph.
In order to choose and specify your LDAP configuration settings, you must understand some basic LDAP concepts. One reference for LDAP concepts is .
A search filter is optional. When configured, the search is only performed for the LDAP entries that satisfy the filter. The filter must strictly follow LDAP filter format, i.e., the condition must be wrapped by parentheses, etc. A description of the different types of filters is available at . The official specification for LDAP filters is available at .
Congratulations! This means the LDAP is working. However, TigerGraph cannot find a matching rule for the login user. Please create a proxy group for the user. See documents for creating a proxy group .
| Command Type | Operations | superuser | admin | globaldesigner | designer | querywriter | queryreader | observer |
|---|---|---|---|---|---|---|---|---|
| Status | ls | x | x | x | x | x | x | x |
| User Management | Create/Drop User | x | x | - | - | - | - | - |
| | Show User | x | x | x | x | x | x | x |
| | Alter (Change) Password | x | x | x | x | x | x | x |
| | Grant/Revoke Role | x | x | - | - | - | - | - |
| | Create/Drop/Show Secret | x | x | x | x | x | x | - |
| Schema Design | Create/Drop Vertex/Edge/Graph | x | - | x | - | - | - | - |
| | Clear Graph Store | x | - | - | - | - | - | - |
| | Drop All | x | - | - | - | - | - | - |
| | Use Graph | x | x | x | x | x | x | x |
| | Use Global | x | x | x | x | x | x | x |
| | Create/Run Global Schema_Change Job | x | - | x | - | - | - | - |
| | Create/Run Schema_Change Job | x | x | x | x | - | - | - |
| Loading and Querying | Create/Drop Loading Job | x | x | x | x | - | - | - |
| | Create/Interpret/Install/Drop Query | x | x | x | x | x | - | - |
| | Typedef | x | x | x | x | x | - | - |
| | Offline to Online Job Translation | x | x | x | x | x | - | - |
| | Run Query | x | x | x | x | x | x | - |
| | Run Loading Job | x | x | x | x | x | x | - |
| Data Modification | Upsert/Delete/Select Commands | x | x | x | x | x | - | - |
The Single Sign-On (SSO) feature in TigerGraph enables you to use your organization's identity provider (IDP) to authenticate users to access TigerGraph GraphStudio and Admin Portal UI.
Currently we have verified the following identity providers which support SAML 2.0 protocol:
To request support for additional IDPs, please contact sales@tigergraph.com and submit a feature request.
In order to use Single Sign-On, you need to perform four steps:
Configure your identity provider to create a TigerGraph application.
Provide information from your identity provider to enable TigerGraph Single Sign-On .
Create user groups with proxy rules to authorize Single Sign-On users.
Change the password of the tigergraph user to be other than the default, if you haven't done so already.
We assume you already have TigerGraph up and running, and that you can access the GraphStudio UI through a web browser using the URL:
http://tigergraph-machine-hostname:14240
If you enabled SSL connection, change http to https. If you changed the nginx port of the TigerGraph system, replace 14240 with the port you have set.
Here we provide detailed instructions for identity providers that we have verified. Please consult your IT or security department for how to configure the identity provider for your organization if it is not listed here.
After you finish configuring your identity provider, you will get an Identity Provider Single Sign-On URL, an Identity Provider Entity Id, and an X.509 certificate file idp.cert. You need these three items to configure TigerGraph next.
After logging into Okta as the admin user, click the Admin button at the top-right corner.
Click Add Applications in the right menu.
Click Create New App button in the left toolbar.
In the pop up window, choose SAML 2.0 and click Create .
Input TigerGraph (or whatever application name you want to use) in App Name , and click Next . Upload a logo if you like.
Enter the Assertion Consumer Service URL / Single sign on URL , and SP Entity ID .
Both are URLs in our case. You need to know the hostname of the TigerGraph machine. If you can visit GraphStudio UI through a browser, the URL contains the hostname. It can be either an IP or a domain name.
The Assertion Consumer Service URL , or Single sign on URL, is
http://tigergraph-machine-hostname:14240/sso/saml/acs
The SP entity id URL is:
http://tigergraph-machine-hostname:14240/sso/saml/meta
Scroll to the bottom for Group Attribute Statements. Usually you want to grant roles to users based on their user group. You can give a name to your attribute statement; here we use group . For filter, we want to return all group attribute values of all users, so we use Regex .* as the filter. Click Next after you set up everything.
In the final step, choose whether you want to integrate your app with Okta or not. Then click Finish .
Now your Okta identity provider settings are finished. Click View Setup Instructions button to gather information you will need to setup TigerGraph Single Sign-On.
Here you want to save Identity Provider Single Sign-On URL and Identity Provider Issuer (usually known as Identity Provider Entity Id ). Download the certificate file as okta.cert, rename it as idp.cert , and put it somewhere on the TigerGraph machine. Let's assume you put it under your home folder: /home/tigergraph/idp.cert. If you installed TigerGraph in a cluster, you should put it on the machine where the GSQL server is installed (usually it's the machine whose alias is m1).
Finally, return to previous page, go to the Assignments tab, click the Assign button, and assign people or groups in your organization to access this application.
After logging into Auth0, click Clients in the left navigation bar, and then click CREATE CLIENT button.
In the pop-up window, enter TigerGraph (or whatever application name you want to use) in the Name input box. Choose Single Page Web Application , and then click the CREATE button.
Click Clients again. In the Shown Clients list, click the settings icon of your newly created TigerGraph client.
Scroll down to the bottom of the settings section, and click Show Advanced Settings .
Click the Certificates tab and then click DOWNLOAD CERTIFICATE. In the chooser list, choose CER. Rename the downloaded file as idp.cert , and put it somewhere on the TigerGraph machine. Let's assume you put it under your home folder: /home/tigergraph/idp.cert. If you installed TigerGraph in a cluster, you should put it on the machine where the GSQL server is installed ( usually it's the machine whose alias is m1 ).
Click the Endpoints tab, and copy the text in the SAML Protocol URL text box. This is the Identity Provider Single Sign-On URL that will be used to configure TigerGraph in an upcoming step.
Scroll up to the top of the page, click the Addons tab, and switch on the toggle at the right side of the SAML2 card.
In the pop-up window, enter the Assertion Consumer Service URL in the Application Callback URL input box:
http://tigergraph-machine-hostname:14240/sso/saml/acs
Scroll down to the end of the settings JSON code, click the DEBUG button, and log in as any existing user in your organization in the pop-up login page.
If the login succeeds, the SAML response will be shown in decoded XML format. Scroll down to the attributes section. Here you will see some attribute names, which you will use to set proxy rules when creating groups in an upcoming configuration step.
Return to the previous pop-up window and click the Usage tab. Copy the Issuer value. This is the Identity Provider Entity Id that will be used to configure TigerGraph in an upcoming step.
Click the Settings tab, scroll to the bottom of the pop-up window, and click the SAVE button. Close the pop-up window.
According to the SAML standard trust model, a self-signed certificate is acceptable. This is different from configuring an SSL connection, where a CA-authorized certificate is considered mandatory if the system goes to production.
There are multiple ways to create a self-signed certificate. One example is shown below.
First, use the following command to generate a private key in PKCS#1 format and a X.509 certificate file. In the example below, the Common Name value should be your server hostname (IP or domain name).
Second, convert your private key from PKCS#1 format to PKCS#8 format:
Finally, change the certificate and private key file to have permission 600 or less. (The tigergraph user can read or write the file; no other user has any permission.)
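Putting the three steps together, one possible openssl sequence is the following sketch. The Common Name value "tigergraph-machine-hostname" and the file names are placeholders; substitute your server's hostname or IP.

```shell
# 1. Generate an RSA private key and a self-signed X.509 certificate;
#    -subj supplies the Common Name non-interactively.
openssl genrsa -out sp-pkcs1.key 2048
openssl req -x509 -new -key sp-pkcs1.key -days 730 \
    -subj "/CN=tigergraph-machine-hostname" -out sp.cert

# 2. Convert the private key to PKCS#8 format.
openssl pkcs8 -topk8 -inform PEM -outform PEM -nocrypt \
    -in sp-pkcs1.key -out sp.key

# 3. Restrict permissions so only the owner can read or write the files.
chmod 600 sp.cert sp.key sp-pkcs1.key
```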
From a TigerGraph machine, run the following command: gadmin config entry Security.SSO.SAML
Answering the questions is straightforward; an example is shown below.
Since v2.3, the requirements for the Security.SSO.SAML.SP.Hostname parameter changed. The value must be a full URL, starting with the protocol (such as http) and ending with the port number.
Since v3.1, the requirements for the Security.SSO.SAML.SP.X509Cert and Security.SSO.SAML.SP.PrivateKey parameters changed. The values must be the content of the X.509 certificate and the private key, respectively.
We change Security.SSO.SAML.ResponseSigned to false because some identity providers (e.g., Auth0) don't support signing the assertion and the response at the same time. If your identity provider supports signing both, we strongly suggest you leave it as true.
After making the configuration settings, apply the config changes, and restart gsql.
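For a v3-style installation, the apply-and-restart step would look like the following sketch:

```shell
gadmin config apply -y    # apply the pending configuration changes
gadmin restart gsql -y    # restart the GSQL service
```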
In order to authorize Single Sign-On users, you need to create user groups in GSQL with proxy rules and grant roles on graphs for those user groups.
In TigerGraph Single Sign-On, we support two types of proxy rules. The first type is nameid equations; the second type is attribute equations. Attribute equations are more commonly used because usually user group information is transferred as attributes to your identity provider SAML assertions. In the Okta identity provider configuration example, it is transferred by the attribute statement named group . By granting roles to a user group, all users matching the proxy rule will be granted all the privileges of that role. In some cases if you want to grant one specific Single Sign-On user some privilege, you can use a nameid equation to do so.
For example, if you want to create a user group SuperUserGroup that contains the user with nameid admin@your.company.com only, and grant superuser role to that user, you can do so with the following command:
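A plausible sketch of that command sequence (the exact CREATE GROUP syntax may vary by version):

```gsql
CREATE GROUP SuperUserGroup PROXY "nameid=admin@your.company.com"
GRANT ROLE superuser TO SuperUserGroup
```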
Suppose you want to create a user group HrDepartment which corresponds to the identity provider Single Sign-On users having the group attribute value "hr-department", and want to grant the queryreader role to that group on the graph HrGraph:
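A sketch, assuming the SAML attribute statement is named group as in the Okta example above:

```gsql
CREATE GROUP HrDepartment PROXY "group=hr-department"
GRANT ROLE queryreader ON GRAPH HrGraph TO HrDepartment
```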
Don't forget to enable User Authorization in TigerGraph by changing the password of the default superuser tigergraph to other than its default value. If you do not change the password, then every time you visit the GraphStudio UI, you will automatically log in as the superuser tigergraph.
Now you have finished all configurations for Single Sign-On. Let's test it.
Visit the GraphStudio UI in your browser. You should see a Login with SSO button appear on top of the login panel:
If after redirecting back to GraphStudio, you return to the login page with the error message shown below, that means the Single Sign-On user doesn't have access to any graph. Please double check your user group proxy rules, and roles you have granted to the groups.
If your Single Sign-On fails with the error message shown below, that means either some configuration is inconsistent between TigerGraph and your identity provider, or something unexpected happened.
You can check your GSQL log to investigate. First, find your GSQL log file with the following:
Then, grep the SAML authentication-related logs:
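One plausible way to do this on a v3-style installation (the log path shown is illustrative; use the path that gadmin reports on your system):

```shell
# List the locations of the GSQL log files.
gadmin log gsql

# Then grep the SAML authentication-related entries, e.g.:
grep -i "saml" /home/tigergraph/tigergraph/log/gsql/GSQL.log | tail -n 50
```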
Focus on the latest errors. Usually the text is self-descriptive. Follow the error message and try to fix TigerGraph or your identity provider's configuration. If you encounter any errors that are not clear, please contact support@tigergraph.com .
Actual hardware requirements will vary based on your data size, workload and features you choose to install.
*Actual needs (CPU, memory, storage) depend on data size and application requirements. Consult our solution architects for an estimate of memory and storage needs.
Comments:
The TigerGraph system is optimized to take advantage of multiple cores.
Performance is optimal when the memory is large enough to store the full graph and to perform computations.
The platform works excellently as a single node. For high availability or scaling, a multi-node configuration is possible.
The TigerGraph Software Suite is built on 64-bit Linux. It can run on a variety of 64-bit Linux distributions. The software has been tested on the operating systems listed below. When a range of versions is given, it has been tested on the two endpoints, oldest and newest. We continually evaluate the operating systems on the market and work to update our set of supported operating systems as needed. The TigerGraph installer will install its own copies of the Java JDK and GCC, accessible only to the TigerGraph user account, to avoid interfering with any other applications on the same server.
Before offline installation, the TigerGraph system needs a few basic software packages to be present.
tar, to extract files from the offline package
curl, an alternative way to send query requests to TigerGraph
crontab, a basic OS software module which TigerGraph relies on
ip, to configure the network
ssh/sshd, to connect to the server
more, a tool to display the License Agreement
netstat, a basic OS tool to check the network status
sshpass, if you intend to use password login method (P method) instead of ssh key login method (K method) to install the TigerGraph platform.
If they are not present, contact your system administrator to have them installed on your target system. For example, they can be installed with one of the following commands.
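For example, the packages listed above can typically be installed as follows (package names vary slightly by distribution; the lists below are an approximation):

```shell
# RPM-based distributions (CentOS/RHEL):
sudo yum install -y tar curl cronie iproute util-linux net-tools \
    openssh-clients openssh-server sshpass

# Debian-based distributions (Ubuntu/Debian):
sudo apt-get install -y tar curl cron iproute2 util-linux net-tools \
    openssh-client openssh-server sshpass
```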
If you are running TigerGraph on a multi-node cluster, you must install, configure and run the NTP (Network Time Protocol) daemon service. This service will synchronize system time among all cluster nodes.
If you are running TigerGraph on a multi-node cluster, you must configure the iptables/firewall rules to make all tcp ports open among all cluster nodes.
In an on-premises installation, the system is fully functional without a web browser. To run the optional browser-based TigerGraph GraphStudio User Interface or Admin Portal, you need an appropriate browser:
Installing Single-machine and Multi-machine systems
This guide describes how to install the TigerGraph platform either as a single node or as a multi-node cluster. Please use the Table of Contents to go to the appropriate section of this guide.
If you signed up for the Enterprise Free license or the Developer Edition, you also have access to the TigerGraph platform as a Docker image or a virtual machine (VirtualBox) image. Follow the instructions in the welcome email message you received.
Before you can install the TigerGraph system, you need the following:
sudo privilege is required. If sudo privilege is not available, please contact TigerGraph support for documented workarounds.
A license key provided by TigerGraph (not applicable to Enterprise Free license or Developer Edition)
A TigerGraph system package .
If your package is a *tar.gz file, you may need to install some software prerequisites.
Use a Bash shell; other shells may cause installation issues.
Cluster installation requires sudo access and one-time SSH access to start services. For some customers, sudo and/or SSH access might not be feasible. In such circumstances, please reach out to TigerGraph support for help with installation.
If your package is a *tar.gz file, you also need to ensure your machine has the following software prerequisites:
Pre-install the basic Linux utilities from the link above on your server, if necessary:
If you are installing a cluster, you also need the following:
ntpd
iptables/firewalld
From version 3.0 onwards, the installer may have issues using "SSH with password" on EC2 instances. Please use SSH with a key file for the time being.
If the EC2 machine was created to be accessed via an SSH password, please run these commands and continue with the installation:
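The commands themselves are not preserved in this document. On a stock EC2 image, enabling password logins generally amounts to setting the directive below in /etc/ssh/sshd_config, restarting sshd, and assigning a password to the login user. Treat this as an assumption and verify against the AWS documentation for your image.

```
PasswordAuthentication yes
```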
For Enterprise Edition Free License, please follow the instructions sent in the email to ensure that the pre-installed license is applied automatically.
As specified in the email, users must use the -n (non-interactive) mode for the installation to pick up the preinstalled license. If you choose interactive mode, you can copy the license from the install_conf.json file and apply it manually at installation time.
The name of your package may vary, depending on the product edition (e.g., developer or enterprise) and the version (e.g., 2.0.1). For the examples here, we will assume the name is tigergraph-x.y.z.tar.gz. Substitute the name of your actual package file.
Extract the package:
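For example (the file name is a placeholder; substitute your actual package file):

```shell
# Extract the TigerGraph offline installation package.
tar -xzf tigergraph-x.y.z.tar.gz
```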
2. A folder named tigergraph-<version>-offline (or tigergraph-<version>-developer)
will be created. Change into this folder. To install with default settings, run the install.sh script:
The installer will ask you a few questions:
Do you agree to the License Terms and Conditions?
What is your license key? (not applicable for Enterprise Free license or Developer Edition)
Do you want to use the default TigerGraph user name or select/create your own?
Do you want to use the default TigerGraph user password or create your own?
Do you want to use the default installation folder or select/create your own?
Do you want to use the default data location folder or select/create your own?
Do you want to use the default log location folder or select/create your own?
Do you want to use the default temp folder or select/create your own?
What is the default SSH port for your machine?
To see the default settings, and to learn how to customize the installation, read the Installation Options section below.
License keys are long (over 100 characters). If you copy and paste the license key, be careful not to accidentally include an end-of-line character.
3. After installation is complete, you can log in as the tigergraph user with this command: su tigergraph
To confirm correct operation:
1. Try the command gadmin status
If the system is installed correctly and the license is activated, the command should report that all services are up and ready. Since there is no graph data loaded yet, gse and gpe will show "warm up".
2. Try the command gsql --version
The following default settings will be applied if no parameters specified:
The installer will create a user called tigergraph , with password tigergraph .
The default root directory for the installation is /home/tigergraph/tigergraph, with the app, data, log, and temp files within it:
App path: /home/tigergraph/tigergraph/app
Data path: /home/tigergraph/tigergraph/data
Log path: /home/tigergraph/tigergraph/log
Temp path: /home/tigergraph/tigergraph/tmp
The root directory for the installation (referred to as <TigerGraph.Root.Dir>) is a folder called tigergraph located in the tigergraph user's home directory, i.e., /home/tigergraph/tigergraph .
The installation can be customized by passing command-line options to the install.sh script. There are three ways to run a customized installation:
1. Interactive mode
2. Command-line options
3. Non-interactive mode, using the install_conf.json file
TigerGraph cluster configuration enables the graph database to be partitioned and distributed across multiple server nodes in a local network (not available in the Developer Edition). The cluster can either be a physical cluster or a network virtual cluster from a cloud service such as Amazon EC2 or Microsoft Azure.
The installation of TigerGraph 3.x has been validated on Amazon EC2 and Microsoft Azure and on a physical on-premises cluster. For Amazon EC2, please make sure all tcp ports are open among all cluster nodes, otherwise service may not start.
In TigerGraph 3.x, the installation machine can be within or outside the cluster. If outside the cluster, the installation machine should be a Linux machine.
Currently, every machine in the cluster must have a sudo user with the same username and password or SSH key .
To install a high-availability cluster (with at least 3 nodes), set ReplicationFactor as desired. The default value of 1 means HA is off. Set it to a factor of the number of nodes, e.g., ReplicationFactor = 2 or 3 for a 6-node cluster.
For cluster installation, the installation script itself does not need to be run with sudo privileges.
During cluster configuration, the user is required to provide the following information regarding the cluster:
The node id (e.g. m1) and its IP address (e.g. 172.30.3.2).
The login credentials for the nodes.
The ReplicationFactor, which determines the HA setup.
In interactive mode, the installer will first ask the same basic questions it asks for single-node installation. It will then ask how many machines are in your cluster. Then it will prompt for the IP addresses of the machines, assigning each machine an alias m1, m2, m3, etc. Next it will ask for sudo user name and credentials information. Last, it will ask the user if they accept some changes to the system. (See non-interactive mode installation below for details about user credentials.) A screenshot of interactive installation is shown below.
For non-interactive mode installation, the user must review and modify all the settings in the file install_conf.json
before running the installer. This file is in the folder with your install.sh file and other TigerGraph package files.
The following are some advanced configuration options:
Node List Each machine in the cluster is defined as a key:value pair, where the key is a machine alias m1, m2, m3, etc. NOTE: If you chose names other than m1, m2, etc., be sure to list them in alphanumeric order in the config file. The first machine ("m1") has a special role in some cases. Use as many key:value pairs as you need, placing the public IP addresses next to each key. The installer will auto detect the local IP addresses and use them to configure the system. If the installer detects more than one local IP address, it will ask the user to select one for configuration. One example of NodeList:
"NodeList": ["m1: 192.168.55.42", "m2: 192.168.55.46", "m3: 192.168.55.47" ]
Note: The entry is a json array of strings, so each key:value pair should be quoted as a string, and be separated by a comma.
Login Config Two login methods are supported:
SSH with password
SSH with key file
For SSH with password, you must input the sudo/root user and its password. For SSH with key file, you must specify the AWS EC2 key file or other key file by its absolute path.
Replication Factor If you would like to enable the HA feature, make sure you have at least 3 nodes in the cluster and set the replication factor >= 2. For example, if your cluster has 6 nodes, you could set the replication factor to 2 or 3. If you set the replication factor to 2, then 3 nodes will hold one copy of the data and the other 3 nodes will hold a replica copy. Reminder: set the replication factor to a factor of the number of nodes to maximize the HA benefit. Otherwise, some nodes may not be utilized as part of the HA cluster.
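Putting the pieces together, a minimal cluster section of install_conf.json might look like the sketch below. Only NodeList (in the format shown above) and ReplicationFactor come from this guide; the login key names and nesting are illustrative assumptions, so always start from the install_conf.json shipped with your package.

```json
{
  "NodeList": ["m1: 192.168.55.42", "m2: 192.168.55.46", "m3: 192.168.55.47"],
  "LoginConfig": { "SudoUser": "ec2-user", "Method": "K", "KeyFile": "/home/ec2-user/key.pem" },
  "ReplicationFactor": 1
}
```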
Please refer to the install_conf.json file in the installation package for details on the configuration format. Always use the copy that comes with the installation package, as the installer is guaranteed to be compatible with it.
In install_conf.json,
the node names (e.g., m1, m2, etc.) MUST be given in alphanumeric order, because the first machine has a special role in some situations. In our documentation we will refer to this machine as m1.
If you installed with the default password, we recommend that you change it now.
To perform additional customization, run gadmin --configure (on node m1 if it is a cluster), followed by gadmin config-apply. The gadmin config-apply command must be run on node m1 in a cluster, since only node m1 contains the pkg_pool resources. If you configured one or more of gpe.servers, gse.servers, restpp.servers, kafka.servers, zk.servers, dictserver.servers, gpe.replicas, or gse.replicas, you must reinstall the package by running the command gadmin pkg-install reset on node m1.
If you are a first-time user:
Beginning with v3.0 of TigerGraph, system upgrades will be done using the installation script.
Download the latest version of TigerGraph to your system.
Extract the tarball.
Run the install script that was extracted from the tarball with the upgrade flag (-U):
./install.sh -U
In-place cluster migration of schema and data is an important aspect of business continuity. TigerGraph supports full migration between compatible versions through a simple upgrade operation. However, there are some special considerations when migrating to TigerGraph 3.0, as noted below.
Please contact TigerGraph Support to coordinate migration to TigerGraph 3.0. Even though all the steps for migration are documented on this page, it is strongly recommended that you review the migration process with the TigerGraph Support team.
Migration from pre-3.0 versions (TigerGraph 2.4, 2.5, and 2.6) to 3.0 is supported. To migrate from versions prior to 2.4, contact TigerGraph Support to check the feasibility of supporting earlier versions.
Please be sure to take a backup of data on the cluster before starting the migration process.
Developer Edition upgrade is not supported
The Developer Edition is not designed for upgrade from one version to another. It is also not possible to upgrade a Developer Edition installation to Enterprise Edition.
If you have written User-Defined Functions for your queries, be sure to make a backup of these files:
<tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
<tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp
Make sure all data is consumed and no active jobs are running. The following instructions will guide you to force all KAFKA data to be consumed for each graph. For graphs requiring an authentication token, the endpoint must be called for each graph utilizing their respective tokens. For graphs not requiring tokens, you can call the endpoint once for all graphs.
/rebuildnow endpoint The /rebuildnow endpoint is called to force the engine to rebuild the graphs. It is a non-blocking call: queries and loading can proceed while the rebuild runs. The endpoint takes four optional parameters (examples below):
threadnum: a parameter used to control the number of threads used to do the rebuild. If not specified then it uses the default threadnum in gium.
vertextype: a vertex type name; rebuild only vertices of this type.
segid: a list specifying which segments to rebuild. If not specified, all segments are rebuilt.
path: the path where the summary file is written on each machine in the cluster. The file indicates that the rebuild has finished and records a summary of the rebuild on each machine. The default path is /tmp/rebuildnow.
The call writes two files to the given path parameter: init.summary.txt at the beginning of the run, recording all segment info, and finished.summary.txt at the end, so that you know when the rebuild has finished. Below is an example of running the cURL request and its output.
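The example output files are not reproduced here, but the request itself can be sketched as follows. The graph name MyGraph, port 9000, and parameter values are assumptions; only the parameter names come from the list above. On a graph that requires authentication, add the token as a header.

```shell
# Compose the /rebuildnow request (sketch; values are placeholders).
url="http://localhost:9000/rebuildnow/MyGraph?threadnum=6&vertextype=Person&path=/tmp/rebuildnow"
echo "$url"
# curl -s -H "Authorization: Bearer <token>" "$url"   # issue on a live system
```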
NOTE: The /rebuildnow endpoint does not guarantee that all KAFKA messages are consumed by the engine (you have to wait until there is no PullDelta in the GPE log to guarantee that). It only guarantees that all in-memory graph updates are persisted to disk.
Make sure the config is applied by using this command: gadmin config-apply.
Stop your TigerGraph system with this command: gadmin stop all admin ts3 -y.
Install version 3.0.0 with the same cluster config and HA options as your previous installation.
If you have enabled HA in your 2.5.x installation, you should specify the ReplicationFactor in 3.0.0 installer to be the same as previously configured. Otherwise, leave it as 1.
NOTE: If your old 2.5.x system is installed on the cluster [m1, m2, m3, m4], you can only install 3.0.0 on the same [m1, m2, m3, m4], but only the IP of m1 needs to stay the same. The order of m2 to m4 does not matter.
Please specify a valid license key.
After installing, log in as the tigergraph user. Now gadmin version should report 3.0.0. If not, please check your installation.
For the following instructions, we assume you are logged in as the tigergraph user:
If any errors occur, please check the error message, as well as debug.log under the migration tool folder.
Notice: If you do not activate a valid license when installing 3.0, these two commands may fail:
gsql recompile loading job
gsql install query -force all
All sections below are for versions prior to v3.0. If your specific versions are not listed below, please upgrade as follows:
Download the latest version of TigerGraph to your system.
Extract the tarball.
Run the tigergraph.bin file that was extracted from the tarball:
bash tigergraph.bin
These steps assume that v2.1.7 is installed. To upgrade to v2.2 from a version older than v2.1.7, please upgrade to v2.1.7 first. If the tigergraph username and password have been changed, please have them ready, as you will need them to update the system.
Download tigergraph-2.2.x-offline.tar.gz with user “tigergraph” and extract the tarball file.
Run tigergraph.bin under the same folder to upgrade to 2.2.x
Run the post-upgrade script that was downloaded in step 2: post_upgrade.sh -u <sudoUser> [-P <sudoPass> | -K <sshKey>] -p <tigergraphUserPass>
v2.0 can be upgraded to v2.1 Enterprise Edition. The data store format and GSQL language scripts in v2.0 are forward compatible to v2.1.
The data store format between 1.x and 2.x for single servers is forward compatible but not backward compatible. For a single server platform, users can upgrade from 1.x to 2.x without reloading data or recreating the graph schema. Some details of the GSQL language have changed, so some loading jobs and queries will need to be revised and reinstalled.
Verify that your data store is compatible and is eligible for direct update / upgrade.
Review the specification changes and how they may affect your applications (loading jobs and queries).
Stop issuing new commands to your TigerGraph system and allow any operations to complete.
(Recommended) Backup your data, as a precaution.
Follow the procedure at the beginning of this document for installing a new system. The installer will automatically shut down your system and start it again.
Be sure to specify the same username as your current installation. If you use a different username, it will be treated as a new installation, with an empty graph.
Pay attention to output messages during the installation process which may alert you to additional tasks or checks you should perform.
Run the command gsql to start the GSQL shell. The first time after an update, gsql performs two important operations:
Copies your catalog from your old installation to the new installation .
Compares the files in the backup /dev_<datetime>/gdk/gsql/src folder to the new /dev/gdk/gsql/src folder. Pay attention to any files residing in the old folder but not in the new folder. Review them and copy them to the new folder if appropriate. See the example below.
Revise and reinstall loading jobs, user-defined functions, and queries as needed.
Clicking the button will navigate to your identity provider's login portal. If you have already logged in there, you will be redirected back to GraphStudio immediately. After about 10 seconds, the verification should finish, and you are authorized to use GraphStudio. If you haven't logged in at your identity provider yet, you will need to log in there. After logging in successfully, you will see your Single Sign-On username when you click the User icon at the upper right of the GraphStudio UI.
Additionally, we have ready-to-use virtual machine images on , , and .
This section is for New Installations. If you are updating from a previous version of the TigerGraph platform, first read the section below on .
One or more servers that meet the minimum requirements with regard to operating system, memory, and hard disk space, as well as enough memory and storage for your graph data.
If you do not yet have a TigerGraph system package, you can request one at .
4. Basic installation is now finished! Please see below.
see the appropriate sections of the .
See our GSQL language tutorial for first-time users:
Start designing, using our visual interface. see the .
To see more GSQL examples, see .
To get answers to common questions, see .
Migration process instructions are documented in
Download the post_upgrade.sh script that is attached .
For a cluster configuration, direct upgrade from 1.x to 2.x is not supported at this time. Users interested in migrating from 1.x to 2.x need to export their data and metadata, install v2.x, and then reload data and metadata, with some small modifications. Please contact for assistance.
Please consult the Release Notes for all versions between your current version and your target version (e.g., v2.1) for a summary of specification changes. Contact TigerGraph support for assistance.
Component | Minimum | Recommended |
CPU* | 4 cores for <500MB data, 8 cores for >500MB data (64-bit processor) | 16+ cores (64-bit processors) |
Memory* | 8 GB | ≥ 64GB |
Storage* | 20 GB | ≥ 1TB, RAID10 volumes for better I/O throughput. SSD storage is recommended. |
Network | 1 Gigabit Ethernet adapter | 10Gigabit Ethernet adapter for inter-node communication |
On-Premises hosting OS | Supported | Java JDK version | GCC version (C/C++) |
RedHat 6.5 to 6.9 (x64) | Yes | 1.8.0_141 | 4.8.2 |
RedHat 7.0 to 7.8 (x64) | Yes | 1.8.0_141 | 4.8.2 |
RedHat 8.0 to 8.2 (x64) | Yes | 1.8.0_141 | 4.8.2 |
Centos 6.5 to 6.9 (x64) | Yes | 1.8.0_141 | 4.8.2 |
Centos 7.0 to 7.4 (x64) | Yes | 1.8.0_141 | 4.8.2 |
Centos 8.0 to 8.2 (x64) | Yes | 1.8.0_141 | 4.8.2 |
Ubuntu 14.04 LTS Ubuntu 16.04 LTS Ubuntu 18.04 LTS (x64) | Yes | 1.8.0_141 | 4.8.4 |
Debian 8 (jessie) | Yes | 1.8.0_141 | 4.8.4 |
Browser | Chrome | Safari | Firefox | Opera | Edge | Internet Explorer |
Supported version | 54.0+ | 11.1+ | 59.0+ | 52.0+ | 80.0+ | 10+ |
This guide covers two advanced license issues:
Activating a System-Specific License
Usage limits enforced by certain license keys
This section provides step-by-step instructions for activating or renewing a TigerGraph license by generating and installing a license key unique to that TigerGraph system. This document applies to both non-distributed and distributed systems. In this document, a cluster acting cooperatively as one TigerGraph database is considered one system.
A valid license key activates the TigerGraph system for normal operation. A license key has a built-in expiration date and is valid on only one system. Some license keys may apply other restrictions, depending on your contract. Without a valid license key, a TigerGraph system can perform certain administration functions, but database operations will not work. To activate a new license, a user first configures their TigerGraph system. The user then collects the fingerprint of the TigerGraph system (so-called license seed) using a TigerGraph-provided utility program. Then the collected materials are sent to TigerGraph or an authorized agent via email or web form. TigerGraph certifies the license based on the collected materials and sends a license key back to the user. The user then installs the license key on their system using another TigerGraph command. A new license key (e.g., one with a later expiration) can be installed on a live system that already has a valid license; the installation process does not disrupt database operations.
If your system is currently using an older string-based license key which does not use a license seed, please contact support@tigergraph.com for the procedure to upgrade to the new system-specific license type .
Note: Before beginning the license activation process, the TigerGraph package must be installed on each server, and the TigerGraph system must be configured with gadmin.
Collect the fingerprint of the whole TigerGraph system using the command tg_lic_seed, which can be executed on any machine in the system. The command tg_lic_seed packs all the collected data into a local file (named tigergraph_seed). When tg_lic_seed has completed successfully, it outputs the path of the collected data to the console.
Send the tigergraph_seed file to TigerGraph , either through our license activation web portal (preferred) or by email to license@tigergraph.com. If using email, please include the following information:
Company/Organization name
Contract number. If you do not know your contract number, please contact your sales representative or sales@tigergraph.com.
If the contract and license seed are in good order, a new license key file will be certified and sent back to you.
Copy the license key file to a directory on the TigerGraph system where the TigerGraph Linux user has read permission.
To install the license key, run the command tg_lic_install, specifying the path to the license key file.
If the installation completes successfully, the message "install license successfully" will be displayed in the console. Otherwise, the message "failed to install license" will be displayed.
After a license key has been installed successfully on a TigerGraph system, the information of the installed license is available via the following REST API:
Some license keys include a limit on the graph size, or on the number and size of machines which may be used, or restrict the use of certain optional features. In the case of a memory usage or graph size limit, when a TigerGraph system reaches its license's limit, additional data will not be loaded into the graph. You may still query the graph and delete data. To check whether or not you have exceeded your license limits, use the command gstatusgraph and collect the VertexCount, EdgeCount, and Partition Size. Compare this information to the limits established for your license.
The output may include a warning message such as the following:
Copyright © TigerGraph. All Rights Reserved.
A TigerGraph system with High Availability (HA) is a cluster of server machines which uses replication to provide continuous service when one or more servers are not available or when some service components fail. TigerGraph HA service provides load balancing when all components are operational, as well as automatic failover in the event of a service disruption. One TigerGraph server consists of several components (e.g., GSE, GPE, RESTPP). The default HA configuration has a replication factor of 2, meaning that a fully functioning system maintains two copies of the data, stored on separate machines. A replication factor of 2 is the minimum for an HA configuration. Users can choose a higher replication factor depending on their requirements.
An HA cluster needs at least 3 server machines. Machines can be physical or virtual. This is true even if the system has only one graph partition.
For a distributed system with N partitions (where N > 1), the system must have at least 2N machines.
The same version of the TigerGraph software package is installed on each machine.
Starting from version 3.0, configuring an HA cluster is part of platform installation; see the TigerGraph Platform Installation Guide for details.
Follow the instructions in the document TigerGraph Platform Installation Guide to install the TigerGraph system in your cluster.
HA configuration can only be done at the time of system installation and before deploying the system for database use. HA configuration change after installation is not supported. Converting a non-HA system to an HA cluster would require reinstalling all the TigerGraph components and rebuilding the database from the start.
During TigerGraph platform installation, specify ReplicationFactor for the HA configuration. The default value is 1, which means there is no HA setup for the cluster. If this value is set to R (greater than 1), each partition will have R replicas.
If R is not a factor of the number of nodes in the cluster, some nodes may go unused.
If you have a version 1.0 string-type license key, then during initial platform installation, you can either specify your license key as an argument, for example:
Or you may input it when prompted.
To apply a new license key string, use the following command:
If you have a version 2.0 file-type license key which is linked to a specific machine or cluster:
If this is the initial installation or you are updating a previous key file, then please see the document Activating a System-Specific License
If you are updating from a version 1.0 key string to a version 2.0 key file, please contact support@tigergraph.com for the correct procedure.
If you have a version 1.0 string-type license key, the following command will tell you your key's expiration date:
If you have a version 2.0 file-type license key which is linked to a specific machine or cluster, then run the following command:
If you are running TigerGraph v3.0+, run the following command:
A description of each component is given in the Glossary section of the TigerGraph Platform Overview document.
The following command tells you the basic summary of each component:
If you want to know more, including process information, memory/cpu usage information of each component, use the -v option for verbose output.
The default RESTful API port is 9000. It can be changed by configuration. To find out the current RESTful API port, use following command:
The default port for the GraphStudio UI is 14240. (Prior to TigerGraph 1.2, it was 44240.) Use the following to check its configuration:
If you are using a remote GSQL client, it communicates with the GSQL server via port 14240.
To see and edit ports:
GBAR is the utility for backing up and restoring a TigerGraph system. Before a backup, GBAR needs to be configured. Please see GBAR - Graph Backup and Restore for details.
To backup the current system:
Please be advised that GBAR only backs up data and configuration. No logs or binaries will be backed up.
To restore an existing backup:
Please be advised that running restore will STOP the service and ERASE existing data.
You can get statistics on the graph data of a TigerGraph database instance using the gstatusgraph utility:
Due to a known bug, the gstatusgraph command counts each undirected edge as two edges. To get an accurate count of undirected edges, use the built-in queries instead. The message below is shown as a warning when gstatusgraph is used.
"[WARN ] Above vertex and edge counts are for internal use which show approximate topology size of the local graph partition. Use DML to get the correct graph topology information"
TigerGraph provides a RESTful API for request statistics. Assuming the REST port is 9000, use the command below:
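A sketch of such a request follows. The endpoint name /statistics and its seconds window parameter are assumptions based on common REST++ usage; verify the exact form against your version's API reference.

```shell
# Request statistics for the last 60 seconds (sketch; port 9000 is the default).
url="http://localhost:9000/statistics?seconds=60"
echo "$url"
# curl -s "$url"   # issue on a live system
```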
If you need to restart everything, use the following:
If you know which component(s) you want to restart, you can list them:
Multiple component names are separated by spaces.
Normally it is not necessary to manually turn off any services. However if you wish to, use the stop command.
There are a few typical causes for a service being down:
Expired license key. Double check your license key expiration date, and contact support@tigergraph.com if it is expired. After applying a new license key, your service will come back online. Usually, TigerGraph will reach out before your license key expires. Please act accordingly when that happens.
Not enough memory. TigerGraph is a memory intensive system. When there is not much free memory, Linux may kill a process based on memory usage. Please check your memory usage after TigerGraph starts. We suggest at least 30% free memory after TigerGraph starts up. To confirm if one of TigerGraph's processes is a victim, use dmesg to check.
Not enough free disk space. TigerGraph writes data, logs, as well as some temporary files onto disk(s). It requires enough free space to function properly. If TigerGraph service or one of its components is down, please check whether there is enough free space on the disk using df .
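The memory and disk checks above can be sketched with standard Linux tools (these commands are generic diagnostics, not TigerGraph utilities; dmesg may require root on some systems):

```shell
# 1. Did the OOM killer terminate a process?
dmesg 2>/dev/null | grep -iE "killed process|out of memory" | tail -n 5
# 2. How much memory does the host have, and how much is still available?
grep -E 'MemTotal|MemAvailable' /proc/meminfo
# 3. Is there free disk space where TigerGraph writes data, logs, and temp files?
df -h
```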
Use the following command to find the log files for each component:
To look at the log file for a particular component:
A timeout is applied to every request coming into the TigerGraph system. If a request runs longer than the timeout value, it will be killed. The default timeout value is 16 seconds.
If you know that your query will run longer than this value, configure all related timeouts to a larger value. To do this:
Input the value you expect, in seconds. Then apply the config to the system and restart the service.
The timeout can also be changed for an individual query, but only when calling the REST endpoint. You need to supply a timeout value each time you run the query; otherwise the default timeout value applies.
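For example, REST++ accepts a GSQL-TIMEOUT header, in milliseconds, on each request. The graph and query names below are placeholders; the sketch composes the request rather than issuing it.

```shell
# Override the timeout for a single request (sketch; GSQL-TIMEOUT is in ms).
request='curl -H "GSQL-TIMEOUT: 60000" http://localhost:9000/query/MyGraph/myQuery'
echo "$request"
# eval "$request"   # run on a live system
```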
A core dump file is produced by the OS when a certain signal causes a process to terminate. The core dump is a disk file containing an image of the process's memory at the time of termination. This image can be used in a debugger (e.g., gdb) to inspect the state of the program at the time that it terminated.
The TigerGraph installation process configures the operating system to place core dump files in the TigerGraph root directory, with the name core-%e-%s-%p.%t, where
%e: executable filename (without path prefix)
%s: signal number which caused the dump
%p: PID of dumped process
%t: time of dump, expressed as seconds since the epoch
The coredump configuration was set by the following command:
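The exact command is not preserved in this document, but a setting of this shape is typically applied with sysctl (the target directory below is the default TigerGraph root; treat the invocation as a sketch):

```shell
# Core dump file name template: %e=executable, %s=signal, %p=pid, %t=epoch time.
pattern="/home/tigergraph/tigergraph/core-%e-%s-%p.%t"
echo "$pattern"
# sudo sysctl -w "kernel.core_pattern=$pattern"   # apply on a live system
```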
If you want to alter the location or file name template, you can edit the contents of /proc/sys/kernel/core_pattern.
GBAR - Graph Backup and Restore
GBAR (Graph Backup And Restore) is an integrated tool for backing up and restoring the data and data dictionary (schema, loading jobs, and queries) of a single TigerGraph node. In Backup mode, it packs TigerGraph data and configuration information into a single file on disk or in a remote AWS S3 bucket. Multiple backup files can be archived. Later, you can use the Restore mode to roll back the system to any backup point. This tool can also be integrated easily with Linux cron to perform periodic backup jobs.
The current version of GBAR is intended for restoring the same machine that was backed up. For help with cloning a database (i.e., backing up machine A and restoring the database to machine B), please contact support@tigergraph.com .
The -y option forces GBAR to skip interactive prompt questions by selecting the default answer. There is currently one interactive question:
At the start of restore, GBAR will always ask if it is okay to stop and reset the TigerGraph services: (y/N)? The default answer is yes.
GBAR Config must be run before using GBAR backup/restore functionality.
Note:
For S3 configuration, if the AWS access key and secret are not provided, GBAR will use the attached IAM role.
You can specify the number of parallel processes for backup and restore.
You must provide username and password using GSQL_USERNAME and GSQL_PASSWORD environment variables.
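For example (the credential values are placeholders; the backup tag is illustrative):

```shell
# GBAR reads GSQL credentials from the environment
export GSQL_USERNAME=tigergraph
export GSQL_PASSWORD=tigergraph

gbar backup -t daily
```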
A backup archive is stored as several files in a folder, rather than as a single file. The backup_tag acts like a filename prefix for the archive name. The full name of the backup archive is <backup_tag>-<timestamp>, which is a subfolder of the backup repository. If System.Backup.Local.Enable is true, the folder is a local folder on every node in a cluster, to avoid moving massive amounts of data across nodes. If System.Backup.S3.Enable is true, every node uploads the data located on that node to the S3 repository. Therefore, every node in a cluster needs access to Amazon S3. If an IAM policy is used for authentication, every node in the cluster needs to have the IAM policy attached.
GBAR Backup performs a live backup, meaning that normal operations may continue while the backup is in progress. When a backup starts, GBAR checks whether data loading is running; if so, it pauses loading for one minute and then continues the backup. (You can change the loading pause interval with the environment variable PAUSE_LOADING.) GBAR then sends a request to the admin server, which asks the GPE and GSE to create snapshots of their data. Per the request, the GPE and GSE store their data under GBAR's own working directory. GBAR also directly contacts the Dictionary and obtains a dump of its system configuration information. In addition, GBAR gathers the TigerGraph system version and customized information, including user-defined functions, token functions, schema layouts, and user-uploaded icons. GBAR then compresses each of these data and configuration files in tgz format and stores them in the <backup_tag>-<timestamp> subfolder on each node. As the last step, GBAR copies the files to local storage or AWS S3, according to the Config settings, and removes all temporary files generated during the backup.
The current version of GBAR Backup takes snapshots quickly, making it very likely that all the components (GPE, GSE, and Dictionary) are in a consistent state, but it does not fully guarantee consistency. If loading is running when a backup starts, GBAR pauses loading for one minute and then continues the backup.
Backup does not save input message queues for REST++ or Kafka.
This command lists all generated backup files in the storage location configured by the user. For each file, it shows the file's full tag, its size in human-readable format, and its creation time.
Restore is an offline operation, requiring the data services to be temporarily shut down. The user must specify the full archive name ( <backup_tag>-<timestamp> ) to be restored. When GBAR restore begins, it first searches for a backup archive exactly matching the archive_name supplied on the command line. Then it decompresses the backup files to a working directory. Next, GBAR compares the TigerGraph system version in the backup archive with the current system's version, to make sure the backup archive is compatible with the current system. It then temporarily shuts down the TigerGraph servers (GSE, RESTPP, etc.). Then, GBAR makes a copy of the current graph data, as a precaution. Next, GBAR copies the backup graph data into the GPE and GSE and notifies the Dictionary to load the configuration data. GBAR also notifies the GST to load the backup user data and copies the backup user-defined tokens/functions to the right location. When these actions are all done, GBAR restarts the TigerGraph servers.
Note: GBAR restore does not estimate the uncompressed data size or check whether there is sufficient disk space.
The primary purpose of GBAR is to save snapshots of the data configuration of a TigerGraph system, so that in the future the same system can be rolled back (restored) to one of the saved states. A key assumption is that Backup and Restore are performed on the same machine, and that the file structure of the TigerGraph software has not changed. Specific requirements are listed below.
Restore Requirements and Limitations
Restore is supported if the TigerGraph system has had only minor version updates since the backup.
TigerGraph version numbers have the format X.Y[.Z], where X is the major version number and Y is the minor version number.
Restore is supported if the backup archive and the current system have the same major version number AND the current system has a minor version number that is greater than or equal to the backup archive minor version number.
Backup archives from a 0.8.x system cannot be Restored to a 1.x system.
Examples:
Restore needs enough free space to accommodate both the old gstore and the gstore to be restored.
The following describes a real example, showing the actual commands, the expected output, and the amount of time and disk space used for a given set of graph data. For this example, an Amazon EC2 instance was used, with the following specifications:
Single instance with 32 vCPU + 244GB memory + 2TB HDD.
Naturally, backup and restore time will vary depending on the hardware used.
To run a daily backup, we tell GBAR to backup with the tag name daily .
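A sketch of the commands, including a cron entry for making the backup periodic (the install path and schedule below are assumptions):

```shell
# One-off backup with the tag "daily"
gbar backup -t daily

# Periodic backup: run at 1:00 AM every day (add via "crontab -e")
# 0 1 * * * . ~/.bashrc; /home/tigergraph/tigergraph/bin/gbar backup -t daily
```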
The total backup process took about 31 minutes, and the generated archive is about 49 GB. Dumping the GPE + GSE data to disk took 12 minutes. Compressing the files took another 20 minutes.
To restore from a backup archive, a full archive name must be provided, such as daily-20180607232159. By default, restore asks the user for approval before continuing. If you want to pre-approve these actions, use the "-y" option; GBAR will make the default choice for you.
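As a sketch, using the archive name from the example above:

```shell
# Restore from a specific archive; -y pre-approves the prompt to
# stop and reset TigerGraph services
gbar restore daily-20180607232159 -y
```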
For our test, GBAR restore took about 23 minutes. Most of the time (20 minutes) was spent decompressing the backup archive.
Note that after the restore is done, GBAR informs you where the pre-restore graph data (gstore) has been saved. After you have verified that the restore was successful, you may want to delete the old gstore files to free up disk space.
TigerGraph supports secure data-in-flight communication, using SSL/TLS encryption protocol. This applies to any outward-facing channel, including GSQL clients, RESTPP endpoints, and the GraphStudio web interface. When SSL/TLS is enabled, HTTPS takes the place of HTTP for RESTPP and GraphStudio connections.
You should have basic knowledge of how SSL works:
What the SSL certificate and key are used for
That an SSL certificate is bound to a domain
How an SSL certificate chain works
TigerGraph uses the Nginx web server, so SSL configuration makes use of some built-in support in Nginx.
The two main options for obtaining an SSL certificate are to generate your own self-signed certificate or to purchase a certificate from a trusted Certificate Authority. Regardless of which method you choose, your certificate should be chained to a trusted root certificate embedded in your browser. The options and details for producing a trusted SSL certificate are beyond the scope of this document. The focus of this document is how to configure your TigerGraph system to use the certificate to enable SSL.
First, obtain a SSL certificate from a trusted agent of your choice. Certificate vendors will provide clear instructions for ordering a certificate and then for installing it on your system.
Then you can configure the certificate with gadmin --configure ssl
There are multiple ways to create a self-signed certificate. One example is shown below.
For simplicity, the method below will use the root certificate directly as the HTTPS server certificate. This method is satisfactory for testing but should not be used for a production system.
In the example below, the Common Name value should be your server hostname, since HTTPS certificates are bound to domain names.
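A minimal sketch using openssl (the file names are arbitrary; replace your.server.hostname with your actual hostname):

```shell
# Generate a self-signed certificate and private key, valid for one year
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout server.key -out server.crt \
  -subj "/CN=your.server.hostname"

# Restrict permissions as required by TigerGraph (600 or less)
chmod 600 server.key server.crt
```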
For security reasons, the certificates can only be used with file permissions of 600 or more restrictive.
With the self-signed certificate successfully generated, you can configure it with gadmin, so that all the HTTP traffic will be protected with SSL.
After saving the settings, apply the configuration settings.
Then restart the external-facing services: gsql, nginx, and gui.
Now you may test the connection.
A direct curl request to the server will fail due to certificate verification failure:
In v1.2, the default TCP/IP port for Nginx has changed from 44240 to 14240, to avoid possible port conflicts with Zookeeper.
You may use the -k option to turn off the verification, but it is unsafe and not recommended.
To successfully make requests with curl, you will need to specify the certificate by using the --cacert parameter:
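For example (the certificate path and hostname are placeholders; 14240 is the default Nginx port noted above):

```shell
# Verify the HTTPS connection against your own certificate
curl --cacert /path/to/server.crt https://your.server.hostname:14240/
```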
Export/Import is a complement to Backup/Restore, not a substitute.
The GSQL EXPORT and IMPORT commands perform a logical backup and restore. A database export contains the database's data, and optionally some types of metadata, which can be subsequently imported in order to recreate the same database, in the original or in a different TigerGraph platform instance.
Available to the superuser role only.
The EXPORT GRAPH command reads the data and metadata for one or more graphs and writes the information to a zip file in the designated folder. If no options are specified, then a full backup is performed, including schema, data, template information, and user profiles.
NOTE: The export directory should be empty before running EXPORT GRAPH because all contents are zipped and compressed.
The current version exports ALL graphs in a MultiGraph system. A future version of EXPORT GRAPH will allow the user to select which graphs to export.
The export contains four categories of files:
Data files in csv format, one file for each type of vertex and each type of edge.
GSQL DDL command files created by the export command. The import command uses these files to recreate the graph schema(s) and reload the data.
Copies of the database's queries, loading jobs, and user defined functions.
GSQL command files used to recreate the users and their privileges.
The following files are created in the specified directory when exporting and are then zipped into a single file called ExportedGraphs.zip.
If the file is password protected, it can only be unzipped using GSQL IMPORT. This security feature prevents users from directly unzipping it.
For each graph called <graphName> in a MultiGraph database, there will be the following files:
DBImportExport_<graphName>.gsql Contains a series of GSQL DDL statements which do the following:
Create the exported graph, along with its local vertex, edge, and tuple types,
Create the loading jobs from the exported graphs
Create data source file objects
Create queries
graph_<graphName>/ - folder containing data for local vertex/edge types in <graphName>. For each vertex or edge type called <type>, there is one of the following two data files:
vertex_<type>.csv
edge_<type>.csv
Jobs used to restore vertex and edge types:
global.gsql - DDL to create all global vertex and edge types, and data sources.
tuple.gsql - DDL to create all User Defined Tuples.
Exported data and jobs used to restore the data:
GlobalTypes/ - folder containing data for global vertex/edge types
vertex_name.csv
edge_name.csv
run_loading_jobs.gsql - DDL created by the export command which will be used during import:
Temporary global schema change job to add user-defined indexes. This schema job is dropped after it has run.
Loading jobs to load data for global and local vertex/edges.
Database's saved queries, loading jobs, and schema change jobs.
SchemaChangeJob/ - folder containing DDL for schema change jobs. See section "Schema Change Jobs" for more information
Global_Schema_Change_Jobs.gsql contains all global schema change jobs
graphName_Schema_Change_Jobs.gsql contains schema change jobs for each graph "graphName"
Tokenbank.cpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/TokenBank/TokenBank.cpp
ExprFunctions.hpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
ExprUtil.hpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp
Users:
users.gsql - DDL to create all exported users and import Secrets and Tokens, and grant permissions.
If not enough disk space is available for the data to be exported, the system returns an error message indicating not all data has been exported. Some data may have already been written to disk. If an insufficient disk error occurs, the files will not be zipped, due to the possibility of corrupted data which would then corrupt the zip file. The user should clear enough disk space, including deleting the partially exported data, before reattempting the export.
It is possible for all the files to be written to disk and then to run out of disk space during the zip operation. If that is the case, the system will report this error. The unzipped files will be present in the specified export directory.
If the timeout is reached during export, the system returns an error message indicating that not all data has been exported. Some data may have already been written to disk. If a timeout error occurs, the files will not be zipped, due to the possibility of corrupted data which would then corrupt the zip file. The user should increase the timeout limit and then rerun the export.
The timeout limit is controlled by the session parameter export_timeout. The default timeout is ~138 hours. To change the timeout limit, use the command:
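As a sketch (export_timeout is assumed to be in milliseconds; the value and export path below are illustrative):

```shell
# Set the session parameter, then export, within one GSQL session
gsql <<'EOF'
SET export_timeout = 500000000
EXPORT GRAPH ALL TO "/home/tigergraph/export"
EOF
```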
Available to the superuser role only.
The IMPORT GRAPH command unzips the file ExportedGraph.zip located in the designated folder and then runs the GSQL command files within.
WARNING: IMPORT GRAPH looks for specific filenames. If either the zip file or any of its contents are renamed by the user, IMPORT GRAPH may fail.
WARNING: IMPORT GRAPH erases the current database (equivalent to running DROP ALL). The current version does not support incremental or supplemental changes to an existing database (except for the --keep-users option)
There are two sets of loading jobs:
Those that were in the catalog of the database which was exported. These are embedded in the file DBImportExport_<graphName>.gsql
Those that are created by EXPORT GRAPH and are used to assist with the import process. These are embedded in the file run_loading_jobs.gsql.
The catalog loading jobs are not needed to restore the data. They are included for archival purposes.
Some special rules apply to importing loading jobs. Some catalog loading jobs will not be imported.
If a catalog loading job contains DEFINE FILENAME F = "/path/to/file/", the path will be removed and the imported loading job will contain only DEFINE FILENAME F. This allows a loading job to be imported even though the file may no longer exist or the path may be different after moving to another TigerGraph instance.
If a specific file path is used directly in a LOAD statement and the file cannot be found, the loading job cannot be created and will be skipped. For example, LOAD "/path/to/file" to vertex v1 cannot be created if /path/to/file does not exist.
Any file path using $sys.data_root will be skipped. This is because the value of $sys.data_root is not retained from the export. During import, $sys.data_root is set to the root folder of the import location.
There are two sets of schema change jobs:
Those that were in the catalog of the database which was exported. These are stored in the folder /SchemaChangeJobs.
Those that were created by EXPORT GRAPH and are used to assist with the import process. These are in the run_loading_jobs.gsql command file. The jobs are dropped after the import command is finished with them.
The database's schema change jobs are not executed during the import process. This is because if a schema change job had been run before the export, then the exported schema already reflects the result of the schema change job. The directory /SchemaChangeJobs contains these files:
Global_Schema_Change_Jobs.gsql contains all global schema change jobs
<graphName>_Schema_Change_Jobs.gsql contains schema change jobs for each graph <graphName>.
In v3.0, importing and exporting clusters is not fully automated. The database can be exported and imported by following some additional steps.
Rather than creating a single export zip file, export will create a file for each machine. Before exporting, specific folders must be created on each server using the following commands:
Then run the export command on one server. The EXPORT command does not bundle all the files to one server, and it does not compress each server's files to one zip. Some files, including the data files, will be exported to each server, to the folders created above. Some files will be only on the local server where EXPORT GRAPH was run.
You may only import to a cluster that has the same number and configuration of servers as the cluster from which the export originated. Transfer the files from each export server to the corresponding import server. That is, copy the files from export_server_n:/path/to/export_directory to import_server_n:/path/to/import/directory
2. Manually modify the loading jobs
On the main server, edit the run_loading_jobs.gsql files as follows.
Find the line(s) of the form:
LOAD "sys.data_root/.../<vertex_or_edge_type>.csv"
Close to it there should be a similar line, commented out, which has the "all:" data source directive:
#LOAD "all:sys.data_root/.../<vertex_or_edge_type>.csv"
See the example below:
Comment out the LOAD line and uncomment the LOAD all: line. Be sure that you do this for all data source files.
3. Run the IMPORT GRAPH command from the main server (e.g., the one that corresponds to the server where EXPORT GRAPH was run).
Backup archive's system version | Current system version | Restore allowed? |
0.8 | 1.0 | NO - Major versions differ |
1.1 | 1.1 | YES - Major and minor versions are the same |
1.1 | 1.2 | YES - Major versions are the same; current minor version > archived minor version |
1.1 | 1.0 | NO - Major versions are the same; current minor version < archived minor version |
GStore size | Backup file size | Backup time | Restore time |
219GB | 49GB | 31 mins | 23 mins |
Managing TigerGraph Servers with gadmin
TigerGraph Graph Administrator (gadmin) is a tool for managing TigerGraph servers. It has a self-contained help function and a man page, whose output is shown below for reference. If you are unfamiliar with the TigerGraph servers, please see GET STARTED with TigerGraph.
To see a listing of all the options or commands available for gadmin, run any of the following commands:
After changing a configuration setting, it is generally necessary to run gadmin config apply. Some commands invoke config apply automatically. If you are not certain, just run gadmin config apply.
Below is the man page for gadmin. Most of the commands are self-explanatory. Common examples are provided with each command.
NOTE: Some commands have changed in v3.0. In particular, gadmin set <config | license> has changed to gadmin <config | license> set
Gadmin autocomplete is more of a feature than a command: while typing a command, you can press Tab to either print all possible entries or auto-complete the entry you are currently typing.
The example below shows the autocomplete for the command gadmin status.
Gadmin config has many sub-entries as well; they are listed below.
Example : Change the retention size of the kafka queue to 10GB:
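A sketch of this change (the config key name is an assumption; verify it with gadmin config list):

```shell
# Set the Kafka queue retention size to 10 GB
gadmin config set Kafka.RetentionSizeGB 10

# Apply the change and restart Kafka for it to take effect
gadmin config apply -y
gadmin restart kafka -y
```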
Show what configuration changes were made.
Discard the configuration changes without applying them.
Display all configuration entries.
Change a configuration entry.
Get the value of a specific configuration entry.
Configure entries for a specific service group. e.g. KAFKA, GPE, ZK
Initialize your configuration.
List all configurable entries or entry groups.
Options for configuring your license.
Example flow of upgrading a license :
Once the license has been set and the config has been applied, you can run gadmin license status to view the details of your license, including the expiration date and time.
The gadmin log command will reveal the location of all commonly checked log files for the TigerGraph system.
The gadmin restart command is used to restart one, many, or all TigerGraph services. You will need to confirm the restarting of services by entering y (yes) or n (no). To bypass this prompt, you can use the -y flag to force confirmation.
The gadmin start command can be used to start one, many, or all services.
Check the status of TigerGraph component servers:
Use gadmin status to report whether each of the main component servers is running (up) or stopped (off). The example below shows the normal status when the graph store is empty and a graph schema has not been defined:
You can also check the status of each instance using the verbose flag: gadmin status -v or gadmin status --verbose. This will show each machine's status. See the example below.
Here are the most common service and process status states you might see from running the gadmin status command:
Online - The service is online and ready.
Warmup - The service is processing the graph information and will be online soon.
Stopping - The service has received a stop command and will be down soon.
Offline - The service is not available.
Down - The service has been stopped or crashed.
StatusUnknown - The valid status of the service is not tracked.
Init - Process is initializing and will be in the running state soon.
Running - The process is running and available.
Zombie - There is a leftover process from a previous instance.
Stopped - The process has been stopped or crashed.
StatusUnknown - The valid status of the process is not tracked.
The gadmin stop command can be used to stop one, many, or all TigerGraph services. You will need to confirm the stopping of services by entering y (yes) or n (no). To bypass this prompt, you can use the -y flag to force confirmation.
TigerGraph offers two levels of memory thresholds using the following configuration settings:
SysAlertFreePct and SysMinFreePct
The SysAlertFreePct setting indicates that memory usage has crossed a threshold at which the system will start throttling queries, to allow long-running queries to finish and release memory.
The SysMinFreePct setting indicates that memory usage has crossed a critical threshold at which queries will start aborting automatically, to prevent a GPE crash and preserve system stability.
By default, SysMinFreePct is set at 10%, at which point Queries will be aborted.
Example:
SysAlertFreePct=30 means that when system memory consumption exceeds 70% of memory, the system enters the alert state and graph updates start to slow down.
SysMinFreePct=20 means that 20% of memory is required to be free. When memory consumption enters the critical state (over 80% memory consumption), queries will be aborted automatically.
The TigerGraph graph data store uses a proprietary encoding scheme which both compresses the data and obscures the data unless the user knows the encoding/decoding scheme. In addition, the TigerGraph system supports integration with industry-standard methods for encrypting data when stored in disk ("data at rest").
Data at rest encryption can be applied at many different levels. A user can choose to use one or more levels.
File system encryption employs advanced encryption algorithms. Some tools allow the user to select from a menu of encryption algorithms. It can be done either in kernel mode or user mode. To run in kernel mode, superuser permission is required.
Since Linux 2.6, the device-mapper infrastructure has provided a generic way to create virtual layers of block devices, with transparent block encryption using the kernel crypto API.
If root privilege is not available, a workaround is to use FUSE (Filesystem in User Space) to create a user-level filesystem running on top of the host operating system. While the performance may not be as good as running in kernel mode, there are more options available for customization and tuning.
In this example, we use dm-crypt to provide kernel-mode file system encryption. The dm-crypt utility is widely available and offers a choice of encryption algorithms. It also can be set to encrypt various units of storage – full disk, partitions, logical volumes, or files.
The basic idea of this solution is to create a file, map an encrypted file system to it, and mount it as a storage directory for TigerGraph with R/W permission only to authorized users.
Before you start, you will need a Linux machine on which
you have root permission,
the TigerGraph system has not yet been installed,
and you have sufficient disk space for the TigerGraph data you wish to encrypt. This may be on your local disk or on a separate disk you have mounted.
Install cryptsetup (cryptsetup is included with Ubuntu, but other OS users may need to install it with yum).
Install the TigerGraph system.
Grant sudo privilege to the TigerGraph OS user.
Stop all TigerGraph services with the following commands:
gadmin stop all -y
gadmin stop admin -y
Acting as the tigergraph OS user, run the following export commands to set variables. Replace the placeholders enclosed in angle brackets <...> with the values of your choice:
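A sketch of the variables used in the following steps (all values are placeholders to replace with your own):

```shell
# OS user that owns the TigerGraph installation
export db_user=tigergraph
# Backing file that will hold the encrypted filesystem
export encrypted_file_path=/home/tigergraph/secretfs.img
# Password used to initialize the encrypted volume
export encryption_password='<your-strong-password>'
# TigerGraph data root to be moved onto the encrypted filesystem
export tigergraph_data_root=/home/tigergraph/tigergraph/data
```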
Create a file for TigerGraph data storage.
Change the permission of the file so that only the owner of the file (that is, only the tigergraph user who created the file in the previous step) will be able to access it:
Associate a loopback device with the file:
Encrypt the storage in the device. cryptsetup will use the Linux device mapper to create, in this case, $encrypted_file_path . Initialize the volume and interactively set the password to the value of $encryption_password :
If you are trying to automate the process with a script running in a root TTY session, you may use the following command:
Open the partition, and create a mapping to $encrypted_file_path :
If you are trying to automate the process with a script running in a root TTY session, you may use the following command:
Clear the password from bash variables and bash history.
The following commands may clear your previous bash histories as well. Instead, you may edit ~/.bash_history to selectively delete the related entries.
Create a file system and verify its status:
Mount the new file system to /mnt/secretfs:
Change the permission to 700 so that only $db_user has access to the file system:
Move the original TigerGraph files to the encrypted filesystem and make a symbolic link. If you wish to encrypt only the TigerGraph data store (called gstore), use the following commands:
There are other TigerGraph files which you might also consider to be sensitive and wish to encrypt. These include the dictionary, kafka data files, and log files. You could selectively identify files to protect, or you could encrypt the entire TigerGraph folder (App/Data/Log/TempRoot). In that case, simply move $tigergraph_data_root instead of $tigergraph_data_root/gstore.
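The sequence of steps above, whose original command listings are missing, can be sketched as one script (requires root where noted; the loop device, sizes, and mount point are assumptions and may differ on your system):

```shell
# 1. Create a 100 GB sparse backing file for the encrypted filesystem
dd if=/dev/zero of="$encrypted_file_path" bs=1M count=0 seek=102400

# 2. Restrict access to the file's owner
chmod 600 "$encrypted_file_path"

# 3. Attach a loopback device to the file (requires root)
sudo losetup /dev/loop0 "$encrypted_file_path"

# 4. Initialize the encrypted volume (prompts for the password)
sudo cryptsetup -y luksFormat /dev/loop0

# 5. Open the volume and map it as "secretfs"
sudo cryptsetup luksOpen /dev/loop0 secretfs

# 6. Create a filesystem, mount it, and restrict access to $db_user
sudo mkfs.ext4 /dev/mapper/secretfs
sudo mkdir -p /mnt/secretfs
sudo mount /dev/mapper/secretfs /mnt/secretfs
sudo chown "$db_user":"$db_user" /mnt/secretfs
sudo chmod 700 /mnt/secretfs

# 7. Move the gstore onto the encrypted filesystem and symlink it back
mv "$tigergraph_data_root/gstore" /mnt/secretfs/gstore
ln -s /mnt/secretfs/gstore "$tigergraph_data_root/gstore"
```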
TigerGraph's data is now stored in an encrypted filesystem. It will be automatically decrypted when the tigergraph user (and only this user) accesses it.
To automatically deploy this encryption solution, you may
Chain all the steps as a bash script
Remove all "sudo" since the script will be running as root.
Run the script as root user after TigerGraph Installation.
The setup scripts contain your encryption password. To follow good security procedures, do not leave your password in plaintext format in any files on your disk. Either remove the setup scripts or edit out the password.
Encryption is usually CPU-bound rather than I/O-bound. If CPU usage remains below 100%, encryption should not cause much performance slowdown. A performance test using both small and large queries supports this prediction: for small (~1 sec) and large (~100 sec) queries, there is a ~5% slowdown due to filesystem encryption.
The basic idea of this solution is to create a file, map an encrypted file system to it, and mount it as a storage directory for TigerGraph with permission only to authorized users.
Angle brackets <...> are used to mark placeholders which you should replace with your own values (without the angle brackets).
If you don't have a KMS key, you can create it first:
Select Create Key , and type in <your-key-alias>
In Step 4 : Define Key Usage Permissions , select <your-role-name>
The role now has permission to use the key.
In this section, you launch a new EC2 instance with the new IAM role and a bootstrap script that executes the steps to encrypt the file system.
The script in this section requires root permission, and it cannot be run manually through an ssh tunnel or by an unprivileged user.
In Step 3: Configure Instance Details
In IAM role , choose <your-role-name>
In User Data , paste the following code block after replacing the placeholders with your values and appending TigerGraph installation script
It may take a few minutes for the script to complete after system launch.
Then, you should be able to launch one or more EC2 machines with an encrypted folder under /mnt/secretfs that only OS user tigergraph can access.
Encryption is usually CPU-bound rather than I/O bound. If CPU usage is below 100%, TigerGraph tests show no significant performance downgrade.
In Ubuntu, full-disk encryption is an option during the OS installation process. For other Linux distributions, the disk can be encrypted with a utility such as dm-crypt.
A commonly used utility is dm-crypt, which is licensed under the GPL and is built into some kernels, such as Ubuntu's.
We used the TPC-H dataset with scale factor 10. The data size is 23GB after loading into TigerGraph. The write test (data loading) was done by running a loading job and then killing the GPE with SIGTERM (to exit gracefully) to ensure that all kafka data is consumed. The read test (GSE cold start) measures the time from "gadmin start gse" until "online" appears in "gadmin status gse".
Major cloud service providers often provide their own methodologies for encrypting data at rest. For Amazon EC2, we recommend users start by reading the AWS Security Blog.
In this section, we provide a simple example of configuring file system encryption for TigerGraph running on Amazon EC2. The steps are adapted from AWS documentation, with some additions and modifications.
Make sure you have installed the AWS CLI and configured it with your keys locally.
From the AWS IAM console, choose Encryption keys from the navigation pane.
For Step 2 and Step 3, see the AWS KMS documentation for advice.
In the EC2 console, launch a new instance using Amazon Linux AMI 2017.09.1 (HVM), SSD Volume Type. (If NOT using an Amazon Linux AMI, a script that installs python, pip, and the AWS CLI needs to be added at the beginning.)
Encryption Level | Description | TigerGraph Support |
Hardware | Use specialized hard disks which perform automatic encryption on write and decryption on read (by authorized OS users) | Invisible to TigerGraph |
Kernel-level file system | Use Linux built-in utilities to encrypt data. Root privilege required. | Invisible to TigerGraph |
User-level file system | Use Linux built-in utilities and customized libraries to encrypt data. Root privilege is not required. | Invisible to TigerGraph |
 | GSE Cold Start (read) | Load Data (write) |
original | 45s | 809s |
encrypted | 47s | 854s |
% slowdown | 4.4% | 5.8% |