Admin Portal, gadmin utility, GBAR backup and restore
GBAR - Graph Backup and Restore
GBAR (Graph Backup And Restore) is an integrated tool for backing up and restoring the data and data dictionary (schema, loading jobs, and queries) of a single TigerGraph node. In Backup mode, it packs TigerGraph data and configuration information in a single file onto disk or a remote AWS S3 bucket. Multiple backup files can be archived. Later, you can use the Restore mode to roll back the system to any backup point. This tool can also be integrated easily with Linux cron to perform periodic backup jobs.
The current version of GBAR is intended for restoring the same machine that was backed up. For help with cloning a database (i.e., backing up machine A and restoring the database to machine B), please contact support@tigergraph.com .
The -y option forces GBAR to skip interactive prompt questions by selecting the default answer. There is currently one interactive question:
At the start of restore, GBAR will always ask if it is okay to stop and reset the TigerGraph services: (y/N)? The default answer is yes.
Config
For S3 configuration, if the AWS access key and secret are not provided, GBAR will use the attached IAM role.
You can specify the number of parallel processes for backup and restore.
If GSQL authentication is enabled, you must provide a username and password.
Backup
A backup archive is stored as several files in a folder, rather than as a single file.
Distributed backup performance is improved.
Restore
To select a backup archive to restore, the full backup name must be specified.
Restore asks fewer interactive questions than before:
The user must provide a full archive name; there is no option to select the latest from a set of archives.
GBAR restore does not estimate the uncompressed data size or check whether there is sufficient disk space.
GBAR Config must be run before using GBAR backup/restore functionality. GBAR Config will open the following configuration template interactively in a text editor. Using the comments as a guide, edit the configuration file to set the configuration parameters according to your own needs.
If you do not wish to store the username and password in the config file, you can prepend the user login credentials, as environment variables, to the gbar command you wish to run.
Leaving the config file's username and password fields blank will require you to manually prepend the login information to the gbar command, as seen below.
The backup_tag acts like a filename prefix for the archive filename. The full name of the backup archive will be <backup_tag>-<timestamp>, which is a subfolder of the backup repository. If store_local is true, the folder is a local folder on every node in a cluster, which avoids moving large amounts of data across nodes. If store_s3 is true, every node will upload the data located on that node to the S3 repository. Therefore, every node in a cluster needs access to Amazon S3. If an IAM policy is used for authentication, the policy must be attached to every node in the cluster.
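Because restore requires the full archive name and offers no "latest" shortcut, it can help to pick the newest archive for a tag in a small script. The following is a minimal sketch, not part of GBAR. It assumes the timestamp is 14 digits in YYYYMMDDHHMMSS order, which is inferred from the example name daily-20180607232159; the function names are hypothetical.

```python
from datetime import datetime

# Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from "daily-20180607232159".
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def split_archive_name(archive_name):
    """Split '<backup_tag>-<timestamp>' into (backup_tag, datetime)."""
    # rpartition splits on the LAST hyphen, so tags containing hyphens still work.
    backup_tag, _, stamp = archive_name.rpartition("-")
    if not backup_tag or len(stamp) != 14 or not stamp.isdigit():
        raise ValueError(f"not a <backup_tag>-<timestamp> name: {archive_name!r}")
    return backup_tag, datetime.strptime(stamp, TIMESTAMP_FORMAT)


def latest_archive(archive_names, backup_tag):
    """Return the newest archive name carrying backup_tag, or None if there is none."""
    matching = [n for n in archive_names if split_archive_name(n)[0] == backup_tag]
    return max(matching, key=lambda n: split_archive_name(n)[1], default=None)
```

The archive names passed to latest_archive would come from the backup repository listing; the full name it returns is what restore expects.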
GBAR Backup performs a live backup, meaning that normal operations may continue while backup is in progress. When GBAR backup starts, it sends a request to gadmin, which then requests the GPE and GSE to create snapshots of their data. Per the request, the GPE and GSE store their data under GBAR’s own working directory. GBAR also directly contacts the Dictionary and obtains a dump of its system configuration information. In addition, GBAR records the TigerGraph system version. Then, GBAR compresses each of these data and configuration files in tgz format and stores them in the <backup_tag>-<timestamp> subfolder on each node. As the last step, GBAR copies the archive to local storage or AWS S3, according to the Config settings, and removes all temporary files generated during backup.
The current version of GBAR Backup takes snapshots quickly to make it very likely that all the components (GPE, GSE, and Dictionary) are in a consistent state, but it does not fully guarantee consistency. It is highly recommended that no active data update be in progress when the backup command is issued. A no-write period of about 5 seconds is sufficient.
Backup does not save input message queues for REST++ or Kafka.
This command lists all generated backup files in the storage place configured by the user. For each file, it shows the file’s full tag, file’s size in human readable format, and its creation time.
Restore is an offline operation, requiring the data services to be temporarily shut down. The user must specify the full archive name ( <backup_tag>-<timestamp> ) to be restored. When GBAR restore begins, it first searches for a backup archive exactly matching the archive_name supplied on the command line. Then it decompresses the backup files to a working directory. Next, GBAR compares the TigerGraph system version in the backup archive with the current system's version, to make sure that the backup archive is compatible with the current system. It then shuts down the TigerGraph servers (GSE, RESTPP, etc.) temporarily. Then, GBAR makes a copy of the current graph data, as a precaution. Next, GBAR copies the backup graph data into the GPE and GSE and notifies the Dictionary to load the configuration data. When these actions are all done, GBAR restarts the TigerGraph servers.
The primary purpose of GBAR is to save snapshots of the data configuration of a TigerGraph system, so that in the future the same system can be rolled back (restored) to one of the saved states. A key assumption is that Backup and Restore are performed on the same machine, and that the file structure of the TigerGraph software has not changed. Specific requirements are listed below.
Restore Requirements and Limitations
Restore is supported if the TigerGraph system has had only minor version updates since the backup.
TigerGraph version numbers have the format X.Y[.Z], where X is the major version number and Y is the minor version number.
Restore is supported if the backup archive and the current system have the same major version number AND the current system has a minor version number that is greater than or equal to the backup archive minor version number.
Backup archives from a 0.8.x system cannot be Restored to a 1.x system.
Examples:
Restore needs enough free space to accommodate both the old gstore and the gstore to be restored.
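Since GBAR restore does not check disk space itself, a pre-flight check can be scripted. The sketch below is illustrative only and not part of GBAR. It makes the conservative assumption that both the precautionary copy of the current gstore and the restored gstore need fresh free space; the function names are hypothetical, and the sizes must be supplied by the caller.

```python
import shutil


def free_bytes_on(path):
    """Free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free


def enough_space_for_restore(free_bytes, current_gstore_bytes, restored_gstore_bytes):
    """Conservative check: room for a copy of the old gstore plus the restored gstore.

    Assumption: both need new space. The real requirement may be smaller
    if the old gstore is moved rather than copied.
    """
    return free_bytes >= current_gstore_bytes + restored_gstore_bytes
```

A caller would pass free_bytes_on() of the gstore's filesystem together with its own estimates of the two gstore sizes.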
The following is a real example, showing the actual commands, the expected output, and the amount of time and disk space used for a given set of graph data. For this example, an Amazon EC2 instance was used, with the following specifications:
Single instance with 32 vCPU + 244GB memory + 2TB HDD.
Naturally, backup and restore time will vary depending on the hardware used.
To run a daily backup, we tell GBAR to backup with the tag name daily .
The total backup process took about 31 minutes, and the generated archive is about 49 GB. Dumping the GPE + GSE data to disk took 12 minutes. Compressing the files took another 20 minutes.
To restore from a backup archive, a full archive name needs to be provided, such as daily-20180607232159 . By default, restore asks the user for approval before continuing. To pre-approve these actions, use the "-y" option, and GBAR will make the default choice for you.
For our test, GBAR restore took about 23 minutes. Most of the time (20 minutes) was spent decompressing the backup archive.
Note that after the restore is done, GBAR informs you where the pre-restore graph data (gstore) has been saved. After you have verified that the restore was successful, you may want to delete the old gstore files to free up disk space.
| Backup archive's system version | Current system version | Restore is allowed? |
|---|---|---|
| 0.8 | 1.0 | NO - Major versions differ |
| 1.1 | 1.1 | YES - Major and minor versions are the same |
| 1.1 | 1.2 | YES - Major versions are the same; current minor version > archived minor version |
| 1.1 | 1.0 | NO - Major versions are the same; current minor version < archived minor version |
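The version rule stated under Restore Requirements and Limitations (same major version, and current minor version greater than or equal to the archived minor version) can be expressed as a short check. This is a minimal sketch of the stated rule, not GBAR's actual implementation; the function names are illustrative.

```python
def parse_version(version):
    """Return (major, minor) from a version string of the form X.Y[.Z]."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def restore_allowed(backup_version, current_version):
    """Apply the documented rule: same major, current minor >= archived minor."""
    backup_major, backup_minor = parse_version(backup_version)
    current_major, current_minor = parse_version(current_version)
    return backup_major == current_major and current_minor >= backup_minor
```

Applied to the four example version pairs above, it yields NO, YES, YES, and NO respectively.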
| GStore size | Backup file size | Backup time | Restore time |
|---|---|---|---|
| 219GB | 49GB | 31 mins | 23 mins |
Managing TigerGraph Servers with gadmin
TigerGraph Graph Administrator (gadmin) is a tool for managing TigerGraph servers. It has a self-contained help function and a man page, whose output is shown below for reference. If you are unfamiliar with the TigerGraph servers, please see GET STARTED with TigerGraph.
To see a listing of all the options or commands available for gadmin, run any of the following commands:
After changing a configuration setting, it is generally necessary to run gadmin config-apply. Some commands invoke config-apply automatically. If you are not certain, just run config-apply.
Below is the man page for gadmin. Most of the commands are self-explanatory.
Checking the status of TigerGraph component servers:
Use "gadmin status" to report whether each of the main component servers is running (up) or stopped (off). The example below shows the normal status when the graph store is empty and a graph schema has not been defined:
Stopping a particular server, such as the rest server (name is "restpp"):
Changing the retention size of queue to 10GB:
A TigerGraph license key is initially set up during the installation process. If you have obtained a new license key, run the command
to install your new key. You should then follow this with
TigerGraph offers two levels of memory thresholds using the following configuration settings:
SysAlertFreePct and SysMinFreePct
The SysAlertFreePct setting defines the threshold at which the system starts throttling queries so that long-running queries can finish and release memory.
The SysMinFreePct setting defines the critical threshold at which queries start aborting automatically, to prevent a GPE crash and preserve system stability.
By default, SysMinFreePct is set to 10%, at which point queries will be aborted.
Example:
SysAlertFreePct=30 means that when memory consumption exceeds 70% of system memory, the system will enter the alert state and graph updates will start to slow down.
SysMinFreePct=20 means 20% of the memory is required to be free. When memory consumption enters the critical state (over 80% memory consumption), queries will be aborted automatically.
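The relationship between the two thresholds and the resulting system state can be sketched as follows. This is only an illustration of the arithmetic in the example above, not TigerGraph's internal logic; the function name and the state labels are hypothetical.

```python
def memory_state(used_pct, sys_alert_free_pct, sys_min_free_pct):
    """Classify memory usage against the two free-memory thresholds.

    used_pct is the percentage of system memory currently in use.
    """
    free_pct = 100 - used_pct
    if free_pct < sys_min_free_pct:
        return "critical"  # queries are aborted automatically
    if free_pct < sys_alert_free_pct:
        return "alert"     # the system starts throttling
    return "normal"
```

With SysAlertFreePct=30 and SysMinFreePct=20, 75% usage falls in the alert state and 85% usage falls in the critical state.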
TigerGraph Admin Portal UI Guide
The TigerGraph Admin Portal is a browser-based dashboard which provides users an overview of a running TigerGraph system, from an application and infrastructure point of view. It also allows the users to configure the TigerGraph system through a user-friendly interface. This guide serves as an introduction and quick-start manual for Admin Portal.
As of June 2018, the Admin Portal is certified on the following browsers:
Not all features are guaranteed to work on other browsers.
Please make sure to enable JavaScript and cookies in your browser settings.
The Admin Portal and GraphStudio share the same port (14240). If you are logged in to one of the servers for your TigerGraph system, then you can use localhost for your <tigergraph_server_ip_address>. The Admin Portal is on the admin page:
If user authentication has been enabled, then users need to log in to access the Admin Portal.
If you are already at GraphStudio, simply click the Admin button at the right end of the top menu bar.
The Admin Portal has two pages: Dashboard and Configuration . Both pages have the same Header, Footer, and Navigation Menu.
The layout of the Admin Portal is responsive to screen size. The layout will automatically adjust for devices with small screens like phones and tablets.
The full screen version of the Admin Portal is shown below, with the Dashboard page selected.
The mobile version is shown below:
To view the full text, you can click on a notification to open a popup window containing the full message and its severity:
There are three severity levels: info, warning and error.
You can switch between a dark theme and light theme. The light theme is shown below:
To sign out of the Admin Portal, click on the Sign out button in the Account menu.
Clicking on the Help button will take you to the documentation page containing this guide.
Green indicates all services are online.
Gray means one or more service statuses are unknown.
Red means one of the component services is offline.
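The way a single indicator summarizes many service statuses can be sketched as a small function. This is an illustration, not the Admin Portal's code. The status strings are hypothetical, and the precedence of red over gray when both apply is an assumption, since the guide does not specify it.

```python
def overall_status_color(service_statuses):
    """Collapse per-service statuses into one indicator color.

    service_statuses maps a service name to "online", "offline", or "unknown"
    (hypothetical labels). Assumption: an offline service outranks an unknown one.
    """
    values = list(service_statuses.values())
    if any(status == "offline" for status in values):
        return "red"
    if any(status == "unknown" for status in values):
        return "gray"
    return "green"
```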
Clicking on the button will show you the list of statuses for the services in our system:
The Dashboard page has three main parts: Overall Statistics, the Time Range Picker, and several Charts.
Just below the page header, there are four cards showing statistics of our system, including number of nodes, number of graphs, number of vertices and number of edges. These statistics are refreshed live. (The default refresh interval is 1 minute).
"Now" means that the charts will be continually updated with the most recent data.
"Custom" lets you select a fixed date. The time range is historical, so the charts will be static.
The sliding bar on the right lets you fine tune the range. Click and drag an endpoint to adjust the start or end time.
Changing any of these selections will trigger a request for statistics data and the chart will be re-rendered accordingly.
Each chart displays some statistic or state information on the vertical axis and time on the horizontal axis.
There are two chart sections. The first section is GSQL Query Performance. This lists all of the queries accessible to the current user. If you click on a query name, the display will expand to show detailed charts about that query. You can expand only one query panel at a time. The second section is Cluster Monitoring. This lists all of the machines within the TigerGraph cluster. Similar to the first section, you can only expand one panel at a time.
A Query Monitoring Panel includes three charts:
QPS (number of Queries completed per second)
Timeout (fraction of the query calls which timed out and therefore did not finish)
Latency (minimum, maximum, and average time to complete a query)
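To make the three chart definitions concrete, the sketch below computes them from a list of query calls observed in one time window. It only illustrates the definitions listed above; how the Admin Portal actually collects and aggregates these values is not described here, and the QueryCall type and summarize function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryCall:
    latency_ms: float  # time to complete; ignored when timed_out is True
    timed_out: bool


def summarize(calls, window_seconds):
    """Compute QPS, timeout fraction, and latency stats for one time window."""
    completed = [c.latency_ms for c in calls if not c.timed_out]
    timeouts = len(calls) - len(completed)
    return {
        # QPS counts only queries that completed.
        "qps": len(completed) / window_seconds,
        # Fraction of all calls in the window that timed out.
        "timeout_fraction": timeouts / len(calls) if calls else 0.0,
        "latency_min_ms": min(completed) if completed else None,
        "latency_max_ms": max(completed) if completed else None,
        "latency_avg_ms": sum(completed) / len(completed) if completed else None,
    }
```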
A Machine Monitoring Panel includes 4 charts. The first three charts break down the information among three processing-focused components (GPE, GSE, RESTPP). The last chart breaks down information among three components which may have large storage needs (GStore, Log files, and Apache Kafka).
Service status: ON or OFF status for the given component
CPU Usage: percentage of available CPU time used by the given component
Memory Usage: GB used by the given component
Disk Usage: GB used by the given component
Currently (as of v2.2), the Configuration page supports one configuration operation: updating the GraphStudio license key.
Additional configuration operations, which are currently only available from a Linux console, will be added in future releases.
An example of the GraphStudio License Update panel is shown below. The panel displays the full information about your license, including the expiration date.
To apply a new license key, paste the key into the text box below "Enter GraphStudio license" and click Update.
Clicking on the Notification icon will open up a list of notifications. If a notification is too long, some of its content will be omitted:
The Account icon will open the user menu:
You can navigate to GraphStudio by clicking on.
The overall system status is always shown in the footer. This single indicator shows:
You can start or stop services from the Admin Portal by using the rightmost buttons (NOTE: ONLY a superuser can see these buttons).
Clicking on the Stop icon will stop all of the services in the TigerGraph system.
Clicking on the Start icon will start all of the services in the TigerGraph system (NOTE: because there is an interval between data collection periods, the real status of the system will not be reflected in the status section right away).
The next card lets you set the time range to be used for the statistics in the charts below. The leftmost input lets you select the start time of the range. The next input lets you select the end time of the range. This has two options:
| Browser | Supported version |
|---|---|
| Chrome | 54.0+ |
| Safari | 11.1+ |
| Firefox | 59.0+ |
| Opera | 52.0+ |