Backup and Restore
GBAR - Graph Backup and Restore
Graph Backup And Restore (GBAR) is an integrated tool for backing up and restoring the data and data dictionary (schema, loading jobs, and queries) of a TigerGraph instance or cluster.
The backup feature packs TigerGraph data and configuration information into a directory on the local disk or a remote AWS S3 bucket. Multiple backup files can be archived. Later, you can use the restore feature to roll back the system to any backup point. This tool can also be integrated easily with Linux cron to perform periodic backup jobs.
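As a sketch of the cron integration mentioned above, a minimal wrapper script might look like the following. The script name, schedule, log path, and credentials are illustrative assumptions; only the gbar backup syntax and the GSQL_USERNAME/GSQL_PASSWORD variables come from this page.

```shell
#!/bin/sh
# nightly-backup.sh -- hypothetical cron wrapper for gbar (illustrative only).
# Tag backups by weekday (e.g. "nightly-Mon") for a simple weekly rotation.
TAG="nightly-$(date +%a)"

# GBAR reads credentials from these environment variables.
export GSQL_USERNAME=tigergraph
export GSQL_PASSWORD=tigergraph

# Guard so the script degrades gracefully on hosts without gbar installed.
if command -v gbar >/dev/null 2>&1; then
    gbar backup -t "$TAG"
else
    echo "gbar not found; would have run: gbar backup -t $TAG"
fi
```

A crontab entry such as `30 2 * * * /home/tigergraph/bin/nightly-backup.sh >> /tmp/gbar-cron.log 2>&1` (paths assumed) would then run the backup nightly at 02:30.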

Syntax

The current version of GBAR is intended for restoring the same machine that was backed up. For help with cloning a database (i.e., backing up machine A and restoring the database to machine B), please contact [email protected].
Synopsis
Usage: gbar backup [options] -t <backup_tag>
       gbar restore [options] <backup_tag>
       gbar list [backup_tag] [-j]
       gbar remove|rm <backup_tag>
       gbar cleanup
       gbar expand [-a] <new_nodes>
         New nodes must be written in <name>:<host> pairs separated by comma
         Example:
           m1:192.168.1.2,m2:192.168.1.3,m3:192.168.1.4

Options:
  -h, --help       Show this help message and exit
  -v               Run with debug info dumped
  -vv              Run with verbose debug info dumped
  -y               Run without prompt
  -j               Print gbar list as JSON
  -t BACKUP_TAG    Tag for backup file, required on backup
  -a, --advanced   Enable advanced mode for node expansion
The -y option forces GBAR to skip interactive prompt questions by selecting the default answer. There is currently one interactive question:
  • At the start of restore, GBAR asks whether it is okay to stop and reset the TigerGraph services (y/N). The default answer is yes.

Configure GBAR

Before using the backup or the restore feature, GBAR must be configured.
  1. Run gadmin config entry system.backup. At each prompt, enter the appropriate value for each config parameter.

    $ gadmin config entry system.backup

    System.Backup.TimeoutSec [ 18000 ]: The backup timeout in seconds
    New: 18000

    System.Backup.CompressProcessNumber [ 8 ]: The number of concurrent processes for compression during backup
    New: 8

    System.Backup.Local.Enable [ true ]: Backup data to local path
    New: true

    System.Backup.Local.Path [ /tmp/backup ]: The path to store the backup files
    New: /data/backup

    System.Backup.S3.Enable [ false ]: Backup data to S3 path
    New: false

    System.Backup.S3.AWSAccessKeyID [ <masked> ]: The AWS access key ID
    New:

    System.Backup.S3.AWSSecretAccessKey [ <masked> ]: The AWS secret access key
    New:

    System.Backup.S3.BucketName [ ]: The S3 bucket name
    New:
  2. After entering the configuration values, run the following command to apply the new configuration:

    gadmin config apply -y
Note:
  • You can specify the number of parallel compression processes for backup and restore via System.Backup.CompressProcessNumber.
  • You must provide the GSQL username and password through the GSQL_USERNAME and GSQL_PASSWORD environment variables:

    $ GSQL_USERNAME=tigergraph GSQL_PASSWORD=tigergraph gbar backup -t daily

Perform a backup

To perform a backup, run the following command as the TigerGraph Linux user:
gbar backup -t <backup_tag>
Depending on your configuration settings, your backup archive will be output to your local backup path and/or your AWS S3 bucket. If you are running a cluster, there will be a backup archive on every node in the same path.
A backup archive is stored as several files in a folder rather than as a single file. The backup tag acts as a name prefix: the full name of the backup archive is <backup_tag>-<timestamp>, which is a subfolder of the backup repository.
  • If System.Backup.Local.Enable is set to true, the folder is a local folder on every node in a cluster, to avoid moving massive amounts of data across nodes.
  • If System.Backup.S3.Enable is set to true, every node uploads its local data to the S3 repository. Therefore, every node in a cluster needs access to Amazon S3.
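The naming scheme can be sketched in shell as follows. The daily tag, the YYYYMMDDHHMMSS timestamp format, and the /data/backup path are assumptions taken from the examples elsewhere on this page (where the archive is named daily-20180607232159):

```shell
#!/bin/sh
# Illustrative only: how a backup archive folder name is composed.
BACKUP_TAG="daily"
TIMESTAMP="$(date +%Y%m%d%H%M%S)"   # e.g. 20180607232159
ARCHIVE_NAME="${BACKUP_TAG}-${TIMESTAMP}"

# With System.Backup.Local.Path set to /data/backup, the archive would be
# the subfolder:
echo "/data/backup/${ARCHIVE_NAME}"
```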
GBAR backup performs a live backup, meaning that normal operations may continue while the backup is in progress. When the backup starts, GBAR checks whether any loading jobs are running. If there are, it pauses loading for 1 minute to generate a snapshot and then continues the backup process. You can adjust the pause interval with the PAUSE_LOADING environment variable.
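For example, the pause interval could be raised for a single run as below. The value's unit is an assumption (the text above says only that PAUSE_LOADING controls the pause interval, with a one-minute default), and the guard keeps the snippet harmless on machines without gbar:

```shell
#!/bin/sh
# Run a backup with a longer loading pause (120 is assumed to be seconds).
if command -v gbar >/dev/null 2>&1; then
    PAUSE_LOADING=120 gbar backup -t daily
else
    echo "gbar not found; would have run: PAUSE_LOADING=120 gbar backup -t daily"
fi
```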
GBAR then sends a request to the admin server, which in turn asks the GPE and GSE to create snapshots of their data. Per the request, the GPE and GSE store their data under GBAR's own working directory. GBAR also contacts the Dictionary directly and obtains a dump of its system configuration information. In addition, GBAR gathers the TigerGraph system version and customized information, including user-defined functions, token functions, schema layouts, and user-uploaded icons. GBAR then compresses each of these data and configuration files into tgz format and stores them in the <backup_tag>-<timestamp> subfolder on each node. As the last step, GBAR copies that folder to local storage or AWS S3, according to the config settings, and removes all temporary files generated during backup.
The current version of GBAR Backup takes snapshots quickly to make it very likely that all the components (GPE, GSE, and Dictionary) are in a consistent state, but it does not fully guarantee consistency.
Backup does not save input message queues for REST++ or Kafka.

List Backup Files

gbar list
This command lists all backups in the configured storage location. For each backup, it shows the full tag, the size in human-readable format, and the creation time.

Restore from a backup archive

Before restoring a backup, ensure that the backup was taken from the exact same version of TigerGraph as the system you are restoring to.
To restore a backup, run the following command:
gbar restore <archive_name>
If GBAR can verify that the backup archive exists and that the backup's system version is compatible with the current system version, GBAR will shut down the TigerGraph servers temporarily as it restores the backup. After completing the restore, GBAR will restart the TigerGraph servers. If you are running a cluster, and you have copied the backup files to each individual node in the cluster, running gbar restore on any node will restore the entire cluster.
Restore is an offline operation: the data services must be temporarily shut down. The user must specify the full archive name (<backup_tag>-<timestamp>) to be restored. When GBAR restore begins, it proceeds as follows:
  1. Searches for a backup archive exactly matching the archive name supplied on the command line.
  2. Decompresses the backup files to a working directory.
  3. Compares the TigerGraph system version in the backup archive with the current system's version, to make sure the archive is compatible with the current system.
  4. Temporarily shuts down the TigerGraph servers (GSE, RESTPP, etc.).
  5. Makes a copy of the current graph data, as a precaution.
  6. Copies the backup graph data into the GPE and GSE and notifies the Dictionary to load the configuration data.
  7. Notifies the GST to load the backup user data and copies the backup user-defined token functions and functions to the right location.
  8. Restarts the TigerGraph servers.
Note: GBAR restore does not estimate the uncompressed data size or check whether there is sufficient disk space.
The primary purpose of GBAR is to save snapshots of the data configuration of a TigerGraph system, so that in the future the same system can be rolled back (restored) to one of the saved states. A key assumption is that Backup and Restore are performed on the same machine, and that the file structure of the TigerGraph software has not changed.
Restore needs enough free space to accommodate both the old gstore and the gstore to be restored.
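Since GBAR does not check disk space itself, a rough manual pre-check can be sketched as below. The gstore path is an assumption based on the example output later on this page, and the heuristic of requiring free space at least equal to the current gstore size follows the sentence above:

```shell
#!/bin/sh
# Rough pre-restore free-space check (illustrative; path is an assumption).
GSTORE_DIR="${GSTORE_DIR:-/home/tigergraph/tigergraph/gstore}"

# Size of the current gstore in KB (restore keeps a copy of it).
current_kb=$(du -sk "$GSTORE_DIR" 2>/dev/null | awk '{print $1}')
current_kb=${current_kb:-0}

# Free space in KB on the filesystem holding the gstore (fall back to /).
free_kb=$(df -Pk "$GSTORE_DIR" 2>/dev/null | awk 'NR==2 {print $4}')
free_kb=${free_kb:-$(df -Pk / | awk 'NR==2 {print $4}')}

# Restore must hold both the old gstore and the restored one, so demand
# at least the current gstore size in free space.
if [ "$free_kb" -ge "$current_kb" ]; then
    echo "likely enough space: ${free_kb} KB free, gstore is ${current_kb} KB"
else
    echo "WARNING: only ${free_kb} KB free for a ${current_kb} KB gstore"
fi
```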

Remove a backup

To remove a backup, run the gbar remove command:
$ gbar remove <backup_tag>
The command removes a backup from the backup storage path. To retrieve the tag of a backup, you can use the gbar list command.
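A simple retention sketch is shown below. It assumes local backups (System.Backup.Local.Enable set to true) stored as one subfolder per archive under the configured path, and it prunes by folder age rather than via gbar remove, since the exact gbar list output format is not specified here; treat it as illustrative only.

```shell
#!/bin/sh
# Illustrative retention sketch: delete local archive folders older than
# KEEP_DAYS. Assumes one subfolder per archive under BACKUP_ROOT.
BACKUP_ROOT="${BACKUP_ROOT:-/data/backup}"
KEEP_DAYS=14

if [ -d "$BACKUP_ROOT" ]; then
    # -mindepth/-maxdepth 1: only the archive folders themselves.
    find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d \
        -mtime +"$KEEP_DAYS" -print -exec rm -rf {} +
else
    echo "no backup root at $BACKUP_ROOT; nothing to prune"
fi
```

In a cluster, backup folders exist on every node, so a pruning job like this would have to run on each node.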

Clean up temporary files

Run gbar cleanup to delete the temporary files created during backup or restore operations:
$ gbar cleanup

GBAR Detailed Example

The following walkthrough shows the actual commands, the expected output, and the amount of time and disk space used for a given set of graph data. For this example, an Amazon EC2 instance was used, with the following specifications:
Single instance with 32 vCPU + 244GB memory + 2TB HDD.
Naturally, backup and restore time will vary depending on the hardware used.

GBAR Backup Operational Details

To run a daily backup, we tell GBAR to back up with the tag daily.
$ gbar backup -t daily
[23:21:46] Retrieve TigerGraph system configuration
[23:21:51] Start workgroup
[23:21:59] Snapshot GPE/GSE data
[23:33:50] Snapshot DICT data
[23:33:50] Calc checksum
[23:37:19] Compress backup data
[23:46:43] Pack backup data
[23:53:18] Put archive daily-20180607232159 to repo-local
[23:53:19] Terminate workgroup
Backup to daily-20180607232159 finished in 31m33s.
The total backup process took about 31 minutes, and the generated archive is about 49 GB. Dumping the GPE + GSE data to disk took 12 minutes. Compressing and packing the files took another 16 minutes.

GBAR Restore Operational Details

To restore from a backup archive, a full archive name must be provided, such as daily-20180607232159. By default, restore asks the user to confirm before continuing. To pre-approve this action, use the -y option, which makes GBAR select the default answer for you.
$ gbar restore daily-20180607232159
[23:57:06] Retrieve TigerGraph system configuration
GBAR restore needs to reset TigerGraph system.
Do you want to continue?(y/N):y
[23:57:13] Start workgroup
[23:57:22] Pull archive daily-20180607232159, round #1
[23:57:57] Pull archive daily-20180607232159, round #2
[00:01:00] Pull archive daily-20180607232159, round #3
[00:01:00] Unpack cluster data
[00:06:39] Decompress backup data
[00:17:32] Verify checksum
[00:18:30] gadmin stop gpe gse
[00:18:36] Snapshot DICT data
[00:18:36] Restore cluster data
[00:18:36] Restore DICT data
[00:18:36] gadmin reset
[00:19:16] gadmin start
[00:19:41] reinstall GSQL queries
[00:19:42] recompiling loading jobs
[00:20:01] Terminate workgroup
Restore from daily-20180607232159 finished in 22m55s.
Old gstore data saved under /home/tigergraph/tigergraph/gstore with suffix -20180608001836, you need to remove them manually.
For our test, GBAR restore took about 23 minutes. Most of the time (about 17 minutes) was spent unpacking and decompressing the backup archive.
Note that after the restore is done, GBAR informs you where the pre-restore graph data (gstore) has been saved. After you have verified that the restore was successful, you may want to delete the old gstore files to free up disk space.
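Based on the message in the example output above (old data saved under the gstore directory with a -<timestamp> suffix), a cautious cleanup could start with a dry-run listing like this; the path and suffix pattern are taken from that example and may differ on your system:

```shell
#!/bin/sh
# Dry run: list old gstore copies left behind by restore (illustrative only).
GSTORE_DIR="${GSTORE_DIR:-/home/tigergraph/tigergraph/gstore}"

if [ -d "$GSTORE_DIR" ]; then
    # Old data carries a -<timestamp> suffix, e.g. -20180608001836.
    find "$GSTORE_DIR" -name '*-20[0-9][0-9]*' -print
else
    echo "no gstore directory at $GSTORE_DIR"
fi

# Only after verifying the restore, delete a specific suffix, e.g.:
#   find "$GSTORE_DIR" -name '*-20180608001836' -exec rm -rf {} +
```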

Performance Summary of Example

GStore size    Backup file size    Backup time    Restore time
219 GB         49 GB               31 mins        23 mins