GBAR - Graph Backup and Restore
GBAR (Graph Backup And Restore) is an integrated tool for backing up and restoring the data and data dictionary (schema, loading jobs, and queries) of a single TigerGraph node. In Backup mode, it packs TigerGraph data and configuration information into a single file on local disk or in a remote AWS S3 bucket. Multiple backup files can be archived. Later, you can use Restore mode to roll the system back to any backup point. The tool can also be integrated easily with Linux cron to perform periodic backup jobs.
The current version of GBAR is intended for restoring the same machine that was backed up. For help with cloning a database (i.e., backing up machine A and restoring the database to machine B), please contact support@tigergraph.com .
The -y option forces GBAR to skip interactive prompt questions by selecting the default answer. There is currently one interactive question:
At the start of restore, GBAR will always ask if it is okay to stop and reset the TigerGraph services: (y/N)? The default answer is yes.
GBAR Config must be run before using GBAR backup/restore functionality.
Note:
For S3 configuration, if the AWS access key and secret are not provided, then GBAR will use the attached IAM role.
You can specify the number of parallel processes for backup and restore.
You must provide a username and password using the GSQL_USERNAME and GSQL_PASSWORD environment variables.
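For example, credentials can be exported in the shell before invoking a backup. This is a minimal sketch; the username, password, and tag name are placeholder assumptions, not values from this document:

```shell
# Provide GSQL credentials to GBAR via environment variables,
# then start a backup with the tag "daily".
export GSQL_USERNAME=tigergraph
export GSQL_PASSWORD=tigergraph
gbar backup -t daily
```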
A backup archive is stored as several files in a folder, rather than as a single file. The backup_tag acts like a filename prefix for the archive name. The full name of the backup archive will be <backup_tag>-<timestamp>, which is a subfolder of the backup repository. If System.Backup.Local.Enable is true, the folder is a local folder on every node in a cluster, to avoid moving massive amounts of data across nodes. If System.Backup.S3.Enable is true, every node uploads the data located on that node to the S3 repository; therefore, every node in a cluster needs access to Amazon S3. If an IAM policy is used for authentication, the policy must be attached to every node in the cluster.
GBAR Backup performs a live backup, meaning that normal operations may continue while the backup is in progress. When backup starts, GBAR checks whether a loading job is running; if so, it pauses loading for 1 minute and then continues the backup. (You can specify the loading pause interval with the PAUSE_LOADING environment variable.) GBAR then sends a request to the admin server, which in turn requests the GPE and GSE to create snapshots of their data. Per the request, the GPE and GSE store their data under GBAR's own working directory. GBAR also directly contacts the Dictionary and obtains a dump of its system configuration information. In addition, GBAR gathers the TigerGraph system version and customized information, including user-defined functions, token functions, schema layouts, and user-uploaded icons. GBAR then compresses each of these data and configuration files in tgz format and stores them in the <backup_tag>-<timestamp> subfolder on each node. As the last step, GBAR copies that subfolder to local storage or AWS S3, according to the Config settings, and removes all temporary files generated during the backup.
The current version of GBAR Backup takes snapshots quickly to make it very likely that all the components (GPE, GSE, and Dictionary) are in a consistent state, but it does not fully guarantee consistency.
Backup does not save input message queues for REST++ or Kafka.
This command lists all backup archives in the storage location configured by the user. For each archive, it shows the archive's full tag, its size in human-readable format, and its creation time.
Restore is an offline operation, requiring the data services to be temporarily shut down. The user must specify the full archive name (<backup_tag>-<timestamp>) to be restored. When GBAR restore begins, it first searches for a backup archive exactly matching the archive_name supplied on the command line. Then it decompresses the backup files to a working directory. Next, GBAR compares the TigerGraph system version in the backup archive with the current system's version, to make sure the backup archive is compatible with the current system. It then temporarily shuts down the TigerGraph servers (GSE, RESTPP, etc.). Then, GBAR makes a copy of the current graph data, as a precaution. Next, GBAR copies the backup graph data into the GPE and GSE and notifies the Dictionary to load the configuration data. GBAR also notifies the GST to load the backed-up user data and copies the backed-up user-defined tokens/functions to the right location. When these actions are all done, GBAR restarts the TigerGraph servers.
Note: GBAR restore does not estimate the uncompressed data size or check whether there is sufficient disk space.
The primary purpose of GBAR is to save snapshots of the data configuration of a TigerGraph system, so that in the future the same system can be rolled back (restored) to one of the saved states. A key assumption is that Backup and Restore are performed on the same machine, and that the file structure of the TigerGraph software has not changed. Specific requirements are listed below.
Restore Requirements and Limitations
Restore is supported if the TigerGraph system has had only minor version updates since the backup.
TigerGraph version numbers have the format X.Y[.Z], where X is the major version number and Y is the minor version number.
Restore is supported if the backup archive and the current system have the same major version number AND the current system has a minor version number that is greater than or equal to the backup archive minor version number.
Backup archives from a 0.8.x system cannot be Restored to a 1.x system.
Examples:
Restore needs enough free space to accommodate both the old gstore and the gstore to be restored.
The following is a real example, showing the actual commands, the expected output, and the amount of time and disk space used for a given set of graph data. For this example, an Amazon EC2 instance was used, with the following specifications:
Single instance with 32 vCPU + 244GB memory + 2TB HDD.
Naturally, backup and restore time will vary depending on the hardware used.
To run a daily backup, we tell GBAR to back up with the tag name daily.
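A nightly run can be scheduled with cron. This is a sketch only: the schedule, credentials, and the gbar install path are assumptions to adjust for your environment:

```shell
# crontab entry: run a backup tagged "daily" at 2:00 AM every day.
# -y skips the interactive prompt; credentials are passed inline.
0 2 * * * GSQL_USERNAME=tigergraph GSQL_PASSWORD=tigergraph /home/tigergraph/tigergraph/bin/gbar backup -y -t daily
```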
The total backup process took about 31 minutes, and the generated archive is about 49 GB. Dumping the GPE + GSE data to disk took 12 minutes. Compressing the files took another 20 minutes.
To restore from a backup archive, a full archive name needs to be provided, such as daily-20180607232159. By default, restore will ask the user for approval before continuing. If you want to pre-approve these actions, use the -y option, and GBAR will make the default choice for you.
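A minimal restore invocation might look like the following sketch, using the archive name from this example:

```shell
# Restore from a specific backup archive; -y pre-approves the
# prompt to stop and reset the TigerGraph services.
gbar restore -y daily-20180607232159
```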
For our test, GBAR restore took about 23 minutes. Most of the time (20 minutes) was spent decompressing the backup archive.
Note that after the restore is done, GBAR informs you where the pre-restore graph data (gstore) has been saved. After you have verified that the restore was successful, you may want to delete the old gstore files to free up disk space.
| Backup archive's system version | Current system version | Restore is allowed? |
|---|---|---|
| 0.8 | 1.0 | NO - Major versions differ |
| 1.1 | 1.1 | YES - Major and minor versions are the same |
| 1.1 | 1.2 | YES - Major versions are the same; current minor version > archived minor version |
| 1.1 | 1.0 | NO - Major versions are the same; current minor version < archived minor version |
| GStore size | Backup file size | Backup time | Restore time |
|---|---|---|---|
| 219GB | 49GB | 31 mins | 23 mins |
Export/Import is a complement to Backup/Restore, not a substitute.
The GSQL EXPORT and IMPORT commands perform a logical backup and restore. A database export contains the database's data, and optionally some types of metadata, which can be subsequently imported in order to recreate the same database, in the original or in a different TigerGraph platform instance.
Available to the superuser role only.
The EXPORT GRAPH command reads the data and metadata for one or more graphs and writes the information to a zip file in the designated folder. If no options are specified, then a full backup is performed, including schema, data, template information, and user profiles.
NOTE: The export directory should be empty before running EXPORT GRAPH because all contents are zipped and compressed.
The current version exports ALL graphs in a MultiGraph system. A future version of EXPORT GRAPH will allow the user to select which graphs to export.
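A typical invocation from the Linux shell might look like the following sketch; the export path is an assumption, and (as noted above) the directory should be empty:

```shell
# Run a full export of all graphs by passing the GSQL command
# as a one-liner to the gsql shell.
gsql 'EXPORT GRAPH ALL TO "/home/tigergraph/export_dir"'
```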
The export contains four categories of files:
Data files in csv format, one file for each type of vertex and each type of edge.
GSQL DDL command files created by the export command. The import command uses these files to recreate the graph schema(s) and reload the data.
Copies of the database's queries, loading jobs, and user defined functions.
GSQL command files used to recreate the users and their privileges.
The following files are created in the specified directory when exporting and are then zipped into a single file called ExportedGraphs.zip.
If the file is password protected, it can only be unzipped using GSQL IMPORT. This security feature prevents users from directly unzipping it.
For each graph called <graphName> in a MultiGraph database, there will be the following files:
DBImportExport_<graphName>.gsql - Contains a series of GSQL DDL statements which do the following:
Create the exported graph, along with its local vertex, edge, and tuple types,
Create the loading jobs from the exported graphs
Create data source file objects
Create queries
graph_<graphName>/ - folder containing data for local vertex/edge types in <graphName>. For each vertex or edge type called <type>, there is one of the following two data files:
vertex_<type>.csv
edge_<type>.csv
Jobs used to restore vertex and edge types:
global.gsql - DDL to create all global vertex and edge types, and data sources.
tuple.gsql - DDL to create all User Defined Tuples.
Exported data and jobs used to restore the data:
GlobalTypes/ - folder containing data for global vertex/edge types
vertex_name.csv
edge_name.csv
run_loading_jobs.gsql - DDL created by the export command which will be used during import:
Temporary global schema change job to add user-defined indexes. This schema job is dropped after it has run.
Loading jobs to load data for global and local vertex/edges.
Database's saved queries, loading jobs, and schema change jobs.
SchemaChangeJob/ - folder containing DDL for schema change jobs. See section "Schema Change Jobs" for more information
Global_Schema_Change_Jobs.gsql contains all global schema change jobs
graphName_Schema_Change_Jobs.gsql contains schema change jobs for each graph "graphName"
Tokenbank.cpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/TokenBank/TokenBank.cpp
ExprFunctions.hpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
ExprUtil.hpp - copy of <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp
Users:
users.gsql - DDL to create all exported users and import Secrets and Tokens, and grant permissions.
If not enough disk space is available for the data to be exported, the system returns an error message indicating not all data has been exported. Some data may have already been written to disk. If an insufficient disk error occurs, the files will not be zipped, due to the possibility of corrupted data which would then corrupt the zip file. The user should clear enough disk space, including deleting the partially exported data, before reattempting the export.
It is possible for all the files to be written to disk and then to run out of disk space during the zip operation. If that is the case, the system will report this error. The unzipped files will be present in the specified export directory.
If the timeout is reached during export, the system returns an error message indicating that not all data has been exported. Some data may have already been written to disk. If a timeout error occurs, the files will not be zipped, due to the possibility of corrupted data which would then corrupt the zip file. The user should increase the timeout and then rerun the export.
The timeout limit is controlled by the session parameter export_timeout. The default timeout is ~138 hours. To change the timeout limit, use the command:
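As a sketch, the session parameter can be set from the gsql shell. The units are assumed here to be milliseconds (500,000,000 ms ≈ 138.9 hours, matching the stated default); verify the units for your version:

```shell
# Raise the export timeout; the value's unit (milliseconds) is an
# assumption consistent with the ~138-hour default mentioned above.
gsql 'SET export_timeout = 500000000'
```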
Available to the superuser role only.
The IMPORT GRAPH command unzips the file ExportedGraphs.zip located in the designated folder and then runs the GSQL command files within.
WARNING: IMPORT GRAPH looks for specific filenames. If either the zip file or any of its contents are renamed by the user, IMPORT GRAPH may fail.
WARNING: IMPORT GRAPH erases the current database (equivalent to running DROP ALL). The current version does not support incremental or supplemental changes to an existing database (except for the --keep-users option).
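A typical invocation might look like the following sketch; the import path is an assumption:

```shell
# Import a previously exported database. WARNING: this drops the
# current database first (equivalent to DROP ALL).
gsql 'IMPORT GRAPH ALL FROM "/home/tigergraph/export_dir"'
```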
There are two sets of loading jobs:
Those that were in the catalog of the database which was exported. These are embedded in the file DBImportExport_graphName.gsql
Those that are created by EXPORT GRAPH and are used to assist with the import process. These are embedded in the file run_loading_jobs.gsql.
The catalog loading jobs are not needed to restore the data. They are included for archival purposes.
Some special rules apply to importing loading jobs. Some catalog loading jobs will not be imported.
If a catalog loading job contains DEFINE FILENAME F = "/path/to/file/", the path will be removed, and the imported loading job will contain only DEFINE FILENAME F. This allows a loading job to be imported even though the file may no longer exist or the path may be different after moving to another TigerGraph instance.
If a specific file path is used directly in the LOAD statement and the file cannot be found, the loading job cannot be created and will be skipped. For example, LOAD "/path/to/file" to vertex v1 cannot be created if /path/to/file does not exist.
Any file path using $sys.data_root will be skipped. This is because the value of $sys.data_root is not retained from export. During import, $sys.data_root is set to the root folder of the import location.
There are two sets of schema change jobs:
Those that were in the catalog of the database which was exported. These are stored in the folder /SchemaChangeJobs.
Those that were created by EXPORT GRAPH and are used to assist with the import process. These are in the run_loading_jobs.gsql command file. The jobs are dropped after the import command is finished with them.
The database's schema change jobs are not executed during the import process. This is because if a schema change job had been run before the export, then the exported schema already reflects the result of the schema change job. The directory /SchemaChangeJobs contains these files:
Global_Schema_Change_Jobs.gsql contains all global schema change jobs
<graphName>_Schema_Change_Jobs.gsql contains schema change jobs for each graph <graphName>.
In v3.0, importing and exporting clusters is not fully automated. The database can be exported and imported by following some additional steps.
Rather than creating a single export zip file, export will create a file for each machine. Before exporting, specific folders must be created on each server using the following commands:
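As an illustrative sketch only (the host names, user, and export path are assumptions; substitute your cluster's servers and your chosen export directory), the same empty folder can be created on every server from one machine:

```shell
# Create the same empty export directory on every server in the cluster.
# Host names and the directory path are hypothetical placeholders.
for host in server1 server2 server3; do
  ssh tigergraph@"$host" 'mkdir -p /home/tigergraph/export_dir'
done
```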
Then run the export command on one server. The EXPORT command does not bundle all the files to one server, and it does not compress each server's files to one zip. Some files, including the data files, will be exported to each server, to the folders created above. Some files will be only on the local server where EXPORT GRAPH was run.
You may only import to a cluster that has the same number and configuration of servers as the cluster from which the export originated. Transfer the files from each export server to the corresponding import server. That is, copy the files from export_server_n:/path/to/export_directory to import_server_n:/path/to/import/directory.
2. Manually modify the loading jobs
On the main server, edit the run_loading_jobs.gsql files as follows.
Find the line(s) of the form:
LOAD "sys.data_root/.../<vertex_or_edge_type>.csv"
Close to it there should be a similar line, commented out, which has the "all:" data source directive:
#LOAD "all:sys.data_root/.../<vertex_or_edge_type>.csv"
See the example below:
Comment out the LOAD line and uncomment the LOAD all: line. Be sure that you do this for all data source files.
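After the edit, each pair of lines should look like the following sketch (vertex_person.csv is a hypothetical file name; the first line is now commented out and the "all:" line is active):

```
#LOAD "sys.data_root/GlobalTypes/vertex_person.csv" ...
LOAD "all:sys.data_root/GlobalTypes/vertex_person.csv" ...
```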
3. Run the IMPORT GRAPH command from the main server (e.g., the one that corresponds to the server where EXPORT GRAPH was run).
Managing TigerGraph Servers with gadmin
TigerGraph Graph Administrator (gadmin) is a tool for managing TigerGraph servers. It has a self-contained help function and a man page, whose output is shown below for reference. If you are unfamiliar with the TigerGraph servers, please see GET STARTED with TigerGraph.
To see a listing of all the options or commands available for gadmin, run any of the following commands:
After changing a configuration setting, it is generally necessary to run gadmin config apply. Some commands invoke config apply automatically. If you are not certain, just run gadmin config apply.
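For example, a typical change-and-apply sequence looks like this sketch (the configuration entry name is one illustrative assumption; use gadmin config list to see the entries on your system):

```shell
# Change a setting, apply the change, and restart the affected service.
gadmin config set RESTPP.Factory.DefaultQueryTimeoutSec 60
gadmin config apply -y
gadmin restart restpp -y
```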
Below is the man page for gadmin. Most of the commands are self-explanatory. Common examples are provided with each command.
NOTE: Some commands have changed in v3.0. In particular, gadmin set <config | license> has changed to gadmin <config | license> set.
Gadmin autocomplete is more of a feature than a command. You can press Tab when typing a command to either print out all possible entries or auto-complete the entry you are currently typing.
The example below shows the autocomplete behavior for the command gadmin status.
Gadmin config has many sub-entries as well; they are listed below.
Example: Change the retention size of the Kafka queue to 10GB:
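A hedged sketch of this change (the entry name Kafka.RetentionSizeGB is an assumption; verify it with gadmin config list Kafka on your version):

```shell
# Set the Kafka queue retention size to 10 GB, apply the change,
# and restart Kafka so it takes effect.
gadmin config set Kafka.RetentionSizeGB 10
gadmin config apply -y
gadmin restart kafka -y
```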
Show what configuration changes were made.
Discard the configuration changes without applying them.
Display all configuration entries.
Change a configuration entry.
Get the value of a specific configuration entry.
Configure entries for a specific service group, e.g., KAFKA, GPE, ZK.
Initialize your configuration.
List all configurable entries or entry groups.
Options for configuring your license.
Example flow of upgrading a license:
Once the license has been set and the config has been applied, you can run gadmin license status to view the details of your license, including the expiration date and time.
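The upgrade flow above can be sketched as follows; <license_key> is a placeholder for your actual key string:

```shell
# Upgrade a license: set the new key, apply the config change,
# then verify the expiration date and other details.
gadmin license set <license_key>
gadmin config apply -y
gadmin license status
```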
The gadmin log command will reveal the location of all commonly checked log files for the TigerGraph system.
The gadmin restart command is used to restart one, many, or all TigerGraph services. You will need to confirm the restarting of services by entering y (yes) or n (no). To bypass this prompt, you can use the -y flag to force confirmation.
The gadmin start command can be used to start one, many, or all services.
Check the status of TigerGraph component servers:
Use gadmin status to report whether each of the main component servers is running (up) or stopped (off). The example below shows the normal status when the graph store is empty and a graph schema has not been defined:
You can also check the status of each instance using the verbose flag: gadmin status -v or gadmin status --verbose. This will show each machine's status. See the example below.
Here are the most common service and process status states you might see from running the gadmin status command:
Online - The service is online and ready.
Warmup - The service is processing the graph information and will be online soon.
Stopping - The service has received a stop command and will be down soon.
Offline - The service is not available.
Down - The service has been stopped or crashed.
StatusUnknown - The valid status of the service is not tracked.
Init - Process is initializing and will be in the running state soon.
Running - The process is running and available.
Zombie - There is a leftover process from a previous instance.
Stopped - The process has been stopped or crashed.
StatusUnknown - The valid status of the process is not tracked.
The gadmin stop command can be used to stop one, many, or all TigerGraph services. You will need to confirm the stopping of services by entering y (yes) or n (no). To bypass this prompt, you can use the -y flag to force confirmation.
TigerGraph offers two levels of memory thresholds using the following configuration settings:
SysAlertFreePct and SysMinFreePct
The SysAlertFreePct setting indicates the threshold at which the system starts throttling queries, to allow long-running queries to finish and release memory.
The SysMinFreePct setting indicates a critical threshold at which queries start being aborted automatically, to prevent a GPE crash and preserve system stability.
By default, SysMinFreePct is set at 10%, at which point Queries will be aborted.
Example:
SysAlertFreePct=30 means that when system memory consumption exceeds 70%, the system enters the alert state and graph updates start to slow down.
SysMinFreePct=20 means 20% of memory is required to be free; when memory consumption enters the critical state (over 80% memory consumption), queries are aborted automatically.
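As a sketch of how these thresholds might be changed: the exact configuration entry varies by version, and GPE.BasicConfig.Env is an assumption based on how GPE environment variables are typically configured; verify with gadmin config list GPE before using it:

```shell
# Set the memory thresholds in the GPE environment string,
# then apply and restart GPE for the change to take effect.
# NOTE: the entry name and value format are assumptions.
gadmin config set GPE.BasicConfig.Env "SysAlertFreePct=30;SysMinFreePct=20;"
gadmin config apply -y
gadmin restart gpe -y
```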