System & Language Basics

Language Basics

  • Identifiers: Identifiers are user-defined names. An identifier consists of letters, digits, and the underscore. Identifiers may not begin with a digit. Identifiers are case sensitive.

  • Keywords and Reserved Words: Keywords are words with a predefined semantic meaning in the language. Keywords are not case sensitive. Reserved words are set aside for use by the language, either now or in the future. Reserved words may not be reused as user-defined identifiers. In most cases, a keyword is also a reserved word. For example, VERTEX is a keyword. It is also a reserved word, so VERTEX may not be used as an identifier.

  • Statements: Each line corresponds to one statement (except in multi-line mode). Usually, there is no punctuation at the end of a top-level statement. Some statements, such as CREATE LOADING JOB, are block statements which enclose a set of statements within themselves. Some punctuation may be needed to separate the statements within a block.

  • Comments: Within a command file, comments are text that is ignored by the language interpreter. Single-line comments begin with either # or //. A comment may be on the same line as interpreted code. Text to the left of the comment marker is interpreted, and text to the right of the marker is ignored. Multi-line comment blocks begin with /* and end with */.
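The identifier and reserved-word rules above can be sketched as a small check. This is only an illustration in Python, not part of GSQL: it assumes that "letters" means ASCII letters, that reserved words are matched without regard to case (as keywords are), and it uses a one-word sample set because the full reserved-word list is not given here. The names IDENTIFIER_RE, SAMPLE_RESERVED_WORDS, and is_valid_identifier are hypothetical.

```python
import re

# Letters, digits, and underscore; the first character may not be a digit.
# Assumption: "letters" is read as ASCII letters only.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Illustrative subset only; the real reserved-word list is much longer.
SAMPLE_RESERVED_WORDS = {"VERTEX"}


def is_valid_identifier(name: str) -> bool:
    """Return True if name satisfies the identifier rules sketched above."""
    if IDENTIFIER_RE.fullmatch(name) is None:
        return False
    # Assumption: reserved words are compared case-insensitively.
    return name.upper() not in SAMPLE_RESERVED_WORDS


print(is_valid_identifier("member_id"))  # well-formed name
print(is_valid_identifier("2fast"))      # begins with a digit
print(is_valid_identifier("vertex"))     # collides with the sample reserved word
print("Member" == "member")              # identifiers are case sensitive
```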

Running GSQL

To enter the GSQL shell and work in interactive mode, type gsql from an operating system shell prompt. A user name, password, and a graph name may also be provided on the command line.

GSQL command syntax for entering interactive mode
gsql [-u username] [-p password] [-g gname]

If a user name is provided but not a password, the GSQL system will then ask for the user's password:

Login example with user name
os$ gsql -u victor
Password for victor : ***
GSQL >

If a user name is not given, then GSQL will assume that you are attempting to log in as the default tigergraph user:

Login example without user name
os$ gsql
Password for tigergraph : *****
GSQL >

To exit the GSQL shell, type either exit or quit at the GSQL prompt:

GSQL> EXIT or GSQL> QUIT

GSQL Client Session Timeout

For added security, you can configure your GSQL client session to time out automatically after a period of inactivity. Set the bash environment variable GSQL_CLIENT_IDLE_TIMEOUT_SEC to <num_sec>, the idle timeout in seconds. The idle timeout is then applied every time you start a GSQL session. To disable the timeout, omit <num_sec> (that is, set the variable to an empty value). The default setting is no timeout. The example below uses the Linux export command to set the environment variable:

tigergraph@123:~$ export GSQL_CLIENT_IDLE_TIMEOUT_SEC=10
tigergraph@123:~$ gsql
Welcome to TigerGraph Developer Edition, free for non-production, research, or educational use.
GSQL-Dev > Session timeout after 10 seconds idle.
tigergraph@123:~$ export GSQL_CLIENT_IDLE_TIMEOUT_SEC=
tigergraph@123:~$ gsql
Welcome to TigerGraph Developer Edition, free for non-production, research, or educational use.
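As a rough sketch of the rule above, the following Python function shows how a client could interpret the variable: a positive number of seconds enables the timeout, while a missing or empty value means no timeout. It is an illustration only and does not describe how the GSQL client is actually implemented; the function name idle_timeout_seconds is hypothetical.

```python
import os
from typing import Mapping, Optional


def idle_timeout_seconds(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the idle timeout in seconds, or None when no timeout applies."""
    raw = env.get("GSQL_CLIENT_IDLE_TIMEOUT_SEC", "").strip()
    if not raw:
        # Unset or empty: the default behavior, no timeout.
        return None
    return int(raw)


print(idle_timeout_seconds({"GSQL_CLIENT_IDLE_TIMEOUT_SEC": "10"}))
print(idle_timeout_seconds({"GSQL_CLIENT_IDLE_TIMEOUT_SEC": ""}))
```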

Multiple Shell Sessions

Multiple shell sessions of GSQL may be run at the same time. This feature can be used to have multiple clients (human or machine) using the system to perform concurrent operations. A basic locking scheme is used to maintain isolation and consistency.

Multi-line Mode - BEGIN, END, ABORT

In interactive mode, the default behavior is to treat each line as one statement; the GSQL interpreter will activate as soon as the End-Of-Line character is entered.

Multi-line mode allows the user to enter several lines of text without triggering immediate execution. This is useful when a statement is very long and the user would like to split it into multiple lines. It is also useful when defining a JOB, because jobs typically contain multiple statements.

To enter multi-line mode, use the command BEGIN. The end-of-line character is now disabled from triggering execution. The shell remains in multi-line mode until the command END is entered. The END command also triggers the execution of the multi-line block. In the example below, BEGIN and END are used to allow the SELECT statement to be split into several lines:

Example: BEGIN and END defining a multi-line block
BEGIN
SELECT member_id, last_name, first_name, date_joined, status
FROM Member
WHERE age >= 21
ORDER BY last_name, first_name
END

Alternatively, the ABORT command exits multi-line mode and discards the multi-line block.
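The behavior of BEGIN, END, and ABORT can be pictured as a small state machine. The Python sketch below is a toy model under stated assumptions, not the GSQL shell's implementation: it treats the three commands as case-insensitive, ignores blank lines outside a block, and simply returns the statements that would be executed. The function name collect_statements is hypothetical.

```python
from typing import Iterable, List


def collect_statements(lines: Iterable[str]) -> List[str]:
    """Return the statements a shell would execute for the given input lines."""
    executed: List[str] = []
    block = None  # None means single-line mode; a list means multi-line mode
    for line in lines:
        command = line.strip().upper()
        if block is None:
            if command == "BEGIN":
                block = []  # enter multi-line mode
            elif line.strip():
                executed.append(line.strip())  # each line is one statement
        elif command == "END":
            executed.append("\n".join(block))  # execute the whole block
            block = None
        elif command == "ABORT":
            block = None  # discard the block
        else:
            block.append(line)
    return executed


session = ["ls", "BEGIN", "SELECT member_id", "FROM Member", "END",
           "BEGIN", "SELECT last_name", "ABORT"]
print(collect_statements(session))
```

In this model the aborted block never reaches the list of executed statements, which mirrors the description of ABORT above.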

Command Files and Inline Commands

A command file is a text file containing a series of GSQL statements. Blank lines and comments are ignored. By convention, GSQL command files end with the suffix .gsql, but this is not a requirement. Command files are automatically treated as multi-line mode, so BEGIN and END statements are not needed. Command files may be run either from within the GSQL shell by prefixing the filename with an @ symbol:

GSQL> @file.gsql

or from the operating system (i.e., a Linux shell) by giving the filename as the argument after gsql:

os$ gsql file.gsql

Similarly, a single GSQL command can be run from the operating system shell by enclosing the command string in quotation marks and placing it at the end of the gsql command line. Either single or double quotation marks may be used. It is recommended to use single quotation marks to enclose the entire command and double quotation marks to enclose any strings within the command.

Login example with inline command or command file
gsql [-u username] [-g graphname] ['command_string' | command_file]

In the example below, the file name_query.gsql contains the multi-line CREATE QUERY block to define the query namesSimilar.

Example using command files and inline commands
os$ gsql name_query.gsql
os$ gsql 'INSTALL QUERY namesSimilar'
os$ gsql 'RUN QUERY namesSimilar (0,"michael","jackson",100)'
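When such command lines are generated by a script, the quoting recommendation above (single quotes around the whole command, double quotes inside it) can be produced mechanically. The Python sketch below uses the standard library's shlex.join, which single-quotes any argument containing spaces or double quotes. The helper name inline_gsql_command is hypothetical, and the sketch only builds the command text; it does not run gsql.

```python
import shlex


def inline_gsql_command(statement: str) -> str:
    """Build an OS shell command line that passes one statement to gsql."""
    # shlex.join quotes each argument so a POSIX shell reads it as one word.
    return shlex.join(["gsql", statement])


print(inline_gsql_command("INSTALL QUERY namesSimilar"))
print(inline_gsql_command('RUN QUERY namesSimilar (0,"michael","jackson",100)'))
```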

Help and Information

The help command displays a summary of the available GSQL commands:

GSQL> HELP [BASIC|QUERY]

Note that the HELP command has options for showing more details about certain categories of commands.

The ls command displays the catalog: all the vertex types, edge types, graphs, queries, jobs, and session parameters which have been defined by the user.

--reset option

The --reset option will clear the entire graph data store and erase all related definitions (graph schema, loading jobs, and queries) from the Dictionary. The data deletion cannot be undone; use with extreme caution. The REST++, GPE, and GSE modules will be turned off.

$ gsql --reset
Resetting the catalog.
Shutdown restpp gse gpe ...
Graph store /home/tigergraph/tigergraph/gstore/0/ has been cleared!
The catalog was reset and the graph store was cleared.

Summary

The table below summarizes the basic system commands introduced so far.

Command                     Description
HELP [BASIC|QUERY]          Display the help menu for all or a subset of the commands.
LS                          Display the catalog, which records all the vertex types, edge types, graphs, queries, jobs, and session parameters that have been defined for the current active graph. See notes below concerning graph- and role-dependent visibility of the catalog.
BEGIN                       Enter multi-line edit mode (only for console mode within the shell).
END                         Finish multi-line edit mode and execute the multi-line block.
ABORT                       Abort multi-line edit mode and discard the multi-line block.
@file.gsql                  Run the gsql statements in the command file file.gsql from within the GSQL shell.
os$ gsql file.gsql          Run the gsql statements in the command file file.gsql from an operating system shell.
os$ gsql 'command_string'   Run a single gsql statement from the operating system shell.
os$ gsql --reset            Clear the graph store and erase the dictionary.

Notes on the LS command

Starting with v1.2, the output of the LS command is sensitive to the user and the active graph:

  1. If the user has not set an active graph or specified "USE GLOBAL":

    1. If the user is a superuser, then LS displays global vertices, global edges, and all graph schemas.

    2. If the user is not a superuser, then LS displays nothing (null).

  2. If the user has set an active graph, then LS displays the schema, jobs, queries, and other definitions for that particular graph.

Session Parameters

Session parameters are built-in system variables whose values are valid during the current session; their values do not endure after the session ends. In interactive command mode, a session starts and ends when entering and exiting interactive mode, respectively. When running a command file, the session lasts during the execution of the command file.

Use the SET command to set the value of a session parameter:

SET session_parameter = value

Session Parameter

Meaning and Usage

sys.data_root

The value should be a string, representing the absolute or relative path to the folder where data files are stored. After the parameter has been set, a loading statement can reference this parameter with $sys.data_root.

gsql_src_dir

The value should be a string, representing the absolute or relative path to the root folder for the gsql system installation. After the parameter has been set, a loading statement can reference this parameter with $gsql_src_dir.

exit_on_error

When this parameter is true (default), if a semantic error occurs while running a GSQL command file, the GSQL shell will terminate. Accepted parameter values: true, false (case insensitive). If the parameter is set to false, then a command file which is syntactically correct will continue running, even if certain runtime errors in individual commands occur. Specifically, this affects these commands:

  • CREATE

  • INSTALL QUERY

  • RUN JOB

Semantic errors include a reference to a nonexistent entity or an improper reuse of an entity.

This session parameter does not affect GSQL interactive mode; GSQL interactive mode does not exit on any error.

This session parameter does not affect syntactic errors: GSQL will always exit on a syntactic error.

Example of exit_on_error = FALSE
# exitOnError.gsql
SET exit_on_error = FALSE
CREATE VERTEX v(PRIMARY_ID id INT, name STRING)
CREATE VERTEX v(PRIMARY_ID id INT, weight FLOAT) #error 1: can't define VERTEX v
CREATE UNDIRECTED EDGE e2 (FROM u, TO v) #error 2: vertex type u doesn't exist
CREATE UNDIRECTED EDGE e1 (FROM v, TO v)
CREATE GRAPH g(v) #error 3: graph definition has no edge type
CREATE GRAPH g2(*)
Results
os$ gsql exitOnError.gsql
The vertex type v is created.
Semantic Check Fails: The vertex name v is used by another object! Please use a different name.
failed to create the vertex type v
Semantic Check Fails: FROM or TO vertex type does not exist!
failed to create the edge type e2
The edge type e1 is created.
Semantic Check Fails: There is no edge type specified! Please specify at least one edge type!
The graph g could not be created!
Restarting gse gpe restpp ...
Finish restarting services in 11.955 seconds!
The graph g2 is created.
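The effect of exit_on_error on a command file can be summarized with a toy model. The Python sketch below is an assumption-laden illustration, not GSQL's implementation: each command is represented only by a name and a flag saying whether it raises a semantic error, and the hypothetical function run_command_file returns the commands that were attempted.

```python
from typing import List, Tuple


def run_command_file(commands: List[Tuple[str, bool]],
                     exit_on_error: bool = True) -> List[str]:
    """Return the names of the commands attempted before the run stops."""
    attempted: List[str] = []
    for name, has_semantic_error in commands:
        attempted.append(name)
        if has_semantic_error and exit_on_error:
            break  # default behavior: stop at the first semantic error
    return attempted


# (command, raises a semantic error?) loosely following the example above
commands = [
    ("CREATE VERTEX v", False),
    ("CREATE VERTEX v (duplicate)", True),
    ("CREATE UNDIRECTED EDGE e1", False),
]
print(run_command_file(commands))                       # stops after the error
print(run_command_file(commands, exit_on_error=False))  # continues to the end
```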

Attribute Data Types

Each attribute of a vertex or edge has an assigned data type. The following types are currently supported.

Primitive Types

| name | default value | valid input format (regex) | Range and Precision | description |
|---|---|---|---|---|
| INT | 0 | [-+]?[0-9]+ | from -2^63 to +2^63 - 1 (-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807) | 8-byte signed integer |
| UINT | 0 | [0-9]+ | from 0 to 2^64 - 1 (18,446,744,073,709,551,615) | 8-byte unsigned integer |
| FLOAT | 0.0 | [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? | +/- 3.4 E +/-38, ~7 digits of precision | 4-byte single-precision floating point number. Examples: 3.14159, .0065e14, 7E23. See note below. |
| DOUBLE | 0.0 | [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? | +/- 1.7 E +/-308, ~15 digits of precision | 8-byte double-precision floating point number. Has the same input and output format as FLOAT, but the range and precision are greater. See note below. |
| BOOL | false | "true", "false" (case insensitive), 1, 0 | true, false | boolean true and false, represented within GSQL as true and false, and represented in input and output as 1 and 0 |
| STRING | empty string | .* | UTF-8 | character string. The string value can optionally be enclosed by single quote marks or double quote marks. Please see the QUOTE parameter in Section "Other Optional LOAD Clauses". |

For FLOAT and DOUBLE values, the GSQL Loader supports exponential notation as shown (e.g., 1.25E-7).

The GSQL Query Language currently only reads values without exponents. It may display output values with exponential notation, however.

Some numeric expressions may return a non-numeric string result, such as "inf" for Infinity or "NaN" for Not a Number.
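To see what the input-format patterns for the numeric types accept, they can be tried out directly. The Python sketch below copies the INT, UINT, and FLOAT/DOUBLE patterns from the table (written without whitespace) and checks whole-string matches. It only tests the textual format, not the numeric range, and the names INPUT_FORMATS and matches_input_format are hypothetical.

```python
import re

# Patterns copied from the primitive-type table, written without whitespace.
INPUT_FORMATS = {
    "INT": re.compile(r"[-+]?[0-9]+"),
    "UINT": re.compile(r"[0-9]+"),
    "FLOAT": re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?"),
}
# DOUBLE has the same input format as FLOAT.
INPUT_FORMATS["DOUBLE"] = INPUT_FORMATS["FLOAT"]


def matches_input_format(type_name: str, text: str) -> bool:
    """Return True if the whole text matches the pattern for type_name."""
    return INPUT_FORMATS[type_name].fullmatch(text) is not None


for sample in ["3.14159", ".0065e14", "7E23", "1.25 E-7"]:
    print(sample, matches_input_format("FLOAT", sample))
```

Note that a value with a space before the exponent does not match the pattern, so exponents should be written without embedded spaces.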

Advanced Types

| name | default value | supported data format | Range and Precision | description |
|---|---|---|---|---|
| STRING COMPRESS (suitable only in limited circumstances) | empty string | .* | UTF-8 | string with a finite set of categorical values. More compact storage than STRING, if there is a limited number of different values and the value is rarely accessed. Otherwise, it may use more memory. |
| DATETIME | UTC time 0 | see Section "Loading DATETIME Attribute" | 1582-10-15 00:00:00 to 9999-12-31 23:59:59 | date and time (UTC) as the number of seconds elapsed since the start of Jan 1, 1970. Time zones are not supported. Displayed in YYYY-MM-DD hh:mm:ss format. |
| FIXED_BINARY(n) | N/A | N/A | | stream of n binary-encoded bytes |
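The DATETIME description (seconds elapsed since the start of Jan 1, 1970, displayed as YYYY-MM-DD hh:mm:ss) can be illustrated with a short conversion. This Python sketch is only a model of that relationship, not GSQL code; it ignores time zones, as the table states, and the function name format_datetime is hypothetical.

```python
from datetime import datetime, timedelta

# "UTC time 0": the start of Jan 1, 1970.
EPOCH = datetime(1970, 1, 1)


def format_datetime(seconds_since_epoch: int) -> str:
    """Render a seconds-since-1970 count in YYYY-MM-DD hh:mm:ss form."""
    moment = EPOCH + timedelta(seconds=seconds_since_epoch)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(format_datetime(0))      # the default value, UTC time 0
print(format_datetime(86400))  # exactly one day later
print(format_datetime(-1))     # negative counts fall before 1970
```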

Additionally, GSQL supports the following complex data types:

Collection Types

  • LIST/SET: A set is an unordered collection of unique elements of the same type; a list is an ordered collection of elements of the same type. A list can contain duplicate elements; a set cannot. The default value of either is an empty list/set. The supported element types of a list or a set are INT, UINT, DOUBLE, FLOAT, STRING, STRING COMPRESS, DATETIME, and UDT. To declare a list or set type, use <> brackets to enclose the element type, e.g., SET<INT>, LIST<STRING COMPRESS>.

Due to multithreaded GSQL loading, the initial order of elements loaded into a LIST might be different than the order in which they appeared in the input data.

  • MAP: A map is a collection of key-value pairs. It cannot contain duplicate keys, and each key maps to one value. The default value is an empty map. The supported key types are INT, STRING, STRING COMPRESS, and DATETIME. The supported value types are INT, DOUBLE, STRING, STRING COMPRESS, DATETIME, and UDT. To declare a map type, use <> to enclose the types, with a comma to separate the key and value types, e.g., MAP<INT, DOUBLE>.
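For readers more familiar with general-purpose languages, the distinctions above resemble Python's built-in list, set, and dict. The sketch below is an analogy only: it shows Python semantics, which may differ in detail from GSQL's LIST, SET, and MAP (for example, in how a repeated key is handled during loading).

```python
values = [3, 1, 3]

as_list = list(values)  # ordered, duplicates kept (like LIST<INT>)
as_set = set(values)    # unordered, duplicates removed (like SET<INT>)

scores = {1: 0.5}       # key-value pairs (like MAP<INT, DOUBLE>)
scores[1] = 0.75        # in Python, reusing a key replaces its value

print(len(as_list), len(as_set), len(scores))
```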

TYPEDEF TUPLE

A User Defined Tuple (UDT) represents an ordered structure of several fields of the same or different types. The supported field types are listed below. Each field in a UDT has a fixed size. A STRING field must be given a size in characters, and the loader will only load the first given number of characters. An INT or UINT field can optionally be given a size in bytes.

TYPEDEF TUPLE syntax
TYPEDEF TUPLE "<" fieldName fieldType ["(" fieldSize ")"]
( "," fieldName fieldType ["(" fieldSize ")"] )* ">" tupleName

| Field Type | User-specified size? | Size Choices (in bytes, except STRING) | Range (N is size) |
|---|---|---|---|
| INT | optional | 1, 2, 4 (default), 8 | -2^(N*8 - 1) to 2^(N*8 - 1) - 1 |
| UINT | optional | 1, 2, 4 (default), 8 | 0 to 2^(N*8) - 1 |
| FLOAT | no | | same as FLOAT attribute |
| DOUBLE | no | | same as DOUBLE attribute |
| DATETIME | no | | same as DATETIME attribute |
| BOOL | no | | true, false |
| STRING | required | any number of characters | any string of up to N characters |
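The range of a sized integer field follows from its width: an N-byte signed INT spans -2^(N*8 - 1) to 2^(N*8 - 1) - 1, and an N-byte unsigned UINT spans 0 to 2^(N*8) - 1. The Python sketch below works these out for the allowed sizes. It is plain arithmetic rather than GSQL code, and the function names int_range and uint_range are hypothetical.

```python
ALLOWED_SIZES = (1, 2, 4, 8)  # size choices in bytes for INT and UINT fields


def int_range(size_bytes: int = 4):
    """Return (min, max) for a signed integer field of the given size."""
    if size_bytes not in ALLOWED_SIZES:
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    bits = size_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def uint_range(size_bytes: int = 4):
    """Return (min, max) for an unsigned integer field of the given size."""
    if size_bytes not in ALLOWED_SIZES:
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    return 0, 2 ** (size_bytes * 8) - 1


for size in ALLOWED_SIZES:
    print(size, int_range(size), uint_range(size))
```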

To define a UDT, use the TYPEDEF TUPLE statement, before using the UDT as a field in a vertex type or edge type. Below is an example of a TYPEDEF TUPLE statement.

Example: UDT definition
TYPEDEF TUPLE <field1 INT (1), field2 UINT, field3 STRING (10), field4 DOUBLE > myTuple

In this example, myTuple is the name of this UDT. It contains four fields: a 1-byte INT field named field1, a 4-byte UINT field named field2, a 10-character STRING field named field3, and an 8-byte DOUBLE field named field4.