Declaration and Assignment Statements
In GSQL, different types of variables and objects follow different rules when it comes to variable declaration and assignment. This section discusses the different types of declaration and assignment statements and covers the following subset of the EBNF syntax:
## Declarations ##
accumDeclStmt :=
accumType localAccumName ["=" constant]
["," localAccumName ["=" constant]]*
| accumType globalAccumName ["=" constant]
["," globalAccumName ["=" constant]]*
localAccumName := "@"accumName;
globalAccumName := "@@"accumName;
baseDeclStmt := baseType name ["=" constant] ["," name ["=" constant]]*
fileDeclStmt := FILE fileVar "(" filePath ")"
fileVar := name
localVarDeclStmt := baseType varName "=" expr
vSetVarDeclStmt := vertexSetName ["(" vertexType ")"] "=" (seedSet | simpleSet | selectBlock)
simpleSet := vertexSetName
| "(" simpleSet ")"
| simpleSet (UNION | INTERSECT | MINUS) simpleSet
seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
| ANY
| vertexSetName
| globalAccumName
| vertexType ".*"
| paramName
| "SelectVertex" selectVertParams
selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]
columnId := "$" (integer | stringLiteral)
## Assignment Statements ##
assignStmt := name "=" expr
| name "." attrName "=" expr
attrAccumStmt := name "." attrName "+=" expr
lAccumAssignStmt := vertexAlias "." localAccumName ("+="| "=") expr
gAccumAssignStmt := globalAccumName ("+=" | "=") expr
loadAccumStmt := globalAccumName "=" "{" LOADACCUM loadAccumParams
["," LOADACCUM loadAccumParams]* "}"
loadAccumParams := "(" filePath "," columnId ["," columnId]* ","
stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]
## Function Call Statement ##
funcCallStmt := name ["<" type ["," type]* ">"] "(" [argList] ")"
| globalAccumName ("." funcName "(" [argList] ")")+
| "reset_collection_accum" "(" accumName ")"
argList := expr ["," expr]*
Variable scopes
Different types of variable declarations use different scoping rules. There are two types of scoping rules in a GSQL query:
Block scoping
In GSQL, curly brackets, as well as IF ... THEN, ELSE, WHILE ... DO, and FOREACH ... DO statements, create a block. A SELECT statement also creates a block. A block-scoped variable declared inside a block is accessible only inside that block.
Additionally, a variable declared in a lower scope can use the same name as a variable already declared in a higher scope. The lower-scope declaration takes precedence over the higher-scope declaration until the end of the lower scope.
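As a rough illustration of the shadowing rule, the sketch below declares x at the query level and again inside an IF block. The query name, variable names, and print labels are hypothetical, and the sketch assumes the block-scoping rules described above apply as stated to base type variables.

```gsql
CREATE QUERY block_scope_sketch() {
  INT x = 1;             // declared in the query-level block
  IF x == 1 THEN
    INT x = 2;           // lower-scope declaration that reuses the name x
    PRINT x AS inner_x;  // refers to the lower-scope x
  END;
  PRINT x AS outer_x;    // the IF block has ended, so this is the query-level x
}
```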
The following types of variables use block scoping:
Global scoping
A global-scoped variable is accessible anywhere in the query once it has been declared, regardless of where it is declared. You also cannot declare another variable with the same name as a global-scoped variable that has already been declared.
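The sketch below is one way to picture global scoping. It assumes that vertex set variables are global-scoped, as stated in the vertex set section of this page, and reuses the Social_Net graph and Person vertex type from the other examples. The query and variable names are hypothetical.

```gsql
CREATE QUERY global_scope_sketch() FOR GRAPH Social_Net {
  INT i = 0;
  WHILE i < 1 DO
    Seeds = {Person.*};  // vertex set variable declared inside the WHILE block
    i = i + 1;
  END;
  PRINT Seeds;           // still accessible here because the variable is global-scoped
}
```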
The following types of variables use global scoping:
Declaration Statements
There are six types of variable declarations in a GSQL query:
- Accumulator
- Base type variable
- Local base type variable
- Vertex set
- File object
- Vertex or edge aliases
The first five types each have their own declaration statement syntax and are covered in this section. Aliases are declared implicitly in a SELECT statement.
Accumulators
Accumulator declaration is discussed in Accumulators.
Base type variables
In a GSQL query body, variables holding values of types INT, UINT, FLOAT, DOUBLE, BOOL, STRING, DATETIME, VERTEX, EDGE, JSONOBJECT, and JSONARRAY are called base type variables. The scope of a base type variable is from the point of declaration until the end of the block where its declaration took place.
baseVarDeclStmt := baseType name ["=" expr]["," name ["=" expr]]*
A base type variable can be declared and accessed anywhere in the query.
To declare a base type variable, specify the data type and the variable name.
Optionally, you can initialize the variable by assigning it a value with the assignment operator (=) and the desired value on the right side.
You can declare multiple variables of the same type in a single declaration statement.
CREATE QUERY base_type_variable() {
  STRING a;
  DOUBLE num1, num2 = 3.2;
  INT year = 2020, month = 12, day = 115;
  INT b = rand(5);
  PRINT a, b, num1;
}
When a base type variable is assigned a new value in an ACCUM or POST-ACCUM clause, the change will not take place until exiting the clause.
Therefore, if there are multiple assignment statements for the same base type variable in an ACCUM or POST-ACCUM clause, only the last one will take effect.
For example, in the following query, a base type variable is assigned a new value in the ACCUM clause, but the change will not take place until the clause ends.
Therefore, the accumulator will not receive the new value and will hold a value of 0 at the end of the query.
CREATE QUERY base_type_variable() FOR GRAPH Social_Net {
  MaxAccum<INT> @@max_date_glob;
  DATETIME dt;
  all_user = {Person.*};
  all_user = SELECT src
             FROM all_user:src - (Liked>:e) - Post
             ACCUM
               dt = e.action_time,  // dt isn't updated yet
               @@max_date_glob += datetime_to_epoch(dt);
  PRINT @@max_date_glob, dt;  // @@max_date_glob will be 0
}
Local base type variables
Base type variables declared in a DML-sub statement, such as in a statement inside an ACCUM, POST-ACCUM, or UPDATE SET clause, are called local base type variables.
Local base type variables are block-scoped and are accessible in the block where they are declared only. Within a local base type variable’s scope, you cannot declare another local base type variable, local container variable, or local tuple variable with the same name at the same level. However, you can declare a local base type variable or local container variable with the same name at a lower level, where the lower-level declaration will take precedence over the previous declaration.
In a POST-ACCUM clause, each local base type variable may only be used in source vertex statements or only in target vertex statements, not both.
localVarDeclStmt := baseType varName "=" expr
Local base type variables are not subject to the assignment restrictions of regular base type variables. Their values can be updated inside an ACCUM or POST-ACCUM clause and the change takes place immediately.
Example:
CREATE QUERY local_variable(VERTEX<Person> m1) FOR GRAPH Social_Net {
  // An example showing a local base type variable succeeds
  // where a base type variable fails
  MaxAccum<INT> @@max_date, @@max_date_glob;
  DATETIME dt_glob;
  all_user = {Person.*};
  all_user = SELECT src
             FROM all_user:src - (Liked>:e) - Post
             ACCUM
               DATETIME dt = e.action_time,  // Declare and assign local dt
               dt_glob = e.action_time,      // dt_glob isn't updated yet
               @@max_date += datetime_to_epoch(dt),
               @@max_date_glob += datetime_to_epoch(dt_glob);
  PRINT @@max_date, @@max_date_glob, dt_glob;  // @@max_date_glob will be 0
}
GSQL > RUN QUERY local_variable("person1")
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [{
"@@max_date": 1263618953,
"dt_glob": "2010-01-16 05:15:53",
"@@max_date_glob": 0
}]
}
Local container variable
Variables declared inside a DML-block storing container type values are called local container variables.
Their values can be updated inside an ACCUM or POST-ACCUM clause and the change takes place immediately.
Local container variables can store values of a specified type. The following types are allowed:
Container type | Element type |
---|---|
|
|
|
|
|
|
|
|
When you declare a container variable, you must specify the type of the elements it will store.
localContainerDeclStmt := containerType "<" type ">" varName "=" expr
SET<INT> set1 = (1, 2, 3) (1)
1 | The declaration can only take place in a DML block. |
Local container variables are block-scoped and are accessible in the block where they are declared only. Within a local container variable’s scope, you cannot declare another local container variable, local tuple variable, or local base type variable with the same name at the same level. However, you can declare a variable with the same name at a lower level, where the lower-level declaration will override the previous declaration.
In a POST-ACCUM clause, each local container variable may only be used in source vertex statements or only in target vertex statements, not both.
Query example
In the following example, the SELECT statement in the main query declares three local container variables, each containing:
- A base type
- A user-defined tuple
- An anonymous tuple
CREATE QUERY test() FOR GRAPH POC_Graph {
  TYPEDEF TUPLE<INT i, STRING s> Main_Tuple;
  SumAccum<INT> @@A;
  SetAccum<Main_Tuple> @@set_acc;
  SetAccum<Main_Tuple> @@set_acc2;
  L0 = { Person.* };
  L1 = SELECT p
       FROM L0:p
       ACCUM
         // Local container with base type
         List<INT> a = sub_query1(p),
         FOREACH e IN a DO
           @@A += e
         END,
         // User-defined tuple
         Set<Main_Tuple> set_A = sub_query2(p),
         @@set_acc += set_A,
         // Anonymous tuple (the tuple signature is defined in the declaration)
         Set<TUPLE<INT, STRING>> set_B = sub_query2(p),
         @@set_acc2 += set_B;
  PRINT @@A;
  PRINT @@set_acc;
  PRINT @@set_acc2;
}
CREATE QUERY sub_query1 (VERTEX node) FOR GRAPH POC_Graph RETURNS (ListAccum<INT>) {
  ListAccum<INT> @@res;
  Start = { node };
  Result = SELECT t FROM Start:t ACCUM @@res += 1;
  RETURN @@res;
}
CREATE QUERY sub_query2 (VERTEX node) FOR GRAPH POC_Graph RETURNS (SetAccum<TUPLE<INT, STRING>>) {
  TYPEDEF TUPLE<INT i, STRING s> Sub_Tuple;
  SetAccum<Sub_Tuple> @@res;
  v_set = { Person.* };
  Result1 = SELECT p FROM v_set:p
            WHERE p.name == "Charlie"
            ACCUM @@res += Sub_Tuple(-1, "hello");
  RETURN @@res;
}
Local tuple variable
Variables declared inside a DML-block storing tuple values are called local tuple variables.
The value of a local tuple variable is assigned at declaration.
The value of a local tuple variable can be updated inside an ACCUM or POST-ACCUM clause and the change takes place immediately.
Local tuple variables are block-scoped and are accessible in the block where they are declared only. Within a local tuple variable’s scope, you cannot declare another local tuple variable, local container variable, or local base type variable with the same name at the same level. However, you can declare a variable with the same name at a lower level, where the lower-level declaration will override the previous declaration.
You can declare tuple variables of defined types and anonymous types.
Example
CREATE QUERY test_udf() FOR GRAPH POC_Graph {
  TYPEDEF TUPLE<INT i, STRING s> Main_Tuple;
  SetAccum<Main_Tuple> @@set_acc;
  SetAccum<Main_Tuple> @@set_acc2;
  L0 = { Person.* };
  L1 = SELECT p
       FROM L0:p
       WHERE p.name == "Charlie"
       ACCUM
         Main_Tuple a = Main_Tuple(1, "well"), (1)
         @@set_acc += a,
         TUPLE<INT, STRING> b = Main_Tuple(2, "good"), (2)
         @@set_acc2 += b;
  PRINT @@set_acc;
  PRINT @@set_acc2;
}
1 | This statement defines a local tuple variable a with a defined tuple type. |
2 | This statement defines a local tuple variable b with an anonymous tuple type.
Besides using another tuple type, you can also assign the value of a local tuple variable from an anonymous tuple returned by a subquery. |
Vertex set variables
Variables that contain a set of one or more vertices are called vertex set variables.
Vertex set variables play a special role within GSQL queries.
They are used for both the input and output of SELECT statements.
In Syntax V1, before the first SELECT statement in a query, a vertex set variable must be declared and initialized.
This initial vertex set is called the seed set.
The current default syntax, Syntax V2, no longer has this requirement.
Vertex set variables are global-scoped. They are also the only type of variable that isn’t explicitly typed during declaration. To declare a vertex set variable, assign an initial set of vertices to the variable name.
vSetVarDeclStmt := vertexSetName ["(" vertexType ")"] "=" (seedSet | simpleSet | selectBlock)
simpleSet := vertexSetName
| "(" simpleSet ")"
| simpleSet (UNION | INTERSECT | MINUS) simpleSet
seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
| ANY
| vertexSetName
| globalAccumName
| vertexType ".*"
| paramName
| "SelectVertex" selectVertParams
selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]
columnId := "$" (integer | stringLiteral)
The query below lists all ways of assigning a vertex set variable an initial set of vertices (that is, forming a seed set).
- A vertex parameter, untyped or typed, enclosed in curly brackets
- A vertex set parameter, untyped or typed
- A global SetAccum<VERTEX> accumulator, untyped or typed
- A vertex type followed by .* to indicate all vertices of that type, optionally enclosed in curly brackets
- _ or ANY to indicate all vertices, optionally enclosed in curly brackets
- A list of vertex IDs in an external file
- Copy of another vertex set
- A combination of individual vertices, vertex set parameters, or base type variables, enclosed in curly brackets
- Union of vertex set variables
CREATE QUERY seed_set_example (VERTEX v1, VERTEX<Person> v2, SET<VERTEX> v3, SET<VERTEX<Person>> v4) FOR GRAPH Social_Net {
  SetAccum<VERTEX> @@test_set;
  SetAccum<VERTEX<Person>> @@test_set2;
  S1 = { v1 };        // Untyped vertex parameter enclosed in curly brackets
  S2 = { v2 };        // Typed vertex parameter enclosed in curly brackets
  S3 = v3;            // Untyped vertex set parameter
  S4 = v4;            // Typed vertex set parameter
  S5 = @@test_set;    // Untyped global set accumulator
  S6 = @@test_set2;   // Typed global set accumulator
  S7 = ANY;           // All vertices
  S8 = Person.*;      // All person vertices
  S9 = {_};           // Equivalent to ANY, braces are optional
  S10 = SelectVertex("absolute_path_to_input_file", $0, Post, ",", false); (1)
  S11 = S1;           // Copy of another vertex set
  S12 = {@@test_set, v2, v3};  // Individual vertex: v2
                               // Vertex set parameter: v3
                               // Global accumulator: @@test_set
                               // Another seed set, e.g., S1, cannot be
                               // placed inside the curly brackets
  S13 = S11 UNION S12;         // But UNION can combine vertex set variables
}
1 | See SelectVertex(). |
When declaring a vertex set variable, you may opt to specify its vertex type by enclosing the type in parentheses after the variable name.
If the type is not specified explicitly, GSQL determines it implicitly from the assigned vertex set value.
The type can be ANY, _ (equivalent to ANY), or any explicit vertex type(s).
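A minimal sketch of explicit and implicit typing follows, based on the vSetVarDeclStmt grammar above. It assumes the Social_Net graph with Person and Post vertex types used elsewhere on this page. The query name and the variable names People, Everything, and Inferred are hypothetical.

```gsql
CREATE QUERY typed_vertex_set_sketch() FOR GRAPH Social_Net {
  People (Person) = {Person.*};  // explicitly typed as Person
  Everything (ANY) = {ANY};      // explicitly typed as ANY
  Inferred = {Post.*};           // no explicit type, so GSQL infers it from the value
  PRINT People, Everything, Inferred;
}
```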
Assignment
After a vertex set variable is declared, the vertex type of the vertex set variable is immutable.
Every assignment (e.g., a SELECT statement) to this vertex set variable must match the type.
The following is an example in which we must declare the vertex set variable type.
CREATE QUERY vertex_set_variable_type_example (VERTEX<Person> m1) FOR GRAPH Social_Net {
  INT ite = 0;
  S (ANY) = {m1};  // ANY is necessary
  WHILE ite < 5 DO
    S = SELECT t
        FROM S:s - (ANY:e) - ANY:t;
    ite = ite + 1;
  END;
  PRINT S;
}
In the above example, the query returns the set of vertices after a 5-step traversal from the input Person vertex.
If we declare the vertex set variable S without explicitly giving a type, then because the type of the vertex parameter m1 is Person, the GSQL engine implicitly assigns S the Person type.
However, if S has the Person type, the SELECT statement inside the WHILE loop causes a type-checking error, because the SELECT block generates all connected vertices, including non-Person vertices.
Therefore, S must be declared as an ANY-type vertex set variable.
FILE objects
A FILE object is a sequential text storage object, associated with a text file on the local machine.
fileDeclStmt := FILE fileVar "(" filePath ")"
fileVar := name
When a FILE object is declared and associated with a particular text file, any existing content in the text file is erased.
During the execution of the query, content written to or printed to the FILE object is appended to the FILE object.
When the query where the FILE object is declared finishes running, the content of the FILE object is saved to the text file.
Example
CREATE QUERY get_US_worker_interests (STRING file_location) FOR GRAPH Work_Net {
  // Declare FILE object f1
  FILE f1 (file_location);
  // Initialize a seed set of all person vertices
  P = {Person.*};
  PRINT "header" TO_CSV f1;
  // Select workers located in the US and print their interests onto
  // the FILE object
  US_workers = SELECT v FROM P:v
               WHERE v.location_id == "us"
               ACCUM f1.println(v.id, v.interest_list);
  PRINT "footer" TO_CSV f1;
}
INSTALL QUERY get_US_worker_interests
RUN QUERY get_US_worker_interests("/home/tigergraph/fileEx.txt")
Assignment and Accumulate Statements
Assignment statements are used to set or update the value of a variable after it has been declared. This applies to base type variables, vertex set variables, and accumulators. Accumulators also have the special += accumulate statement, which was discussed in the Accumulator section. Assignment statements can use expressions to define the new value of the variable.
## Assignment Statement ##
assignStmt := name "=" expr # baseType variable, vertex set variable
| name "." name "=" expr # attribute of a vertex or edge
attrAccumStmt := name "." attrName "+=" expr
lAccumAssignStmt := vertexAlias "." localAccumName ("+="| "=") expr
gAccumAssignStmt := globalAccumName ("+=" | "=") expr
loadAccumStmt := globalAccumName "=" "{" LOADACCUM loadAccumParams
["," LOADACCUM loadAccumParams]* "}"
Vertex and edge (non-accumulator) attributes can use the += operator in an ACCUM or POST-ACCUM clause to perform parallel accumulation.
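The short sketch below walks through the assignStmt and gAccumAssignStmt forms from the grammar above at the query-body level. The query and variable names are hypothetical.

```gsql
CREATE QUERY assignment_sketch() {
  SumAccum<INT> @@total;
  INT counter = 0;

  counter = counter + 1;  // assignStmt: update a base type variable
  @@total = 10;           // gAccumAssignStmt with "=": overwrite the accumulator value
  @@total += 5;           // gAccumAssignStmt with "+=": accumulate, so @@total is now 15
  PRINT counter, @@total;
}
```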
Restrictions on Assignment Statements
In general, assignment statements can take place anywhere after the variable has been declared. However, there are some restrictions. These restrictions apply to "inner level" statements which are within the body of a higher-level statement:
- The ACCUM or POST-ACCUM clause of a SELECT statement
- The SET clause of an UPDATE statement
- The body of a FOREACH statement
LOADACCUM Statement
loadAccumStmt := globalAccumName "=" "{" LOADACCUM loadAccumParams
["," LOADACCUM loadAccumParams]* "}"
loadAccumParams := "(" filePath "," columnId ["," columnId]* ","
stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]
columnId := "$" (integer | stringLiteral)
LOADACCUM() can initialize a global accumulator by loading data from a file. LOADACCUM() has 3+n parameters, explained in the table below, where n is the number of fields in the accumulator.
Any accumulator using generic |
Parameters
Name | Type | Description |
---|---|---|
filePath | String | The absolute file path of the input file to be read. A relative path is not supported. |
columnId (one per accumulator field) | String or number | The column position(s) or column name(s) of the data file that supply data values to each field of the accumulator. |
separator | Single-character string | The separator of columns. |
header | Boolean | Whether this file has a header. |
One assignment statement can have multiple LOADACCUM() function calls. However, every LOADACCUM() referring to the same file in the same assignment statement must use the same separator and header parameter values.
Example
person1,1,"test1",3
person5,2,"test2",4
person6,3,"test3",5
CREATE QUERY load_accum_ex (STRING filename) FOR GRAPH Social_Net {
  TYPEDEF TUPLE<STRING aaa, VERTEX<Post> ddd> Your_Tuple;
  MapAccum<VERTEX<Person>, MapAccum<INT, Your_Tuple>> @@test_map;
  GroupByAccum<STRING a, STRING b, MapAccum<STRING, STRING> strList> @@test_group_by;
  @@test_map = { LOADACCUM (filename, $0, $1, $2, $3, ",", false) };
  @@test_group_by = { LOADACCUM (filename, $1, $2, $3, $3, ",", true) };
  PRINT @@test_map, @@test_group_by;
}
GSQL > RUN QUERY load_accum_ex("/file_directory/loadAccumInput.csv")
{
"error": false,
"message": "",
"version": {
"edition": "developer",
"schema": 0,
"api": "v2"
},
"results": [{
"@@test_group_by": [
{
"a": "3",
"b": "\"test3\"",
"strList": {"5": "5"}
},
{
"a": "2",
"b": "\"test2\"",
"strList": {"4": "4"}
}
],
"@@test_map": {
"person1": {"1": {
"aaa": "\"test1\"",
"ddd": "3"
}},
"person6": {"3": {
"aaa": "\"test3\"",
"ddd": "5"
}},
"person5": {"2": {
"aaa": "\"test2\"",
"ddd": "4"
}}
}
}]
}
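Because one assignment statement can contain several LOADACCUM() calls, the sketch below loads a single accumulator from two files. It follows the loadAccumStmt grammar above and assumes both files are comma-separated, have no header, and hold the key in column 0 and the value in column 1. The query name, parameter names, and accumulator name are hypothetical.

```gsql
CREATE QUERY load_accum_two_files_sketch (STRING file_a, STRING file_b) FOR GRAPH Social_Net {
  // Two fields (key and value), so each LOADACCUM call takes 3 + 2 parameters
  MapAccum<STRING, STRING> @@name_map;
  @@name_map = { LOADACCUM (file_a, $0, $1, ",", false),
                 LOADACCUM (file_b, $0, $1, ",", false) };
  PRINT @@name_map;
}
```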
Function Call Statements
funcCallStmt := name ["<" type ["," type]* ">"] "(" [argList] ")"
| globalAccumName ("." funcName "(" [argList] ")")+
| "reset_collection_accum" "(" accumName ")"
argList := expr ["," expr]*
Typically, a function call returns a value and so is part of an expression.
In some cases, however, the function does not return a value (i.e., returns VOID) or the return value can be ignored, so the function call can be used as an entire statement. This is a Function Call Statement.
ListAccum<STRING> @@list_acc;
BagAccum<INT> @@bag_acc;
...
// Examples of function call statements
@@list_acc.clear();
@@bag_acc.removeAll(0);
Clear Collection Accumulators
Collection accumulators (e.g., ListAccum, SetAccum, MapAccum) grow in size as data is added. Particularly for vertex-attached accumulators, if the number of vertices is large, their memory consumption can be significant.
Clearing or resetting collection accumulators during a query as soon as their data is no longer needed can improve system performance.
Running the reset_collection_accum() function resets the collection(s) to be zero-length (empty).
If the argument is a vertex-attached accumulator, then the entire set of accumulators is reset.
"reset_collection_accum" "(" accumName ")"
CREATE DISTRIBUTED QUERY reset_accum() FOR GRAPH Work_Net SYNTAX v2 {
  ListAccum<STRING> @stuff;
  ListAccum<STRING> @@all_stuff;
  Comp = SELECT c
         FROM Person:p -(Works_For:w)- Company:c
         ACCUM c.@stuff += p.id,
               @@all_stuff += p.id,
               c.@stuff += p.location_id,
               @@all_stuff += p.location_id,
               FOREACH interest IN p.interest_list DO
                 c.@stuff += interest,
                 @@all_stuff += interest
               END;
  // Display accum size: should be full
  PRINT Comp[Comp.@stuff.size()] AS stuff_count;
  PRINT @@all_stuff.size() AS all_stuff_count;
  reset_collection_accum(@stuff);
  reset_collection_accum(@@all_stuff);
  // Display accum size: should be empty
  PRINT Comp[Comp.@stuff.size()] AS stuff_clear;
  PRINT @@all_stuff.size() AS all_stuff_clear;
}
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [
{"stuff_count": [
{
"v_id": "company2",
"attributes": {"Comp.@stuff.size()": 23},
"v_type": "Company"
},
{
"v_id": "company4",
"attributes": {"Comp.@stuff.size()": 7},
"v_type": "Company"
},
{
"v_id": "company3",
"attributes": {"Comp.@stuff.size()": 12},
"v_type": "Company"
},
{
"v_id": "company1",
"attributes": {"Comp.@stuff.size()": 21},
"v_type": "Company"
},
{
"v_id": "company5",
"attributes": {"Comp.@stuff.size()": 4},
"v_type": "Company"
}
]},
{"all_stuff_count": 67},
{"stuff_clear": [
{
"v_id": "company2",
"attributes": {"Comp.@stuff.size()": 0},
"v_type": "Company"
},
{
"v_id": "company4",
"attributes": {"Comp.@stuff.size()": 0},
"v_type": "Company"
},
{
"v_id": "company3",
"attributes": {"Comp.@stuff.size()": 0},
"v_type": "Company"
},
{
"v_id": "company1",
"attributes": {"Comp.@stuff.size()": 0},
"v_type": "Company"
},
{
"v_id": "company5",
"attributes": {"Comp.@stuff.size()": 0},
"v_type": "Company"
}
]},
{"all_stuff_clear": 0}
]
}