Declaration and Assignment Statements
In GSQL, different types of variables and objects follow different rules when it comes to variable declaration and assignment. This section discusses the different types of declaration and assignment statements and covers the following subset of the EBNF syntax:
Variable scopes
Different types of variable declarations use different scoping rules. There are two types of scoping rules in a GSQL query:
Block scoping
In GSQL, curly brackets create a block, as do IF ... THEN, ELSE, WHILE ... DO, and FOREACH ... DO statements. A SELECT statement also creates a block. A block-scoped variable declared inside a block is accessible only inside that block.
Additionally, variables declared in a lower scope can use the same name as a variable already declared in a higher scope. The lower-scope declaration will take precedence over the higher-scope declaration until the end of the lower scope.
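The shadowing rule above can be illustrated with a minimal sketch. The query name, the parameter flag, and the graph name social_net are hypothetical, and the comments describe the behavior expected from the scoping rules stated in this section rather than verified output.

```gsql
CREATE QUERY block_scope_sketch(BOOL flag) FOR GRAPH social_net {
  INT x = 1;              // declared in the query's top-level block
  IF flag THEN
    INT x = 2;            // lower-scope declaration shadows the outer x
    PRINT x AS inner_x;   // refers to the inner x
  END;                    // the inner x goes out of scope here
  PRINT x AS outer_x;     // refers to the outer x again
}
```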
Variable types that use block scoping include base type variables and local base type variables, both described later in this section.
Global scoping
A global-scoped variable is accessible anywhere in the query once it has been declared, regardless of where the declaration appears. No other variable can be declared with the same name as a global-scoped variable that has already been declared.
Variable types that use global scoping include vertex set variables, described later in this section.
Declaration Statements
There are six types of variable declarations in a GSQL query:
Accumulator
Base type variable
Local base type variable
Vertex set
File object
Vertex or edge aliases
The first five types each have their own declaration statement syntax and are covered in this section. Aliases are declared implicitly in a SELECT statement.
Accumulators
Accumulator declaration is discussed in Accumulators.
Base type variables
In a GSQL query body, variables holding values of types INT, UINT, FLOAT, DOUBLE, BOOL, STRING, DATETIME, VERTEX, EDGE, JSONOBJECT, and JSONARRAY are called base type variables. The scope of a base type variable runs from the point of declaration until the end of the block where it is declared.
A base type variable can be declared anywhere in the query and accessed anywhere within its scope. To declare a base type variable, specify the data type and the variable name. Optionally, initialize the variable by following the name with the assignment operator (=) and the desired value. You can declare multiple variables of the same type in a single declaration statement.
When a base type variable is assigned a new value in an ACCUM or POST-ACCUM clause, the change does not take place until the clause exits. Therefore, if there are multiple assignment statements for the same base type variable in an ACCUM or POST-ACCUM clause, only the last one takes effect.
For example, in the following query, a base type variable is assigned a new value in the ACCUM clause, but the change will not take place until the clause ends. Therefore, the accumulator will not receive the value and will hold a value of 0 at the end of the query.
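A minimal sketch of such a query follows. The query name, the graph social_net, and the vertex type person are assumptions made for illustration; the comments restate the deferred-assignment rule described above.

```gsql
CREATE QUERY base_type_assignment_sketch() FOR GRAPH social_net {
  INT x = 0;                 // base type variable
  SumAccum<INT> @@total;     // global accumulator, starts at 0

  all_people = {person.*};   // seed set

  result = SELECT p
           FROM all_people:p
           ACCUM x = 10,          // deferred until the ACCUM clause exits
                 @@total += x;    // still reads the old value of x (0)

  PRINT x, @@total;          // per the rule above, @@total is still 0
}
```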
Local base type variables
Base type variables declared in a DML-sub statement, such as a statement inside an ACCUM, POST-ACCUM, or UPDATE SET clause, are called local base type variables.
Local base type variables are block-scoped and are accessible only in the block where they are declared. Within a local base type variable's scope, another local base type variable with the same name cannot be declared at the same level. However, a new local base type variable with the same name can be declared at a lower level (i.e., within a nested SELECT or UPDATE statement). The lower declaration takes precedence at the lower level.
In a POST-ACCUM clause, each local base type variable may only be used in source vertex statements or only in target vertex statements, not both.
Local base type variables are not subject to the assignment restrictions of regular base type variables. Their values can be updated inside an ACCUM or POST-ACCUM clause and the change will take place immediately.
Example:
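The following is a minimal sketch rather than the original example. It assumes a hypothetical graph social_net with a person vertex type, and it shows a local base type variable declared and updated inside an ACCUM clause.

```gsql
CREATE QUERY local_variable_sketch() FOR GRAPH social_net {
  SumAccum<INT> @@total;
  all_people = {person.*};

  result = SELECT p
           FROM all_people:p
           ACCUM INT y = 1,       // local base type variable, declared in ACCUM
                 y = y + 4,       // takes effect immediately
                 @@total += y;    // reads the updated value of y

  PRINT @@total;
}
```

Contrast this with a regular base type variable declared outside the SELECT statement, where the assignment would not be visible until the clause exits.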
Vertex Set Variable Declaration and Assignment
Variables that contain a set of one or more vertices are called vertex set variables. Vertex set variables play a special role within GSQL queries. They are used for both the input and output of SELECT statements. Therefore, before the first SELECT statement in a query, a vertex set variable must be declared and initialized. This initial vertex set is called the seed set.
Vertex set variables are global-scoped. They are also the only type of variable that isn't explicitly typed during declaration. To declare a vertex set variable, assign an initial set of vertices to the variable name.
A vertex set variable can be assigned an initial set of vertices (that is, a seed set) in any of the following ways:
A vertex parameter, untyped or typed, enclosed in curly brackets
A vertex set parameter, untyped or typed
A global SetAccum<VERTEX> accumulator, untyped or typed
All vertices of any type or of one type
A list of vertex IDs in an external file
Copy of another vertex set
A combination of individual vertices, vertex set parameters, or base type variables, enclosed in curly brackets
Union of vertex set variables
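Most of the forms listed above can be sketched in a single query. The query name, parameter names, graph social_net, and vertex type person are hypothetical. The external-file form is omitted because its function signature is not given in this section.

```gsql
CREATE QUERY seed_set_sketch(VERTEX v1, VERTEX<person> v2,
                             SET<VERTEX> v3, SET<VERTEX<person>> v4)
FOR GRAPH social_net {
  SetAccum<VERTEX> @@untyped_set;
  SetAccum<VERTEX<person>> @@typed_set;

  S1 = {v1};                      // untyped vertex parameter in curly brackets
  S2 = {v2};                      // typed vertex parameter in curly brackets
  S3 = v3;                        // untyped vertex set parameter
  S4 = v4;                        // typed vertex set parameter
  S5 = @@untyped_set;             // global SetAccum<VERTEX>, untyped
  S6 = @@typed_set;               // global SetAccum<VERTEX>, typed
  S7 = ANY;                       // all vertices of any type
  S8 = {person.*};                // all vertices of one type
  S9 = _;                         // equivalent to ANY
  S10 = S1;                       // copy of another vertex set
  S11 = {@@untyped_set, v2, v3};  // combination enclosed in curly brackets
  S12 = S10 UNION S11;            // union of vertex set variables
}
```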
When declaring a vertex set variable, you can optionally specify a set of vertex types for it. If the type is not specified explicitly, the system determines it implicitly from the vertex set value. The type can be ANY, _ (equivalent to ANY), or any explicit vertex type(s). See the EBNF grammar rule vertexEdgeType.
Declaration syntax difference: vertex set variable vs. base type variable
In a vertex set variable declaration, the optional type specifier follows the variable name and must be enclosed in parentheses: vSetName(type).
This differs from a base type variable declaration, where the type specifier is required and comes before the variable name: type varName.
Assignment
After a vertex set variable is declared, the vertex type of the vertex set variable is immutable. Every assignment (e.g. SELECT statement) to this vertex set variable must match the type. The following is an example in which we must declare the vertex set variable type.
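A sketch consistent with the description in this section is shown here. The names m1, S, and person come from the text; the query name, the graph social_net, and the loop counter are assumptions.

```gsql
CREATE QUERY vertex_set_type_sketch(VERTEX<person> m1) FOR GRAPH social_net {
  INT step = 0;
  S (ANY) = {m1};               // explicit ANY type is required here
  WHILE step < 5 DO
    S = SELECT t
        FROM S:s -(ANY:e)-> ANY:t;   // may reach non-person vertices
    step = step + 1;
  END;
  PRINT S;
}
```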
In the above example, the query returns the set of vertices after a 5-step traversal from the input person vertex. If we declare the vertex set variable S without explicitly giving a type, then because the type of vertex parameter m1 is person, the GSQL engine will implicitly assign S to be person type. However, if S is assigned to person type, the SELECT statement inside the WHILE loop causes a type-checking error, because the SELECT block will generate all connected vertices, including non-person vertices. Therefore, S must be declared as an ANY-type vertex set variable.
FILE Object Declaration
A FILE object is a sequential text storage object, associated with a text file on the local machine.
When a FILE object is declared and associated with a particular text file, any existing content in the text file is erased. During the execution of the query, content written or printed to the FILE object is appended to it. When the query that declares the FILE object finishes running, the content of the FILE object is saved to the text file.
Example:
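The following is a minimal sketch rather than the original example. The query name, the parameter file_location, the graph work_net, the vertex type person, and its attribute id are all assumptions.

```gsql
CREATE QUERY file_object_sketch(STRING file_location) FOR GRAPH work_net {
  FILE f1 (file_location);          // existing content of the file is erased
  all_people = {person.*};

  PRINT "header" TO_CSV f1;         // printed to the FILE object

  result = SELECT v
           FROM all_people:v
           ACCUM f1.println(v.id);  // written to the FILE object per vertex

  PRINT "footer" TO_CSV f1;         // appended after the earlier content
}
```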
Assignment and Accumulate Statements
Assignment statements are used to set or update the value of a variable after it has been declared. This applies to base type variables, vertex set variables, and accumulators. Accumulators also have the special += accumulate statement, which was discussed in the Accumulator section. Assignment statements can use expressions to define the new value of the variable.
Vertex and edge (non-accumulator) attributes can use the += operator in an ACCUM or POST-ACCUM clause to perform parallel accumulation.
Restrictions on Assignment Statements
In general, assignment statements can take place anywhere after the variable has been declared. However, there are some restrictions. These restrictions apply to "inner level" statements which are within the body of a higher-level statement:
The ACCUM or POST-ACCUM clause of a SELECT statement
The SET clause of an UPDATE statement
The body of a FOREACH statement
The restrictions are as follows:
Global accumulator assignment is not permitted within the body of SELECT or UPDATE statements.
Base type variable assignment is permitted in ACCUM or POST-ACCUM clauses, but the change in value will not take place until exiting the clause. Therefore, if there are multiple assignment statements for the same variable, only the final one will take effect.
Vertex attribute assignment is not permitted in an ACCUM clause. However, edge attribute assignment is permitted. This is because the ACCUM clause iterates over an edge set.
There are additional restrictions within FOREACH loops for the loop variable. See the Data Modification section.
LOADACCUM Statement
LOADACCUM() can initialize a global accumulator by loading data from a file. LOADACCUM() has 3+n parameters, explained in the table below, where n is the number of fields in the accumulator.
Any accumulator using generic VERTEX as an element type cannot be initialized by LOADACCUM().
Parameters:
One assignment statement can have multiple LOADACCUM() function calls. However, every LOADACCUM() referring to the same file in the same assignment statement must use the same separator and header parameter values.
Example:
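The following is a minimal sketch, not the original example. It assumes the parameter order is file path, one column position per accumulator field, separator, and header flag. That order matches the 3+n count described above but is not confirmed by the parameter table in this copy. The query name, graph name, and file_path parameter are hypothetical.

```gsql
CREATE QUERY load_accum_sketch(STRING file_path) FOR GRAPH social_net {
  // MapAccum has two fields (key and value), so n = 2 and
  // LOADACCUM() takes 3 + 2 = 5 parameters in this sketch.
  MapAccum<STRING, INT> @@counts;

  // Assumed order: file path, column for key, column for value,
  // separator, header flag.
  @@counts = { LOADACCUM(file_path, $0, $1, ",", true) };

  PRINT @@counts;
}
```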
Function Call Statements
Typically, a function call returns a value and so is part of an expression. In some cases, however, the function does not return a value (i.e., returns VOID) or the return value can be ignored, so the function call can be used as an entire statement. This is a Function Call Statement.
Clear Collection Accumulators
Collection accumulators (e.g., ListAccum, SetAccum, MapAccum) grow in size as data is added. Particularly for vertex-attached accumulators, memory consumption can be significant when the number of vertices is large. Clearing or resetting collection accumulators during a query, as soon as their data is no longer needed, can improve system performance. Running the reset_collection_accum(accumName) function resets the collection(s) to be zero-length (empty). If the argument is a vertex-attached accumulator, then the entire set of accumulators is reset.
reset_collection_accum only works in DISTRIBUTED mode queries. If the query is not in distributed mode, the reset does not take place.
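A minimal sketch of the function used as a statement in a distributed query follows. The query name, graph social_net, vertex type person, and accumulator names are hypothetical. Passing the accumulator name with its @ or @@ prefix is an assumption about the argument form.

```gsql
CREATE DISTRIBUTED QUERY reset_accum_sketch() FOR GRAPH social_net {
  ListAccum<INT> @@global_list;     // global collection accumulator
  SetAccum<INT> @per_vertex_set;    // vertex-attached collection accumulator

  all_people = {person.*};
  result = SELECT p
           FROM all_people:p
           ACCUM @@global_list += 1,
                 p.@per_vertex_set += 1;

  // The collected data is no longer needed, so release it.
  reset_collection_accum(@@global_list);
  reset_collection_accum(@per_vertex_set);   // resets the accumulator on every vertex
}
```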