An expression is a combination of fixed values, variables, operators, function calls, and groupings that specify a computation, resulting in a data value. This section of the specification describes the literals (fixed values), operators, and functions available in the GSQL query language. It covers the subset of the EBNF definitions shown below. However, more so than in other sections of the specification, syntax alone is not an adequate description. The semantics (functionality) of the particular operators and functions are an essential complement to the syntax.
Each primitive data type supports constant values:
GSQL_UINT_MAX = 2 ^ 64 - 1 = 18446744073709551615
GSQL_INT_MAX = 2 ^ 63 - 1 = 9223372036854775807
GSQL_INT_MIN = -2 ^ 63 = -9223372036854775808
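As a quick sanity check of the three limits, the following sketch recomputes them. Python is used here only as a calculator, because its integers are arbitrary precision; the variable names simply mirror the GSQL constants and are not part of any GSQL API.

```python
# Recompute the 64-bit limits quoted above. Python integers never overflow,
# so these expressions evaluate exactly.
GSQL_UINT_MAX = 2**64 - 1   # largest unsigned 64-bit value
GSQL_INT_MAX = 2**63 - 1    # largest signed 64-bit value
GSQL_INT_MIN = -(2**63)     # smallest signed 64-bit value

print(GSQL_UINT_MAX)
print(GSQL_INT_MAX)
print(GSQL_INT_MIN)
```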
An operator is a keyword token that performs a specific computational function to return a resulting value, using the adjacent expressions (its operands) as input values. An operator is similar to a function in that both compute a result from inputs, but syntactically they are different. The most familiar operators are the mathematical operators for addition (+) and subtraction (-).
Tip: The operators listed in this section are designed to behave like the operators in MySQL.
We support the following standard mathematical operators and meanings. The last four ("<<" | ">>" | "&" | "|") are for bitwise operations. See the section below: "Bit Operators".
Operator precedences are shown in the following list, from the highest precedence to the lowest. Operators that are shown together on a line have the same precedence:
We support the standard Boolean operators AND, OR, and NOT, with the standard order of precedence.
Bit operators (<<, >>, &, and |) operate on integers and return an integer.
Operator + can be used for concatenating strings.
The fields of a tuple can be accessed using the dot operator.
A condition is an expression that evaluates to a boolean value of either true or false. One type of condition uses the familiar comparison operators. A comparison operator compares two numeric or string values.
Strings are compared based on standard lexicographical ordering: (space) < (digit) < (uppercase_letter) < (lowercase_letter).
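The ordering above is consistent with ASCII code order, which Python's built-in string comparison also follows for ASCII text. The sketch below uses Python only to illustrate that ordering; it is not GSQL, and it assumes ASCII-only strings.

```python
# Four sample strings that start with a space, a digit, an uppercase letter,
# and a lowercase letter, deliberately listed out of order.
samples = ["apple", "Apple", "42", " apple"]

# sorted() compares strings character by character using code-point order.
ordered = sorted(samples)
print(ordered)
```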
The comparison operators treat the STRING COMPRESS type as though it is STRING type.
The expression expr1 BETWEEN expr2 AND expr3 is true if the value expr1 is in the range from expr2 to expr3, including the endpoint values. Each expression must be numeric.
"expr1 BETWEEN expr2 AND expr3" is equivalent to "expr1 <= expr3 AND expr1 >= expr2".
IS NULL and IS NOT NULL can be used for checking whether an optional parameter is given any value.
Every attribute value stored in GSQL is a valid value, so IS NULL and IS NOT NULL are only effective for query parameters.
The LIKE operator is used for string pattern matching and can only be used in WHERE clauses. The expression string1 LIKE string_pattern evaluates to boolean true if string1 matches the pattern in string_pattern; otherwise, it is false.
Both operands must be strings. Additionally, while string1 can be a function call (e.g. lower(string_variable)), string_pattern must be a string literal or a parameter. A string_pattern can contain characters as well as the following wildcard and other special symbols, in order to express a pattern (<char_list> indicates a placeholder):
ESCAPE escape_char
The optional ESCAPE escape_char clause is used to define an escape character. When escape_char occurs in string_pattern, the next character is interpreted literally, instead of as a pattern matching operator. For example, if we want to specify the pattern "any string ending with the '%' character", we could use
"%\%" ESCAPE "\"
The first "%" has its usual pattern-matching meaning "zero or more characters". "\%" means a literal percent character, because it starts with the escape character "\".
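To make the wildcard and ESCAPE rules concrete, here is a minimal Python sketch that translates a LIKE-style pattern into a regular expression. It is a simplified model, not the GSQL implementation: the function names like_to_regex and like are hypothetical, matching is assumed to be case-sensitive, and the backslash forms for \ and ] inside a char list are not handled.

```python
import re


def like_to_regex(pattern, escape_char=None):
    """Translate a LIKE-style pattern into an equivalent Python regex string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape_char is not None and ch == escape_char and i + 1 < len(pattern):
            # The character after the escape character is taken literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")  # zero or more characters
        elif ch == "_":
            out.append(".")   # exactly one character
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            negate = body[:1] in ("^", "!")
            if negate:
                body = body[1:]
            if body:
                # Keep '-' so ranges such as a-m still work; escape other punctuation.
                safe = "".join(c if c.isalnum() or c == "-" else "\\" + c for c in body)
                out.append("[" + ("^" if negate else "") + safe + "]")
                i = end + 1
                continue
            out.append(re.escape(ch))  # unmatched or empty list: treat '[' literally
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def like(string1, pattern, escape_char=None):
    """Return True if the whole of string1 matches the LIKE-style pattern."""
    regex = like_to_regex(pattern, escape_char)
    return re.fullmatch(regex, string1, flags=re.DOTALL) is not None


# The ESCAPE example from the text: any string ending with a literal '%'.
print(like("100%", "%\\%", escape_char="\\"))
print(like("100", "%\\%", escape_char="\\"))
```

The whole string must match the pattern, which is why a leading and trailing % is needed to express "contains".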
Attributes on vertices or edges are defined in the graph schema. Additionally, each vertex and edge has a built-in STRING attribute called type which represents the user-defined type of that edge or vertex. These attributes, including type, can be accessed for a particular edge or vertex with the dot operator:
DYNAMIC Query Support
For example, the following code snippet shows two different SELECT statements which produce equivalent results. The first uses the dot operator on the vertex variable v to access the "subject" attribute, which is defined in the graph schema. The FROM clause in the first SELECT statement requires that all target vertices be of type "post" (also defined in the graph schema). The second SELECT statement checks that the vertex variable v is a "post" vertex by using the dot operator to access the built-in type attribute.
This section describes functions that apply to all or most accumulators. Other accumulator functions for each accumulator type are illustrated in the "Accumulator Type" section.
The tick operator ( ' ) can be used to read the value of an accumulator as it was at the start of an ACCUM clause, before any changes that took place within the ACCUM clause. It can only be used in the POST-ACCUM clause. A typical use is to compare the value of the accumulator before and after the ACCUM clause. The PageRank algorithm provides a good example:
In the last line, we compute @@max_diff as the absolute value of the difference between the post-ACCUM score (s.@score) and the pre-ACCUM score (s.@score').
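The before-and-after comparison can be modelled outside GSQL as a snapshot taken before the update phase. The Python sketch below is only an analogy under stated assumptions: the vertex names and score values are made up, score_before plays the role of s.@score', and max_diff is assumed to keep the largest difference seen across all vertices.

```python
# Hypothetical per-vertex scores at the start of the ACCUM clause.
score = {"v1": 1.0, "v2": 0.5, "v3": 2.0}

# Snapshot of the values before any update; this is what the tick reads.
score_before = dict(score)

# Stand-in for the ACCUM phase: each vertex receives a new (made-up) score.
received = {"v1": 0.25, "v2": 1.0, "v3": 2.0}
for v in score:
    score[v] = received[v]

# Stand-in for POST-ACCUM: largest absolute change between new and old score.
max_diff = max(abs(score[v] - score_before[v]) for v in score)
print(max_diff)
```

An iterative algorithm can stop once this maximum change falls below a chosen tolerance.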
SELECT blocks take an input vertex set and perform various selection and filtering operations to produce an output set. Therefore, set/bag expressions and their operators are a useful and powerful part of the GSQL query language. A set/bag expression can use either SetAccum or BagAccum.
The operators are straightforward. When both operands are sets, the result expression is a set. When at least one operand is a bag, the result expression is a bag. If one operand is a bag and the other is a set, the operator treats the set operand as a bag containing one of each value.
The result of these operators is another set or bag, so these operations can be nested and chained to form more complex expressions, such as
For example, suppose setBagExpr_A is ("a", "b", "c")
The IN and NOT IN operators support all base types on the left-hand side, and any set/bag expression on the right-hand side. The base type must be the same as the accumulator's element type. IN and NOT IN return a BOOL value.
The following example uses NOT IN to exclude neighbors that are on a blocked list.
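The filtering idea behind NOT IN can be sketched in Python with a membership test. This is an illustration only, not GSQL: the names neighbors, blocked, and allowed and their contents are hypothetical, and the set stands in for a set/bag expression whose element type matches the left-hand value.

```python
# Hypothetical neighbor list and blocked list, both holding strings.
neighbors = ["alice", "bob", "carol", "dave"]
blocked = {"bob", "dave"}

# Keep only the neighbors that are NOT IN the blocked set.
allowed = [n for n in neighbors if n not in blocked]
print(allowed)
```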
A query defined with a RETURNS header following its CREATE statement is called a subquery. A subquery acts as a callable function in GSQL: it takes parameters, performs a set of actions, and returns a value at the end. A subquery must end with a RETURN statement to pass its output value to the calling query. Exactly one type is allowed in the RETURNS header, and thus the RETURN statement can only return one expression.
A subquery must be created before the query that calls the subquery. A subquery must be installed either before or in the same INSTALL QUERY command with the query that calls the subquery.
A subquery parameter can only be one of the following types:
Primitives: INT, UINT, FLOAT, DOUBLE, STRING, BOOL
VERTEX
A set or bag of primitive or VERTEX elements
A subquery's return value can be any base type or accumulator type with the following exceptions.
If the return type is a user-defined tuple type, a HeapAccum type, or a GroupByAccum type, the user-defined types must be defined at the catalog level.
If the return type is a BagAccum, SetAccum, or ListAccum with a tuple as its element, the tuple does not need to be defined at the catalog level and can be anonymous.
Recursion is supported for subqueries and a subquery can call itself. Here is an example of a recursive subquery: The following subquery takes a set of persons as starting points, and returns all the friends within a given distance.
While recursive subqueries may look simpler in writing, they are usually not as efficient as iterative subqueries in GSQL.
Test cases: Starting from person1, search to a distance of 1 and a distance of 2.
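The contrast between the recursive and iterative formulations can be sketched in Python. This is not the GSQL subquery itself: the FRIENDS graph is made-up sample data, both function names are hypothetical, and the starting persons are assumed to be excluded from the result.

```python
# Hypothetical undirected friendship graph, stored as adjacency sets.
FRIENDS = {
    "person1": {"person2", "person3"},
    "person2": {"person1", "person4"},
    "person3": {"person1"},
    "person4": {"person2", "person5"},
    "person5": {"person4"},
}


def friends_within_recursive(starts, distance, seen=None):
    """Recursive form: expand one hop, then recurse with distance - 1."""
    starts = set(starts)
    seen = set(starts) if seen is None else seen
    if distance <= 0 or not starts:
        return set()
    # Friends of the current frontier that have not been visited yet.
    frontier = {f for p in starts for f in FRIENDS.get(p, ())} - seen
    return frontier | friends_within_recursive(frontier, distance - 1, seen | frontier)


def friends_within_iterative(starts, distance):
    """Iterative form: the same expansion written as a loop."""
    seen = set(starts)
    frontier = set(starts)
    for _ in range(distance):
        frontier = {f for p in frontier for f in FRIENDS.get(p, ())} - seen
        seen |= frontier
    return seen - set(starts)


# Mirror the test cases: start from person1, distance 1 and distance 2.
print(sorted(friends_within_recursive({"person1"}, 1)))
print(sorted(friends_within_recursive({"person1"}, 2)))
print(sorted(friends_within_iterative({"person1"}, 2)))
```

Both versions visit the same vertices; the loop simply avoids one nested call per hop.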
Below is a list of examples of expressions. Note that ( argList ) is a set/bag expression, while [ argList ] is a list expression.
| Data Type | Constant | Examples |
|---|---|---|
| Numeric types (INT, UINT, FLOAT, DOUBLE) | numeric | 123 -5 45.67 2.0e-0.5 |
| UINT | GSQL_UINT_MAX | |
| INT | GSQL_INT_MAX, GSQL_INT_MIN | |
| boolean | TRUE, FALSE | |
| string | stringLiteral | "atoz@com" "0.25" |
| Character or syntax | Description | Example |
|---|---|---|
| % | Matches zero or more characters. | %abc% matches any string which contains the sequence "abc". |
| _ (underscore) | Matches any single character. | _abc_e matches any 6-character string where the 2nd to 4th characters are "abc" and the last character is "e". |
| [<char_list>] | Matches any character in a char list. A char list is a concatenated character set, with no separators. | [Tiger] matches either T, i, g, e, or r. |
| [^<char_list>] | Matches any character NOT in a char list. | [^qxz] matches any character other than q, x, or z. |
| [!<char_list>] | Matches any character NOT in a char list. | |
| α-β | (Special syntax within a char list) matches a character in the range from α to β. A char list can have multiple ranges. | [a-mA-M0-3] matches a letter from a to m, upper or lower case, or a digit from 0 to 3. |
| \\ | (Special syntax within a char list) matches the character \ | |
| \\] | (Special syntax within a char list) matches the character ]. No special treatment is needed for [ inside a char list. | %[\\]!] matches any string which ends with either ] or ! |