Formal Grammar for Query Language

This is the definition for the GSQL Query Language syntax. It is defined as a set of rules expressed in EBNF notation.

Notation Used to Define Syntax

This defines the EBNF notation used to describe the syntax. Rules contains terminal and non-terminal symbols. A terminal symbol is a base-level symbol which expresses literal output. All symbols in single or double quotes (e.g., '+', "=", ")", "10") are terminal symbols. A non-terminal symbol is defined as some combination of terminal and non-terminal symbols. The left-hand side of a rule is always a non-terminal; this rule defines the non-terminal. The example rule below defines assignmentStmt (that is, an Assignment Statement) to be a name followed by an equal sign followed by an expression, operator, and expression with a terminating semi-colon. AssignmentStmt, name, and expr are all non-terminals. Additionally, all KEYWORDS are in all-capitals and are terminal symbols. The ":=" is part of EBNF and states the left hand side can be expanded to the right hand side.

EBNF Syntax example: A rule
assignmentStmt := name "=" expr op expr ";"

A vertical bar | in EBNF indicates choice. Choose either the symbol on the left or on the right. A sequence of vertical bars means choose any one of the symbols in the sequence.

EBNF Syntax: vertical bar
op := "+" | "-" | "*" | "/"

Square brackets [ ] indicate an optional part or group of symbols. Parentheses ( ) group symbols together. The rule below defines a constant to be one, two, or three digits preceded by an optional plus or minus sign.

EBNF Syntax: Square brackets and parentheses
constant := ["+" | "-"] (digit | (digit digit) | (digit digit digit))

Star * and plus + are symbols in EBNF for closure. Star means zero or more occurrences, and plus means one or more occurrences. The following defines intConstant to be an optional plus or minus followed by one or more digits. It also defines floatConstant to be an optional plus or minus followed by zero or more digits followed by a decimal followed by one or more digits. The star and plus also can be applied to groups of symbols as in the definition of list. The non-terminal list is defined as a parenthesized list of comma-separated expressions (expr). The list has at least one expression which can be followed by zero or more comma-expression pairs.

EBNF Syntax: square brackets and parentheses
intConstant := ["+" | "-"] digit+
floatConstant := ["+" | "-"] digit* "." digit+
list := "(" expr ["," expr]* ")"

Curly braces { } enclose an optional group of symbols which are repeated zero or more times. Therefore, curly braces are equivalent to square brackets or parentheses followed by a star + to indicate zero or more repetitions. All of the following expressions are equivalent:

list1 := expr ["," expr]*
list2 += expr ("," expr)*
list3 := expr {"," expr}

For brevity, the literal comma is sometimes shown without quotation marks:

list4 := expr {, expr}

GSQL Query Language EBNF

#########################################################
## EBNF for GSQL Query Language
createQuery := CREATE [OR REPLACE][DISTRIBUTED] QUERY queryName
"(" [parameterList] ")"
[FOR GRAPH graphName]
[RETURNS "(" baseType | accumType ")"]
[API "(" stringLiteral ")"]
[SYNTAX syntaxName]
"{" queryBody "}"
interpretQuery := INTERPRET QUERY "(" ")"
[FOR GRAPH graphName]
[SYNTAX syntaxName]
"{" queryBody "}"
parameterValueList := parameterValue [, parameterValue]*
parameterValue := parameterConstant
| "[" parameterValue [, parameterValue]* "]" // BAG or SET
| "(" stringLiteral, stringLiteral ")" // a generic VERTEX value
parameterConstant := numeric | stringLiteral | TRUE | FALSE
parameterList := parameterType paramName ["=" constant]
["," parameterType paramName ["=" constant]]*
syntaxName := name
queryBody := [typedefs] [declStmts] [declExceptStmts] queryBodyStmts
typedefs := (typedef ";")+
declStmts := (declStmt ";")+
declStmt := baseDeclStmt | accumDeclStmt | fileDeclStmt
declExceptStmts := (declExceptStmt ";")+
queryBodyStmts := (queryBodyStmt ";")+
queryBodyStmt := assignStmt // Assignment
| vSetVarDeclStmt // Declaration
| lAccumAssignStmt // Assignment
| gAccumAssignStmt // Assignment
| gAccumAccumStmt // Assignment
| funcCallStmt // Function Call
| selectStmt // Select
| queryBodyCaseStmt // Control Flow
| queryBodyIfStmt // Control Flow
| queryBodyWhileStmt // Control Flow
| queryBodyForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| updateStmt // Data Modification
| insertStmt // Data Modification
| queryBodyDeleteStmt // Data Modification
| printStmt // Output
| printlnStmt // Output
| logStmt // Output
| returnStmt // Output
| raiseStmt // Exception
| tryStmt // Exception
installQuery := INSTALL QUERY [installOptions] ( "*" | ALL |queryName [, queryName]* )
runQuery := (RUN | INTERPRET) QUERY [runOptions] queryName "(" parameterValueList ")"
showQuery := SHOW QUERY queryName
dropQuery := DROP QUERY ( "*" | ALL | queryName [, queryName]* )
#########################################################
## Types and names
lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'
name := (letter | "_") [letter | digit | "_"]* // Can be a single "_" or start with "_"
graphName := name
queryName := name
paramName := name
vertexType := name
edgeType := name
accumName := name
vertexSetName := name
attrName := name
varName := name
tupleType := name
fieldName := name
funcName := name
type := baseType | tupleType | accumType | STRING COMPRESS
baseType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" vertexType ">"]
| EDGE
| JSONOBJECT
| JSONARRAY
| DATETIME
filePath := paramName | stringLiteral
typedef := TYPEDEF TUPLE "<" tupleFields ">" tupleType
tupleFields := (baseType fieldName) | (fieldName baseType)
["," (baseType fieldName) | (fieldName baseType)]*
parameterType := baseType
| [ SET | BAG ] "<" baseType ">"
| FILE
#########################################################
## Accumulators
accumDeclStmt := accumType "@"accumName ["=" constant]
[, "@"accumName ["=" constant]]*
| [STATIC] accumType "@@"accumName ["=" constant]
[, "@@"accumName ["=" constant]]*
accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING | STRING COMPRESS) ">"
| "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "AvgAccum"
| "OrAccum"
| "AndAccum"
| "BitwiseOrAccum"
| "BitwiseAndAccum"
| "ListAccum" "<" type ">"
| "SetAccum" "<" elementType ">"
| "BagAccum" "<" elementType ">"
| "MapAccum" "<" elementType "," (baseType | accumType | tupleType) ">"
| "HeapAccum" "<" tupleType ">" "(" simpleSize "," fieleName [ASC | DESC]
["," fieldName [ASC | DESC]]* ")"
| "GroupByAccum" "<" elementType fieldName ["," elementType fieldName]* ,
accumType fieldName ["," accumType fieldName]* ">"
| "ArrayAccum" "<" accumName ">"
elementType := baseType | tupleType | STRING COMPRESS
gAccumAccumStmt := "@@"accumName "+=" expr
###############################################################################
## Operators, Functions, and Expressions
constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX
| GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"
mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"
condition := expr
| expr comparisonOperator expr
| expr [ NOT ] IN setBagExpr
| expr IS [ NOT ] NULL
| expr BETWEEN expr AND expr
| "(" condition ")"
| NOT condition
| condition (AND | OR) condition
| (TRUE | FALSE)
comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="
expr := name
| "@@"accumName
| name "." "type"
| name "." name
| name "." "@"accumName ["\'"]
| name "." name "." name "(" [argList] ")"
| name "." name "(" [argList] ")" [ ".".FILTER "(" condition ")" ]
| name ["<" type ["," type]* ">"] "(" [argList] ")"
| name "." "@"accumName ("." name "(" [argList] ")")+ ["." name]
| "@@"accumName ("." name "(" [argList] ")")+ ["." name]
| COALESCE "(" [argList] ")"
| ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
| expr mathOperator expr
| "-" expr
| "(" expr ")"
| "(" argList "->" argList ")" // key value pair for MapAccum
| "[" argList "]" // a list
| constant
| setBagExpr
| name "(" argList ")" // function call or a tuple object
setBagExpr := name
| "@@"accumName
| name "." name
| name "." "@"accumName
| name "." "@"accumName ("." name "(" [argList] ")")+
| name "." name "(" [argList] ")" [ ".".FILTER "(" condition ")" ]
| "@@"accumName ("." name "(" [argList] ")")+
| setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
| "(" argList ")"
| "(" setBagExpr ")"
#########################################################
## Declarations and Assignments ##
## Declarations ##
baseDeclStmt := baseType name ["=" constant][, name ["=" constant]]*
fileDeclStmt := FILE fileVar "(" filePath ")"
fileVar := name
localVarDeclStmt := baseType varName "=" expr
vSetVarDeclStmt := vertexSetName ["(" vertexType ")"]
"=" (seedSet | simpleSet | selectBlock)
simpleSet := vertexSetName | "(" simpleSet ")"
| simpleSet (UNION | INTERSECT | MINUS) simpleSet
seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
| ANY
| vertexSetName
| "@@"accumName
| vertexType ".*"
| paramName
| "SelectVertex" selectVertParams
selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
stringLiteral "," (TRUE | FALSE) ")" [".".FILTER "(" condition ")"]
columnId := "$" (integer | stringLiteral)
## Assignment Statements ##
assignStmt := name "=" expr
| name "." attrName "=" expr
attrAccumStmt := name "." attrName "+=" expr
lAccumAssignStmt := vertexAlias "." "@"accumName ("+="| "=") expr
gAccumAssignStmt := "@@"accumName ("+=" | "=") expr
loadAccumStmt := "@@"accumName "=" "{" "LOADACCUM" loadAccumParams
["," "LOADACCUM" loadAccumParams]* "}"
loadAccumParams := "(" filePath "," columnId "," [columnId ","]*
stringLiteral "," (TRUE | FALSE) ")" [".".FILTER "(" condition ")"]
## Function Call Statement ##
funcCallStmt := name ["<" type ["," type"]* ">"] "(" [argList] ")"
| "@@"accumName ("." funcName "(" [argList] ")")+
argList := expr ["," expr]*
#########################################################
## Select Statement
selectStmt := vertexSetName "=" selectBlock
selectBlock := SELECT alias [fromClause]
[sampleClause]
[whereClause]
[accumClause]
[postAccumClause]*
[havingClause]
[orderClause]
[limitClause]
fromClause := FROM (step | stepV2 | pathPattern ["," pathPattern]*)
step := stepSourceSet ["-" "(" stepEdgeSet ")" "-"[">"] stepVertexSet]
stepV2 := stepVertexSet ["-" "(" stepEdgeSet ")" "-" stepVertexSet]
stepSourceSet := vertexSetName [":" vertexAlias]
stepEdgeSet := [stepEdgeTypes] [":" edgeAlias]
stepVertexSet := [stepVertexTypes] [":" vertexAlias]
alias := (vertexAlias | edgeAlias)
vertexAlias := name
edgeAlias := name
stepEdgeTypes := atomicEdgeType | "(" edgeSetType ["|" edgeSetType]* ")"
atomicEdgeType := "_" | ANY | edgeSetType
edgeSetType := edgeType | paramName | "@@"accumName
stepVertexTypes := atomicVertexType | "(" vertexSetType ["|" vertexSetType]* ")"
atomicVertexType := "_" | ANY | vertexSetType
vertexSetType := vertexType | paramName | "@@"accumName
#----------# Pattern Matching #----------#
pathPattern := stepVertexSet ["-" "(" pathEdgePattern ")" "-" stepVertexSet]*
pathEdgePattern := atomicEdgePattern
| "(" pathEdgePattern ")"
| pathEdgePattern "." pathEdgePattern
| disjPattern
| starPattern
atomicEdgePattern := atomicEdgeType
| atomicEdgeType ">"
| "<" atomicEdgeType
disjPattern := atomicEdgePattern ("|" atomicEdgePattern)*
starPattern := ([atomicEdgePattern] | "(" disjPattern ")") "*" [starBounds]
starBounds := CONST_INT ".." CONST_INT
| CONST_INT ".."
| ".." CONST_INT
| CONST_INT
#--------------------------------------#
sampleClause := SAMPLE ( expr | expr "%" ) EDGE WHEN condition
| SAMPLE expr TARGET WHEN condition
| SAMPLE expr "%" TARGET PINNED WHEN condition
whereClause := WHERE condition
accumClause := [perClauseV2] ACCUM dmlSubStmtList
perClauseV2 := PER "(" alias ["," alias] ")"
postAccumClause := POST-ACCUM dmlSubStmtList
dmlSubStmtList := dmlSubStmt ["," dmlSubStmt]*
dmlSubStmt := assignStmt // Assignment
| funcCallStmt // Function Call
| gAccumAccumStmt // Assignment
| lAccumAccumStmt // Assignment
| attrAccumStmt // Assignment
| vAccumFuncCall // Function Call
| localVarDeclStmt // Declaration
| dmlSubCaseStmt // Control Flow
| dmlSubIfStmt // Control Flow
| dmlSubWhileStmt // Control Flow
| dmlSubForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| insertStmt // Data Modification
| dmlSubDeleteStmt // Data Modification
| printlnStmt // Output
| logStmt // Output
vAccumFuncCall := vertexAlias "." "@"accumName ("." funcName "(" [argList] ")")+
havingClause := HAVING condition
orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*
limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )
#########################################################
## Control Flow Statements ##
queryBodyIfStmt := IF condition THEN queryBodyStmts
[ELSE IF condition THEN queryBodyStmts ]*
[ELSE queryBodyStmts ] END
dmlSubIfStmt := IF condition THEN dmlSubStmtList
[ELSE IF condition THEN dmlSubStmtList ]*
[ELSE dmlSubStmtList ] END
queryBodyCaseStmt := CASE (WHEN condition THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
| CASE expr (WHEN constant THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
dmlSubCaseStmt := CASE (WHEN condition THEN dmlSubStmtList)+ [ELSE dmlSubStmtList] END
| CASE expr (WHEN constant THEN dmlSubStmtList)+ [ELSE dmlSubStmtList] END
queryBodyWhileStmt := WHILE condition [LIMIT simpleSize] DO queryBodyStmts END
dmlSubWhileStmt := WHILE condition [LIMIT simpleSize] DO dmlSubStmtList END
simpleSize := integer | varName | paramName
queryBodyForEachStmt := FOREACH forEachControl DO queryBodyStmts END
dmlSubForEachStmt := FOREACH forEachControl DO dmlSubStmtList END
forEachControl := ( iterationVar | "(" keyVar (, valueVar)+ ")") (IN | ":") setBagExpr
| iterationVar IN RANGE "[" expr , expr"]" [".STEP(" expr ")"]
iterationVar := name
keyVar := name
valueVar := name
#########################################################
## Other Data Modifications Statements ##
queryBodyDeleteStmt := DELETE alias FROM pattern [whereClause]
dmlSubDeleteStmt := DELETE "(" alias ")"
updateStmt := UPDATE alias FROM pattern SET dmlSubStmtList [whereClause]
insertStmt := insertVertexStmt | insertEdgeStmt
insertVertexStmt := INSERT INTO (vertexType | name)
["(" PRIMARY_ID ["," attrName]* ")"]
VALUES "(" ( "_" | expr ) ["," ("_" | expr)]*] ")"
insertEdgeStmt := INSERT INTO (edgeType | EDGE name)
["(" FROM "," TO ["," attrName]* ")"]
VALUES "(" ( "_" | expr ) [vertexType]
["," ( "_" | expr ) [vertexType] ["," ("_" | expr)]*] ")"
#########################################################
## Output Statements ##
printStmt := PRINT printExpr {, printExpr} [WHERE condition] [TO_CSV (filePath | fileVar)]
printExpr := (expr | vExprSet) [ AS jsonKey]
vExprSet := expr "[" vSetProj {, vSetProj} "]"
vSetProj := expr [ AS jsonKey]
jsonKey := name
printlnStmt := fileVar".println" "(" expr {, expr} ")"
logStmt := LOG "(" condition "," argList ")"
returnStmt := RETURN expr
#########################################################
## Exception Statements ##
declExceptStmt := EXCEPTION exceptVarName "(" errorInt ")"
exceptVarName := name
errorInt := integer
raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"
tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts