Data Types

This section describes the data types that are native to and are supported by the GSQL Query Language. Most of the data objects used in queries come from one of three sources:

The query’s input parameters
The vertices, edges, and their attributes which are encountered when traversing the graph
The variables defined within the query to assist in the computational work of the query

This section covers the following subset of the EBNF language definitions:

EBNF for Data Types

lowercase          := [a-z]
uppercase          := [A-Z]
letter             := lowercase | uppercase
digit              := [0-9]
integer            := ["-"]digit+
real               := ["-"]("."digit+) | ["-"](digit+"."digit*)
numeric            := integer | real
stringLiteral      := '"' [~["] | '\\' ('"' | '\\')]* '"'

name := (letter | "_") [letter | digit | "_"]*   // Can be a single "_" or start with "_"
graphName := name
queryName := name
paramName := name
vertexType := name
edgeType := name
accumName := name
vertexSetName := name
attrName := name
varName := name
tupleType := name
fieldName :=name
funcName := name

type := baseType | tupleType | accumType

baseType := INT
          | UINT
          | FLOAT
          | DOUBLE
          | STRING
          | BOOL
          | VERTEX ["<" vertexType ">"]
          | EDGE
          | JSONOBJECT
          | JSONARRAY
          | DATETIME

filePath := paramName | stringLiteral

typedef := TYPEDEF TUPLE "<" tupleFields ">" tupleType

tupleFields := (baseType fieldName) | (fieldName baseType)
           ["," (baseType fieldName) | (fieldName baseType)]*

createVirtualEdge := CREATE DIRECTED|UNDIRECTED VIRTUAL EDGE Virtual_Edge_Type_Name "("
    FROM Vertex_Type_Name ("|" Vertex_Type_Name)* ","
    TO Vertex_Type_Name ("|" Vertex_Type_Name)*
    ["," attribute_name type [DEFAULT default_value]]* ")"

parameterType := baseType
               | [ SET | BAG ] "<" baseType ">"
               | FILE

The legacy data type STRING COMPRESS has been deprecated since TigerGraph Server 3.0. You can no longer create new schema with the type STRING COMPRESS. As such, we have removed STRING COMPRESS from the documentation.

Existing schemas that are using STRING COMPRESS can continue to function normally. If you need reference for STRING COMPRESS, please refer to GSQL Language Reference version 3.5 or older.

Identifiers

An identifier is the name for an instance of a language element. In the GSQL query language, identifiers are used to name elements such as a query, a variable, or a user-defined function. In the EBNF syntax, an identifier is referred as name. It can be a sequence of letters, digits, or underscores ("_"). Other punctuation characters are not supported. The initial character can only be a letter or an underscore.

name (identifier)

name := (letter | "_") [letter | digit | "_"]*

Overview

Different types of data can be used in different contexts. The EBNF syntax defines several classes of data types. The most basic is called base type (baseType). The other independent type is FILE. The remaining types are either compound data types built from the independent data types, or supersets of other types. The table below gives an overview of their definitions and their uses.

EBNF term Description Use Case

EBNF term	Description	Use Case
`baseType`	`INT`, `UINT`, `FLOAT`, `DOUBLE`, `STRING`, `BOOL`, `DATETIME,` `VERTEX`, `EDGE`, + `JSONOBJECT`, or `JSONARRAY`	Base type variable Query return value
`containerType`	`SET`, `BAG`, `LIST`, `MAP`	Local container variable
`tupleType`	Sequence of base types	User-defined tuple
`accumType`	Family of specialized data objects which support accumulation operations	Accumulate and aggregate data, when traversing a set of vertices or edges (Details are in the Query Lang Spec - Accumulators chapter.)
`FILE`	`FILE` object	Global sequential data object, linked to a text file
`parameterType`	`baseType` (except `EDGE` or `JSONOBJECT`), a set or bag of `baseType`, or `FILE` object	Query parameter
`elementType`	`baseType`, or identifier	Element for most types of container accumulators: `SetAccum`, `BagAccum`, `GroupByAccum`, key of a `MapAccum` element
`type`	`baseType`, identifier, or `accumType`	Element of a `ListAccum`, value of a `MapAccum` element Local variable

baseType

INT, UINT, FLOAT, DOUBLE, STRING, BOOL, DATETIME, VERTEX, EDGE,
+ JSONOBJECT, or JSONARRAY

Base type variable
Query return value

containerType

SET, BAG, LIST, MAP

Local container variable

tupleType

Sequence of base types

User-defined tuple

accumType

Family of specialized data objects which support accumulation operations

Accumulate and aggregate data, when traversing a set of vertices or edges

(Details are in the Query Lang Spec - Accumulators chapter.)

FILE

FILE object

Global sequential data object, linked to a text file

parameterType

baseType (except EDGE or JSONOBJECT), a set or bag of baseType, or FILE object

Query parameter

elementType

baseType, or identifier

Element for most types of container accumulators: SetAccum, BagAccum, GroupByAccum, key of a MapAccum element

type

baseType, identifier, or accumType

Element of a ListAccum, value of a MapAccum element
Local variable

Base types

The query language supports the following base types, which can be declared and assigned anywhere within their scope. Any of these base types may be used when defining a global variable, a local variable, a query return value, a parameter, part of a tuple, or an element of a container accumulator. Accumulators are described in detail in a later section.

EBNF

baseType := INT
          | UINT
          | FLOAT
          | DOUBLE
          | STRING
          | BOOL
          | VERTEX ["<" vertexType ">"]
          | EDGE
          | JSONOBJECT
          | JSONARRAY
          | DATETIME

The default value of each base type is shown in the table below. The default value is the initial value of a base type variable (see Section "Variable Types" for more details), or the default return value for some functions.

The first seven types (INT, UINT, FLOAT, DOUBLE, BOOL, STRING, and DATETIME) are the same ones mentioned in the "Attribute Data Types" section of GSQL Language Reference, Part 1.

Type Default value Literal

Type	Default value	Literal
`INT`	`0`	A signed integer: `-3`
`UINT`	`0`	An unsigned integer: `5`
`FLOAT`	`0`	A decimal precise up to about 7 digits: `3.14159` NOTE: For large numbers, `DOUBLE` is better.
`DOUBLE`	`0`	A decimal precise up to about 16 digits.
`BOOL`	`false`	`TRUE` or `FALSE`
`STRING`	`""`	Characters enclosed by double quotes: `"Hello"`
`DATETIME`	`1970-01-01 00:00:00`	No literal. Can be converted from a correctly formatted string with `to_datetime()`.
`VERTEX`	`"Unknown"`	No literal.
`EDGE`	No edge: `{}`	No literal.
`JSONOBJECT`	An empty object: `{}`	No literal. Can be converted from a correctly formatted string with `parse_json_object()`.
`JSONARRAY`	An empty array: `[]`	No literal. Can be converted from a correctly formatted string with `parse_json_array()`.

INT

0

A signed integer: -3

UINT

0

An unsigned integer: 5

FLOAT

0

A decimal precise up to about 7 digits: 3.14159

NOTE: For large numbers, DOUBLE is better.

DOUBLE

0

A decimal precise up to about 16 digits.

BOOL

false

TRUE or FALSE

STRING

""

Characters enclosed by double quotes: "Hello"

DATETIME

1970-01-01 00:00:00

No literal. Can be converted from a correctly formatted string with to_datetime().

VERTEX

"Unknown"

No literal.

EDGE

No edge: {}

No literal.

JSONOBJECT

An empty object: {}

No literal. Can be converted from a correctly formatted string with parse_json_object().

JSONARRAY

An empty array: []

No literal. Can be converted from a correctly formatted string with parse_json_array().

The FLOAT and DOUBLE data types have the same precision limits as in C++. See Common Errors and Problems for more details.

Support for fixed and exponential notation

FLOAT and DOUBLE input values must be in fixed point d.dddd format, where d is a digit. Output values will be printed in either fixed point or exponential notation, whichever is more compact.

The GSQL Loader can read FLOAT and DOUBLE values with exponential notation (e.g., 1.25 E-7).

Vertex

VERTEX is considered a base type in the GSQL query language. Both query parameters and variables in a query body can be of type VERTEX.

Vertex types

A graph schema defines specific vertex types. Each vertex type has its own set of attributes. The parameter or variable type can be restricted by giving the vertex type in angle brackets <> after the keyword VERTEX. A vertex variable declared without a specifier is called a generic vertex variable.

Generic and typed vertex variables

VERTEX anyVertex;
VERTEX<person> owner;

All vertices have a built-in attribute type. The built-in attribute is of type string. You can access it with the dot (.) operator.

For example, if you declare a vertex variable VERTEX<person> personVertex, then personVertex.type returns "person".

When you run a query with a vertex type, the syntax is different for typed vertex parameters and generic vertex parameters. See the RUN QUERY parameter list for details.

Edge

EDGE is considered a base type in the GSQL query language. Both query parameters and variables in a query body can be of type EDGE.

Edge types

A graph schema defines specific edge types. Each edge type has its own set of attributes. The parameter or variable type can be restricted by giving the edge type in angle brackets <> after the keyword EDGE. An edge variable declared without a specifier is called a generic edge variable.

Generic and typed edge variables

EDGE anyEdge;
EDGE<friendship> friendEdge;

All edges have a built-in attribute type. The built-in attribute is of type string. You can access it with the dot (.) operator.

For example, if you define an edge variable EDGE<friendship> friendEdge, then friendEdge.type returns "Friendship".

Vertex and Edge Attribute Types

The following table maps vertex or edge attribute types in the Data Definition Language (DDL) to GSQL query language types. If an attribute of a vertex or edge is referenced in a GSQL query, they will be automatically converted to their corresponding data type in the GSQL query language.

DDL GSQL Query

DDL	GSQL Query
`INT`	`INT`
`UINT`	`UINT`
`FLOAT`	`FLOAT`
`DOUBLE`	`DOUBLE`
`BOOL`	`BOOL`
`STRING`	`STRING`
`SET< type >`	`SetAccum< type >`
`LIST< type >`	`ListAccum< type >`
`MAP <key_type, value_type>`	`MapAccum <key_type, value_type>`
`DATETIME`	`DATETIME`

INT

UINT

FLOAT

DOUBLE

BOOL

STRING

SET< type >

SetAccum< type >

LIST< type >

ListAccum< type >

MAP <key_type, value_type>

MapAccum <key_type, value_type>

DATETIME

`JSONOBJECT`

A JSONOBJECT instance’s external representation (as input and output) is a string, starting and ending with curly braces ({}) which enclose an unordered list of key-value pairs. JSONOBJECT is immutable. No operator is allowed to alter its value.

Assign a `JSONOBJECT` value to a base type variable

There is no JSONOBJECT literal in the GSQL query language. Therefore, to assign value to a JSONOBJECT base type variable in GSQL, you need to use the parse_json_object() function to convert a string representation of s JSON object to a JSONOBJECT value.

For example, the following line declares a variable han with a JSONOBJECT value:

JSONOBJECT han = parse_json_object("{\"f_name\": \"Han\", \"l_name\": \"Solo\"}");

`JSONARRAY`

A JSONARRAY is represented as a string, starting and ending with square brackets ([])which enclose an ordered list of values. JSONARRAY is immutable. No operator is allowed to alter its value.

Assign a `JSONARRAY` value to a base type variable

There is no JSONARRAY literal in the GSQL query language. Therefore, to assign value to a JSONARRAY base type variable in GSQL, you need to use the parse_json_array() function to convert a string representation of s JSON object to a JSONARRAY value.

For example, the following line declares a variable fruits with a JSONARRAY value:

JSONARRAY fruits = parse_json_array("[\"apple\", \"banana\", \"citrus\"]");

Container types

Container types include the following data types:

SET
BAG
LIST
MAP

Table 1. Container types
Type	Literal	Example	Default value
`SET`	Elements enclosed by parentheses, separated by commas.	`(1, 2, 3)`	Empty set.
`BAG`	Elements enclosed by parentheses, separated by commas.	`(1, 1, 2)`	Empty bag.
`LIST`	Elements enclosed by square brackets, separated by commas.	`[1, 2, 3]`	Empty list.
`MAP`	No literal.	N/A	Empty map.

To see how container type variables are declared in a GSQL query, see Local container variables.

Tuple

A tuple is a user-defined data structure consisting of a fixed sequence of base type variables. Tuple types can be created and named using a TYPEDEF statement. Tuples must be defined first, before any other statements in a query.

EBNF for tuples

typedef := TYPEDEF TUPLE "<" tupleFields ">" tupleType

tupleFields := (baseType fieldName) | (fieldName baseType)
           ["," (baseType fieldName) | (fieldName baseType)]*

A tuple can also be defined in a graph schema and then can be used as a vertex or edge attribute type. A tuple type that has been defined in the graph schema does not need to be re-defined in a query.

The vertex type person contains two complex attributes:

secretInfo of type SECRET_INFO, which a user-defined tuple
portfolio of type MAP<STRING, DOUBLE>

Investment_Net Schema

TYPEDEF TUPLE <age UINT (4), mothers_name STRING(20) > SECRET_INFO

CREATE VERTEX Person(PRIMARY_ID person_id STRING, portfolio MAP<STRING, DOUBLE>, secret_info SECRET_INFO)

CREATE VERTEX Stock_Order(PRIMARY_ID order_id STRING, ticker STRING, order_size UINT, price FLOAT)

CREATE UNDIRECTED EDGE make_order(FROM Person, TO Stock_Order, order_time DATETIME)

CREATE GRAPH Investment_Net (*)

The query below reads both the SECRET_INFO tuple and the portfolio MAP. The tuple type does not need to be redefined in the query. To read and save the map, we define a MapAccum with the same types for key and value as the portfolio attribute. In addition, the query creates a new tuple type, ORDER_RECORD.

Query
Result

CREATE QUERY tuple_ex(VERTEX<Person> p) FOR GRAPH Investment_Net{

    TYPEDEF TUPLE <STRING ticker, FLOAT price, DATETIME order_time> Order_Record; (1)

    SetAccum<Secret_Info> @@info; (2)
    ListAccum<Order_Record> @@order_records;
    MapAccum<STRING, DOUBLE> @@portf;

    INIT = {p};

    // Get person p's secret_info and portfolio
    X = SELECT v FROM INIT:v
        ACCUM @@portf += v.portfolio, @@info += v.secret_info;

    // Search person p's orders to record ticker, price, and order time.
    // Note that the tuple gathers info from both edges and vertices.
    orders = SELECT t
        FROM INIT:s -(Make_Order:e)-Stock_Order:t
        ACCUM @@order_records += Order_Record(t.ticker, t.price, e.order_time);

    PRINT @@portf, @@info;
    PRINT @@order_records;
}

1	This statement defines a new tuple `Order_Record` at the top of the query.
2	`Secret_Info` has already been defined in `Investment_Net` Schema.

GSQL > RUN QUERY tuple_ex("person1")
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [
    {
      "@@info": [{
        "mothers_name": "JAMES",
        "age": 25
      }],
      "@@portf": {
        "AAPL": 3142.24,
        "MS": 5000,
        "G": 6112.23
      }
    },
    {"@@order_records": [
      {
        "ticker": "B",
        "price": 202.32001,
        "order_time": "2017-03-03 18:42:30"
      },
      {
        "ticker": "AAPL",
        "price": 34.42,
        "order_time": "2017-03-03 18:42:28"
      },
      {
        "ticker": "A",
        "price": 50.55,
        "order_time": "2017-03-03 18:42:29"
      }
    ]}
  ]
}

Virtual Edge

In the graph schema, while regular edges facilitate CRUD operations, traversing multiple steps between vertices can be cumbersome.

To streamline this process, we are introducing the Virtual Edge feature, which are in memory edges that can be created during query runtime, simplifying traversal and enabling predicate application across non-adjacent vertices.

EBNF for creating virtual edge

createVirtualEdge := CREATE DIRECTED|UNDIRECTED VIRTUAL EDGE Virtual_Edge_Type_Name "("
    FROM Vertex_Type_Name ("|" Vertex_Type_Name)* ","
    TO Vertex_Type_Name ("|" Vertex_Type_Name)*
    ["," attribute_name type [DEFAULT default_value]]* ")"

Example

CREATE OR REPLACE DISTRIBUTED QUERY exampleQuery(){
  CREATE DIRECTED VIRTUAL EDGE virtualE1(from v1, to v2, attr1 int);

  CREATE DIRECTED VIRTUAL EDGE virtualE2(from v1|v3|v5, to v2|v4|v6, attr1 int, attr2 float);
  ...
}

Definition scope

Virtual edges can only be defined in the global scope of a query, meaning they must be declared at the top level of the query body. They cannot be nested within local statement such as IF, ELSE, FOREACH, select-block etc. The definition of a virtual edge is visible and valid throughout the entire query.

Access, Manipulation and Traversal

In general, virtual edge is very similar to regular edge, meaning we can update, upsert, access, delete virtual edge just like regular edge.

Query
Result

CREATE OR REPLACE DISTRIBUTED QUERY virtualEdgeExample(){

  // First we create a virtual edge type
  CREATE DIRECTED VIRTUAL EDGE VirtualE1(FROM v1, TO v2, creationDate DATETIME);

  P = SELECT s FROM v1:s -(e1>:e)- :t
         ACCUM
	      // here we insert virtual edges
            INSERT INTO VirtualE1 VALUES(s, t, to_datetime("2024-01-01"));

  SumAccum<INT> @@cntVE1;
  MinAccum<DATETIME> @@minDatetime;
  P = SELECT s FROM v1:s -(VirtualE1>:e)- :t
         ACCUM
		// here we traverse virtual edges just like regular edge
            @@cntVE1 += 1,
            @@minDatetime += e.creationDate;

  PRINT @@cntVE1;
  PRINT @@minDatetime;

  P = SELECT s FROM v1:s -(VirtualE1>:e)- :t
         ACCUM
		// here we delete the virtual edges
            DELETE(e);

  SumAccum<INT> @@cntVE2;
  P = SELECT s FROM v1:s -(VirtualE1>)- :t
         ACCUM
		// here we sum up the number of virtual edges, it should be 0
            @@cntVE2 += 1;
  PRINT @@cntVE2;
}

This query creates a virtual edge type VirtualE1, insert edge, access, and delete just like regular edge

GSQL > RUN QUERY virtualEdgeExample()
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 1
  },
  "error": false,
  "message": "",
  "results": [
    {
      "@@cntVE1": 3
    },
    {
      "@@minDatetime": "2024-01-01 00:00:00"
    },
    {
      "@@cntVE2": 0
    }
  ]
}

Limitation

It is only supported to create virtual edge type in the global scope of query, creating virtual edge type inside statements are not allowed, for example below is an incorrect usage

CREATE OR REPLACE DISTRIBUTED QUERY exampleQuery(Bool condition){

  IF Condition THEN
	// we are creating virtual edge inside if statement, which is not allowed
	CREATE DIRECTED VIRTUAL EDGE VirtualE1(FROM Person, TO Post|Comment, creationDate DATETIME);

	...
  END;
}

Only GPR v2 syntax support virtual edge, while UDF does not.
Virtual Edge does not support multi edge, i.e. no DISCRIMINATOR attribute can be added in virtual edge
Virtual Edge types cannot be used together with other virtual edge or regular edges in edge traversal, for example below is an incorrect usage

R = select s from Person:s -((VirtualE1>|VirtualE2>):e)- :t
    accum
      case e.type
        when "VirtualE1" then @@results1 += Results01(e._a, e._b, e._c, e._d)
        when "VirtualE2" then @@results2 += Results02(e._a, e._b)
      end
    ;

Does not support virtual edge with getAttr(), setAttr() functions
Cannot aggregate virtual edge into edge containers, for example below is an incorrect usage

ListAccum<edge> @@res;
R = SELECT p1 FROM person:p1 -(FOF:e)- person:p2
     ACCUM
       @@res += e;

Does not support user defined tuple as attribute type
Does not support update and access attributes of same Virtual Edge type in the same ACCUM clause.
- Correct example

// we are reading the attribute of virtual edge in ACCUM clause, and
// insert(update) virtual edge in POST-ACCUM caluse, which is allowed

W = select s from Src:s - (VE:e) - Tgt:t
        ACCUM @@cnt +=e.attr1;
	 POST-ACCUM INSERT INTO VE values(s, t);

Incorrect example

// we are reading the attribute of virtual edge and also updating it in
// the same ACCUM clause, which is not allowed

W = select s from Src:s - (VE:e) - Tgt:t
        ACCUM @@cnt +=e.attr1,
 e.attr1 += 1;

`FILE` Object

A FILE object is a sequential data storage object, associated with a text file on the local machine or with an S3 bucket.

When referring to a FILE object, we always capitalize the word FILE to distinguish it from ordinary files.

Local disk file

When a FILE object is declared, associated with a particular text file, any existing content in the text file will be erased. During the execution of the query, content written to the FILE will be appended to the FILE. When the query where the FILE was declared finishes running, the FILE contents are saved to the text file.

S3 object

Define s3 file object

The path should start with s3://, followed by the bucket name and the S3 path, e.g., s3://bucket-name/queryoutput/output.csv. During the execution of the query, content will be uploaded to the S3 bucket. Note that the S3 object cannot be modified or appended, if an S3 object with the same path already exists, it will be overwritten.

Set S3 connection credentials

The S3 credentials can be set as GSQL session parameters, so they persist for a user for a full session.

set s3_aws_access_key_id = <AWS_KEY_ID>;
set s3_aws_secret_access_key = <AWS_ACCESS_KEY>;

These session parameters should be set within the GSQL Editor to enable read/write access to the specified S3 bucket for query results. Replace <AWS_KEY_ID> and <AWS_ACCESS_KEY> with your actual AWS credentials.

Output

Since S3 is a shared storage system, multiple nodes in a cluster can upload to the same S3 bucket. To handle potential conflicts and ensure unique output files, the S3 path can include a suffix based on the instance name, such as _GPE_{PartitionId}_{ReplicaId}. For distributed queries, additional suffixes will be used to differentiate between the manager and worker roles on the same GPE. Specifically, suffixes like _coordinator and _worker will be added, where _coordinator refers to the worker manager and _worker refers to the worker node.

Error code

For S3 bucket connection errors, refer to error code GSQL-5301.

A FILE object can be passed as a parameter to another query. When a query receives a FILE object as a parameter, for a file on the local machine, it can append data to that FILE, as can every other query which receives this FILE object as a parameter. However, an S3 bucket FILE object cannot be appended to. When you write to an S3 path, any existing object will be overwritten.

Query parameter types

A query can have one or more input parameters having any of the following types:

Base types (except EDGE , JSONARRAY, or JSONOBJECT).
FILE object
A SET or BAG of base type elements (except EDGE , JSONARRAY, or JSONOBJECT).
As of 4.2, LIST and MAP of base type elements are also supported. The key and value of a MAP may be primitiveType (numerical, string, or Boolean).

Within the query, SET, BAG, LIST, and MAP parameters are converted to SetAccum, BagAccum, ListAccum, and MapAccum, respectively.

A query parameter is immutable. It cannot be assigned a new value within the query.

The FILE object is a special case. It is passed by reference, meaning that the receiving query gets a link to the original FILE object. The receiving query can write to the FILE object.