Advanced Features

This section covers advanced features of pattern matching: the PER clause, which gives fine-grained control over ACCUM execution; DML support in pattern matching; and the conjunctive pattern matching syntax, which allows multiple patterns in one FROM clause. Each topic has its own subsection.

Per Clause (Beta)

Introduction

Pattern matching produces a virtual match table, and the ACCUM clause acts like a FOREACH loop, executing the clause's statement once for each row of the match table.

Patterns are paths in the graph, and each row in the match table is a distinct path. However, paths may share some vertices or edges. Some applications do not want to aggregate per path. Instead, they want to execute the ACCUM clause once per distinct binding of a group of vertex aliases.

For example, consider the following query which counts the number of paths in a simple 2-hop pattern:

SumAccum<int> @@cnt; 

S = SELECT t 
    FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
    ACCUM @@cnt += 1;

Suppose the query produces the following match table.

(s, edge1, m , edge2, t)//match table schema
v1, e1, v3, e2, v2 //match 1 
v1, e3, v4, e4, v2 //match 2 
v5, e5, v6, e6, v7 //match 3
v8, e7, v9, e8, v7 //match 4

By default, the ACCUM clause will execute the @@cnt += 1 statement 4 times, once for each row in the match table. The result will be @@cnt = 4.

For the same query, what if the user wants to

count the number of distinct path endings in the match table? For this case, we would want to iterate on the alias t.
count the number of distinct (start, end) pairs in the match table? For that case, we would want to iterate on distinct pairs of the aliases (s, t).

To provide users with this added flexibility and finer control over ACCUM iteration, TigerGraph 3.0 adds the PER clause to pattern matching (V2) syntax.

Syntax

The PER Clause is an optional clause that comes immediately before the ACCUM clause in a SELECT statement. As illustrated below, it starts with the keyword PER, followed by a pair of parentheses containing one or more distinct vertex aliases found in the FROM pattern.

selectBlock := SELECT alias 
               FROM pattern
               [sampleClause]
               [whereClause]
               [[perClause] accumClause]
               [postAccumClause]*
               [havingClause]
               [orderClause]
               [limitClause]
                    
perClause := PER (vertex_alias_1, vertex_alias_2, ...)

Examples. Below are multiple examples of the PER Clause using the same FROM clause.

S1 = SELECT s 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s) 
     ACCUM @@cnt += 1;
    
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (t)
     ACCUM @@cnt += 1;

S3 = SELECT m 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (m)
     ACCUM @@cnt += 1;
    
S4 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s, t)
     ACCUM @@cnt += 1;
    
S5 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s, m, t)
     ACCUM @@cnt += 1;

Semantics

The PER Clause specifies a list of vertex aliases, which are used to group the rows in the match table, one group per distinct binding of the alias or of the alias list. If there are N distinct groups, the ACCUM clause executes N times, once per distinct binding of the listed vertex aliases. Note that the PER clause does not change the semantics of POST-ACCUM clauses, except that it restricts which vertex aliases they may use.

Suppose s, m, and t are vertex aliases in a pattern. Below are some interpretations of the PER Clause based on the graph element bindings found in the match table.

PER (s) ACCUM means that the ACCUM clause executes once for each distinct s vertex.
PER (s,t) ACCUM means that the ACCUM clause executes once for each distinct (s, t) pair.
PER (s,m,t) ACCUM means that the ACCUM clause executes once for each distinct (s, m, t) tuple.

Examples to show PER clause semantics.

//match table
(s, edge1, m , edge2, t)//schema
v1, e1, v3, e2, v2 //match 1 
v1, e3, v4, e4, v2 //match 2 
v5, e5, v6, e6, v7 //match 3
v8, e7, v9, e8, v7 //match 4

//three distinct vertices (v1, v5, v8) bind to s,
//so the ACCUM clause executes 3 times.
S1 = SELECT s 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s) 
     ACCUM @@cnt += 1;
  
//two distinct vertices (v2, v7) bind to t,
//so the ACCUM clause executes twice.
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (t)
     ACCUM @@cnt += 1;
     
//four distinct vertices (v3, v4, v6, v9) bind to m,
//so the ACCUM clause executes 4 times.
S3 = SELECT m
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (m)
     ACCUM @@cnt += 1;
     
//three distinct vertex pairs (v1, v2), (v5, v7), and (v8, v7) bind to (s,t),
//so the ACCUM clause executes 3 times.
S4 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, t)
     ACCUM @@cnt += 1;

    
//four distinct vertex tuples (v1, v3, v2), (v1, v4, v2), (v5, v6, v7), and
//(v8, v9, v7) bind to (s,m,t), so the ACCUM clause executes 4 times.
S5 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m, t)
     ACCUM @@cnt += 1;
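The grouping behavior above can be modeled outside of GSQL. The following Python sketch is only an illustration of the semantics, not of how the engine executes a query: it hard-codes the example match table and counts the distinct bindings for a given PER alias list. The function name accum_executions is an assumption made for this illustration.

```python
# Match table from the example above; columns follow the schema (s, edge1, m, edge2, t).
SCHEMA = ("s", "edge1", "m", "edge2", "t")
MATCH_TABLE = [
    ("v1", "e1", "v3", "e2", "v2"),  # match 1
    ("v1", "e3", "v4", "e4", "v2"),  # match 2
    ("v5", "e5", "v6", "e6", "v7"),  # match 3
    ("v8", "e7", "v9", "e8", "v7"),  # match 4
]


def accum_executions(per_aliases=None):
    """Return how many times ACCUM would run for the given PER alias list.

    With no PER clause, ACCUM runs once per row of the match table.
    """
    if per_aliases is None:
        return len(MATCH_TABLE)
    columns = [SCHEMA.index(alias) for alias in per_aliases]
    # One group per distinct binding of the listed aliases.
    distinct_bindings = {tuple(row[c] for c in columns) for row in MATCH_TABLE}
    return len(distinct_bindings)


for per in (None, ("s",), ("t",), ("m",), ("s", "t"), ("s", "m", "t")):
    print(per, accum_executions(per))
```

The counts printed for PER (s), PER (t), PER (m), PER (s, t), and PER (s, m, t) are 3, 2, 4, 3, and 4, matching the comments in the GSQL examples; with no PER clause the count is 4, one per row.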

If the PER Clause is used in a SELECT query block, then the vertex aliases used in the SELECT, ACCUM, and POST-ACCUM clauses must be confined to the aliases that appear in the PER clause.

Below are some illegal cases.

//semantic error. SELECT t, but t doesn't appear in PER clause. 
S1 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m)
     ACCUM @@cnt += 1;
     
//semantic error. ACCUM t.@cnt, but t doesn't appear in PER clause.
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m)
     ACCUM t.@cnt += 1;
     
//semantic error. POST-ACCUM t.@cnt, but t doesn't appear in PER clause.
S3 = SELECT s
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s)
     ACCUM s.@cnt += 1
     POST-ACCUM t.@cnt =1;
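The confinement rule can be expressed as a simple set-membership check. The Python sketch below is a hypothetical helper written for this illustration; it is not part of GSQL, and the messages it produces are not the compiler's actual error text.

```python
def check_per_confinement(per_aliases, select_alias,
                          accum_aliases=(), post_accum_aliases=()):
    """Return a list of violations; an empty list means the rule is satisfied."""
    allowed = set(per_aliases)
    errors = []
    if select_alias not in allowed:
        errors.append(f"SELECT uses {select_alias}")
    for alias in accum_aliases:
        if alias not in allowed:
            errors.append(f"ACCUM uses {alias}")
    for alias in post_accum_aliases:
        if alias not in allowed:
            errors.append(f"POST-ACCUM uses {alias}")
    return errors


# The three illegal cases above, followed by one legal block.
print(check_per_confinement(("s", "m"), "t"))
print(check_per_confinement(("s", "m"), "t", accum_aliases=("t",)))
print(check_per_confinement(("s",), "s", accum_aliases=("s",),
                            post_accum_aliases=("t",)))
print(check_per_confinement(("s", "t"), "t", accum_aliases=("s", "t")))
```

Under this check, the second case reports two violations, because both its SELECT clause and its ACCUM clause use t.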

PER Clause Examples

Example 1. Count the number of countries that have a city with a resident who likes a post.

//Example 1. 
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {
  SumAccum<int> @@cnt;

 R =   SELECT c
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       PER    (c)
       ACCUM  @@cnt +=1;

 PRINT @@cnt;
}

//results
Using graph 'ldbc_snb'
The query AA is dropped.
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 111}]
}

Example 2. Count the number of posts liked by a person who is located in a city that belongs to a country. (All cities are in a country, but humor us. We are reusing the same FROM pattern in several examples.)

//Example 2. 
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {
  SumAccum<int> @@cnt;

 R =   SELECT p
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       PER    (p)
       ACCUM  @@cnt +=1;

 PRINT @@cnt;
}

//result
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 70668}]
}

Example 3. For each country in ("Dominican_Republic","Angola", "Cambodia"), find the number of posts that are liked by a person living in that country.

//Example 3
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2{

 MapAccum<string, SumAccum<int>> @@postPerCountry;

 R =   SELECT p
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       WHERE  c.name in  ("Dominican_Republic","Angola", "Cambodia")
       PER    (c, p)
       ACCUM  @@postPerCountry += (c.name -> 1);

 PRINT @@postPerCountry;
}

//results
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@postPerCountry": {
    "Dominican_Republic": 395,
    "Angola": 12,
    "Cambodia": 4002
  }}]
}
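To see why Example 3 groups by PER (c, p), consider a post liked by two residents of the same country. The Python sketch below uses made-up rows (not ldbc_snb data) and assumes that each distinct path contributes one row to the match table. It compares counting per row with counting per distinct (country, post) pair.

```python
from collections import Counter

# Hypothetical match rows (country, person, post); not taken from ldbc_snb.
rows = [
    ("Angola", "alice", "post1"),
    ("Angola", "bob", "post1"),    # the same post, liked by a second resident
    ("Angola", "bob", "post2"),
    ("Cambodia", "chen", "post1"),
]


def posts_per_country_per_row(rows):
    """No PER clause: one increment per match row."""
    return Counter(country for country, _person, _post in rows)


def posts_per_country_per_pair(rows):
    """PER (c, p): one increment per distinct (country, post) pair."""
    pairs = {(country, post) for country, _person, post in rows}
    return Counter(country for country, _post in pairs)


print(posts_per_country_per_row(rows))
print(posts_per_country_per_pair(rows))
```

In this toy data, per-row counting gives Angola a count of 3 because post1 is counted once per liker, while per-pair counting gives 2, the number of distinct liked posts.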

Example 4. For each country in ("Dominican_Republic","Angola", "Cambodia"), find the number of posts that are liked by a person living in that country. Use local accumulators this time.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2{

 SumAccum<int> @postCnt;

 R =   SELECT c
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       WHERE  c.name in  ("Dominican_Republic","Angola", "Cambodia")
       PER    (c, p) //per (country, post), add 1 to c.@postCnt
       ACCUM  c.@postCnt += 1;

 PRINT R;
}

//results
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"R": [
    {
      "v_id": "2",
      "attributes": {
        "@postCnt": 12,
        "name": "Angola",
        "id": 2,
        "url": "http://dbpedia.org/resource/Angola"
      },
      "v_type": "Country"
    },
    {
      "v_id": "67",
      "attributes": {
        "@postCnt": 4002,
        "name": "Cambodia",
        "id": 67,
        "url": "http://dbpedia.org/resource/Cambodia"
      },
      "v_type": "Country"
    },
    {
      "v_id": "11",
      "attributes": {
        "@postCnt": 395,
        "name": "Dominican_Republic",
        "id": 11,
        "url": "http://dbpedia.org/resource/Dominican_Republic"
      },
      "v_type": "Country"
    }
  ]}]
}

Performance and Best Practices

The PER Clause not only lets users control the semantics of the ACCUM clause, it also improves the performance of pattern matching queries, because GSQL uses the PER clause to optimize query execution.

To get the best performance, we recommend three guidelines for writing efficient queries.

Use PER (target) If Possible

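PER (target) is generally faster than PER (source). In the example below, query q2 is faster than q1. The only difference between the two queries is that q2's FROM pattern is q1's FROM pattern reversed, so the PER alias c is the target (rightmost) vertex in q2 rather than the source (leftmost) vertex.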

USE GRAPH ldbc_snb

# not recommended, since it does per (src).
CREATE QUERY q1 () SYNTAX v2 {

  SumAccum<int> @@cnt ;

  T = SELECT c
      FROM Comment:c - (<LIKES) - Person:ps - (IS_LOCATED_IN>) - City:city
      WHERE year(c.creationDate) >= 2006
      PER (c)
      ACCUM @@cnt += 1;

  PRINT @@cnt;
}

# recommended, since it does per (tgt)
CREATE QUERY q2 () SYNTAX v2 {

  SumAccum<int> @@cnt ;

  T = SELECT c
      FROM City:city - (<IS_LOCATED_IN) - Person:ps - (LIKES>) - Comment:c
      WHERE year(c.creationDate) >= 2006
      PER (c)
      ACCUM @@cnt += 1;

  PRINT @@cnt;
}

Write Patterns with Smallest Expected Vertex Set on the Left

The match table is built by traversing the pattern from left to right. Follow the basic principle of pruning early rather than late by orienting the query with the smaller-cardinality set on the left. This practice produces the fewest candidate matches during query computation. For example, if there are fewer distinct tags than persons, then query q4 is faster than q3.

USE GRAPH ldbc_snb

# not recommended, since the pattern starts from a large cardinality vertex type
# (Person), and ends at a small cardinality vertex type (Tag). 
CREATE QUERY q3 () SYNTAX v2 {

  SumAccum<int> @cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM s.@cnt += 1;

  PRINT V.size();
}

# recommended, start from small cardinality end (Tag), and use per tgt
CREATE QUERY q4 () SYNTAX v2 {

  SumAccum<int> @cnt;

  V = SELECT s
      FROM Tag:t-(<HAS_TAG)-Forum:f -(CONTAINER_OF>)-Post:p  - (<LIKES)- Person:s
      PER (s)
      ACCUM s.@cnt += 1;

  PRINT V.size();
}

Specify Complete Type Information

Specifying complete type information improves performance. For example, query q6 is faster than q5 even though the two are logically identical. Because CONTAINER_OF connects a Forum to a Post, the type of :f can be inferred and does not need to be specified in q5, but explicitly stating Forum in q6 improves performance.

USE GRAPH ldbc_snb

#not recommended: we do not put Forum before :f, even though we know it.
CREATE QUERY q5 () SYNTAX v2 {

  SumAccum<int> @@person_cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM @@person_cnt += 1;

  PRINT @@person_cnt;
}


#recommended: we put Forum as the type info. 
CREATE QUERY q6 () SYNTAX v2 {

  SumAccum<int> @@person_cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-Forum:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM @@person_cnt += 1;

  PRINT @@person_cnt;
}

LDBC Benchmark Queries

Using the PER clause and linear regular path patterns, we have translated all of the LDBC-SNB queries. You can find them on GitHub at https://github.com/tigergraph/ecosys/tree/ldbc/ldbc_benchmark/tigergraph/queries_linear/queries. Most of the queries are installed as functions. You can find sample parameters for the functions at https://github.com/tigergraph/ecosys/tree/ldbc/ldbc_benchmark/tigergraph/queries/seeds.

Data Modification

Pattern Matching GSQL supports Insert, Update, and Delete operations. The syntax is identical to that in classic GSQL (v1), though the full range of data modification operations is not yet supported.

In general, data modification can be at two levels in GSQL:

Top level. The statement does not need to be within any other statement.
Within a SELECT query statement. The FROM-WHERE clauses define a match table, and the data modification is performed based on the vertex and edge information in the match table. The GSQL specification calls these within-SELECT statements DML-sub statements.

Insert, Update, and Delete currently work in compiled mode only (i.e., you must run INSTALL QUERY before RUN QUERY). Data modification in interpreted mode is not yet available.
SELECT queries with data modification may only have one POST-ACCUM clause.

Insert vertices and edges

Pattern matching Insert is supported at both the top-level and within-SELECT levels, using the same syntax as in classic GSQL. You can insert vertices and edges.

Example 1. Create a Person vertex, whose name is Tiger Woods. Next, find Viktor's favorite 2012 posts' authors, whose last name is prefixed with S. Finally, insert KNOWS edges connecting Tiger Woods with Viktor's favorite authors.

You can verify the result by running a simple built-in REST endpoint.

Check the inserted vertex.

Check the inserted edges.

Update data

To update vertex attributes, use assignment statements in a POST-ACCUM clause. To update edge attributes, use assignment statements in an ACCUM clause. In addition, data updates can only be performed if the FROM clause contains only a single-hop, fixed-length pattern.

Query-body level UPDATE statements are not yet supported in syntax v2.

Example 2. For all KNOWS edges that connect Viktor Akhiezer and his friends whose lastName begins with "S", update the edge creationDate to "2020-10-01". Also, for the Person vertex (Tiger Woods), update the vertex's creationDate and the language he speaks.

To verify the update, we can use REST calls.

Check Tiger Woods' creationDate and language he speaks.

Check KNOWS edges whose source is Tiger Woods.

Delete vertices and edges

You can use the DELETE() function to delete edges and vertices in ACCUM and POST-ACCUM clauses.

Top-level DELETE statements are not yet supported in SYNTAX v2.
Edges can only be deleted in the ACCUM clause.
For best performance, vertices should be deleted in the POST-ACCUM clause.
To perform within-SELECT deletes, the FROM pattern can only be a single-hop, fixed-length pattern.

Example 3. Delete vertex Tiger Woods and its KNOWS edges.

To verify the result, you can use built-in REST calls.

Conjunctive Pattern Matching (Beta)

What is Conjunctive Pattern Matching?

So far, we have described pattern matching as one path pattern in a FROM clause. In this section, we introduce GSQL's capability to match multiple patterns in one FROM clause. This extension is called Conjunctive Pattern Matching (CPM), because the query asks for the conjunction (logical AND) of the patterns. To get a match, all of the patterns must be satisfied, and the patterns can interrelate. Visually, you can think of patterns formed by a set of intersecting line segments. This feature, introduced as a Beta feature in TigerGraph 3.0, enables you to express complex patterns concisely in a single query block.

In general, a CPM query block consists of multiple patterns in the FROM clause. It has a structure illustrated below.

# Conjunctive Pattern Matching Syntax

SelectBlock := SELECT alias 
               FROM pattern
               [sampleClause]
               [whereClause]
               [ [perClause] accumClause]
               [postAccumClause]*
                    ...
                    
pattern := vertexPattern | edgePattern | (pathPattern ["," pathPattern]*)
# vertexPattern and edgePattern are from classic GSQL

We elaborate on each clause below.

SELECT Clause

The SELECT clause selects only one vertex alias from all the patterns in the FROM clause.

FROM Clause - Conjunctive Matching

This is where the conjunctive matching is expressed. The FROM clause consists of a list of path patterns, which are separated by commas. Evaluating each pattern against the underlying graph data produces a match table. If two patterns share a vertex alias, then we form the natural join of the two match tables.

For example, consider this CPM:

FROM X:x - (E1:e1) - Y:y - (E2>:e2) - Z:z,
     Z:z - (E3:e3) - U:u - (E4>:e4) - V:v

The first pattern's variables are x, e1, y, e2, and z; the second pattern's variables are z, e3, u, e4, and v. Considering the two patterns independently would yield the following match table schemas:

#first match table
(x, e1, y, e2, z)
#second match table
(z, e3, u, e4, v)

Natural Join of Match Tables

Natural joining two match tables compares all the shared vertex aliases between the two tables, and the resulting joined table contains all non-shared variables plus one copy of each of the shared vertex variables. Here is the match table for the CPM above:

#natural join result; the shared vertex variable z appears once.
(x, e1, y, e2, z, e3, u, e4, v)
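As a rough illustration of the natural join described above, the following Python sketch joins two small, made-up match tables on the columns they share. It models the result only; the function name natural_join and the sample rows are assumptions for this example and say nothing about how the GSQL engine evaluates a CPM.

```python
def natural_join(schema_a, rows_a, schema_b, rows_b):
    """Join two match tables on every column name they share."""
    shared = [col for col in schema_a if col in schema_b]
    extra_b = [col for col in schema_b if col not in shared]
    # Shared columns appear once, in the position they have in the first table.
    joined_schema = tuple(schema_a) + tuple(extra_b)
    joined_rows = []
    for row_a in rows_a:
        binding_a = dict(zip(schema_a, row_a))
        for row_b in rows_b:
            binding_b = dict(zip(schema_b, row_b))
            if all(binding_a[col] == binding_b[col] for col in shared):
                joined_rows.append(row_a + tuple(binding_b[col] for col in extra_b))
    return joined_schema, joined_rows


first_schema = ("x", "e1", "y", "e2", "z")
first_rows = [
    ("x1", "e1a", "y1", "e2a", "z1"),
    ("x2", "e1b", "y2", "e2b", "z2"),
]
second_schema = ("z", "e3", "u", "e4", "v")
second_rows = [
    ("z1", "e3a", "u1", "e4a", "v1"),
    ("z1", "e3b", "u2", "e4b", "v2"),
    ("z3", "e3c", "u3", "e4c", "v3"),
]

schema, rows = natural_join(first_schema, first_rows, second_schema, second_rows)
print(schema)
for row in rows:
    print(row)
```

Only rows that agree on z survive the join: the first row of the first table pairs with two rows of the second table, and the row ending in z2 has no partner and is dropped.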

Valid Conjunctive Patterns

The match table of the conjunctive pattern match is the natural join of all the patterns' match tables. By design, a row in the CPM match table must simultaneously satisfy all of the patterns.

If the match tables of the patterns in a FROM clause can be naturally joined into one match table, then the FROM clause has a valid CPM input. Otherwise, the FROM clause has an invalid pattern input list.

For example, below we show two valid CPM inputs and one invalid CPM input.

# a valid CPM, since the two patterns naturally join on :tgt
SELECT
FROM Person:p - (KNOWS) - :tgt, 
     Post:s -(<LIKES) - :tgt 

# a valid CPM, since the two patterns naturally join on :f
SELECT
FROM Person:p - (KNOWS) - :f - (LIKES>) - Post:tgt, 
     :f - (LIKES>) - Comment:c 

# an invalid CPM, since the two patterns do not share any vertex variables.
# they cannot be naturally joined. 
# One pattern has (p, tgt); the other has (s, t). 
SELECT
FROM Person:p - (KNOWS) - :tgt, 
     Post:s - (<LIKES) - Person:t
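One way to reason about validity is to check whether the patterns are connected through shared vertex aliases. The Python sketch below assumes that "can be naturally joined" means exactly this kind of connectivity; the actual compiler may apply additional rules. The function name is_valid_cpm is hypothetical.

```python
def is_valid_cpm(pattern_aliases):
    """Return True if all patterns are linked through shared vertex aliases.

    pattern_aliases: a list with one set of vertex aliases per path pattern.
    """
    if not pattern_aliases:
        return False
    reached = set(pattern_aliases[0])
    remaining = [set(aliases) for aliases in pattern_aliases[1:]]
    progress = True
    while remaining and progress:
        progress = False
        for aliases in list(remaining):
            if aliases & reached:          # shares an alias with the joined part
                reached |= aliases
                remaining.remove(aliases)
                progress = True
    return not remaining


print(is_valid_cpm([{"p", "tgt"}, {"s", "tgt"}]))     # joins on tgt
print(is_valid_cpm([{"p", "f", "tgt"}, {"f", "c"}]))  # joins on f
print(is_valid_cpm([{"p", "tgt"}, {"s", "t"}]))       # no shared alias
```

The three calls correspond to the three FROM clauses above and return True, True, and False.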

WHERE Clause

The predicates in the WHERE clause can use any of the vertex or edge aliases in any of the patterns, including predicates which combine variables from different constituent paths. CPM queries do not have any special restrictions on the WHERE predicate. Distance matters, however, for performance. Conditions that are local, measured both cross-path and within-path, can be resolved earlier and therefore are faster.

In the example below, x2.age > x4.age is a cross-pattern predicate, e1.timestamp < e3.timestamp is a cross-pattern predicate, and x1.gender == x4.gender is a local predicate of the second pattern.

FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3,
     X1:x1-(E3:e3)-X4:x4
WHERE x2.age > x4.age AND e1.timestamp < e3.timestamp AND x1.gender == x4.gender
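The local versus cross-pattern distinction depends only on which pattern each alias in a predicate belongs to. The Python sketch below classifies the three predicates from this example; the classify helper and its return strings are made up for this illustration.

```python
# Aliases (vertex and edge) that appear in each pattern of the FROM clause above.
patterns = [
    {"x1", "e1", "x2", "e2", "x3"},  # first pattern
    {"x1", "e3", "x4"},              # second pattern
]


def classify(predicate_aliases):
    """Return which pattern a predicate is local to, or 'cross-pattern'."""
    for number, aliases in enumerate(patterns, start=1):
        if set(predicate_aliases) <= aliases:
            return f"local to pattern {number}"
    return "cross-pattern"


print(classify({"x2", "x4"}))  # x2.age > x4.age
print(classify({"e1", "e3"}))  # e1.timestamp < e3.timestamp
print(classify({"x1", "x4"}))  # x1.gender == x4.gender
```

The first two predicates are classified as cross-pattern, and the third as local to pattern 2, in line with the explanation above.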

ACCUM Clause

You can ACCUM to any vertex variable in a CPM block.

The ACCUM clause by default will execute as many times as the row (match) count of the CPM match table; each execution uses one row from the match table.

ACCUM To The Three Vertex Variables of A CPM Pattern

#accum to x1, x2 and x4. 
FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, 
     X1:x1-(E3:e3)-X4:x4
ACCUM x1.@cnt +=1, x2.@cnt += x3.quantity, x4.@cnt += x3.quantity

POST-ACCUM Clause

POST-ACCUM for CPM behaves the same as POST-ACCUM for single path patterns. That is, each POST-ACCUM clause can refer to one vertex alias, and the clause executes iteratively over that vertex set (i.e., one vertex column in the match table). Its statements can access the aggregated accumulator result computed in the ACCUM clause. The query can have multiple POST-ACCUM clauses, one for each vertex alias you wish to work on. The multiple POST-ACCUM clauses are processed in parallel; it doesn't matter in what order you write them. (For each binding, the statements within a clause are executed in order.)

For example, below we have three POST-ACCUM clauses. The first one iterates through x1, and for each x1, we do @@cnt += x1.@cnt. The second and third POST-ACCUM clauses iterate through x2 and x3 respectively, and accumulate their @cnt accumulator values into @@cnt.

POST-ACCUM to a global accumulator @@cnt, using three CPM Vertex Variables


FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, 
     X1:x1-(E3:e3)-X4:x4
ACCUM x1.@cnt +=1, x2.@cnt += x3.quantity, x4.@cnt += x3.quantity
POST-ACCUM @@cnt += x1.@cnt
POST-ACCUM @@cnt += x2.@cnt
POST-ACCUM @@cnt += x3.@cnt;
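The difference between the two phases can be modeled in a few lines of Python. This is a sketch under stated assumptions: the rows are made up, only the x1 accumulator is modeled, and the function name run_block is hypothetical. ACCUM runs once per row of the match table, while POST-ACCUM runs once per distinct vertex in the chosen column.

```python
from collections import defaultdict

# Hypothetical CPM match table, restricted to its vertex columns (x1, x2, x3, x4).
rows = [
    ("a", "b", "c", "d"),
    ("a", "b", "c", "e"),
    ("f", "g", "c", "d"),
]


def run_block(rows):
    # ACCUM phase: executes once per row; models x1.@cnt += 1.
    x1_cnt = defaultdict(int)
    for x1, _x2, _x3, _x4 in rows:
        x1_cnt[x1] += 1

    # POST-ACCUM phase: executes once per distinct x1; models @@cnt += x1.@cnt.
    total = 0
    for x1 in {row[0] for row in rows}:
        total += x1_cnt[x1]
    return dict(x1_cnt), total


print(run_block(rows))
```

Here the ACCUM phase runs three times and leaves a count of 2 on vertex a and 1 on vertex f; the POST-ACCUM phase then runs twice, once per distinct x1, and sums those counts to 3.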

Examples

Example 1. Find the messages that Viktor Akhiezer liked 100+ days after their creation and whose author's last name begins with the letter S. Output each message's forum.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT f
        FROM Person:s - (LIKES>:e1) - :msg - (HAS_CREATOR>) - Person:t, 
             Forum:f - (CONTAINER_OF>:e2) - :msg
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" 
              AND t.lastName LIKE "S%" 
              AND e1.creationDate >DATETIME_ADD(msg.creationDate, INTERVAL 100 DAY);

  PRINT F;
}

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"F": [{
    "v_id": "962072688797",
    "attributes": {
      "id": 962072688797,
      "title": "Album 12 of Mario Santos",
      "creationDate": "2011-04-12 09:36:50"
    },
    "v_type": "Forum"
  }]}]
}

Example 2. Find any authors who wrote posts that Viktor Akhiezer liked and whose last name begins with S. Find the country for each of these authors and report on the countries.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  C  =  SELECT ctry
        FROM Person:s - (LIKES>:e1) - Post:msg - (HAS_CREATOR>) - Person:t,
             :t - (WORK_AT>:e2) - Company:c, 
             :c - (IS_LOCATED_IN>) - Country:ctry
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
              AND t.lastName LIKE "S%" ;

  PRINT C;
}

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"C": [{
    "v_id": "93",
    "attributes": {
      "name": "Portugal",
      "id": 93,
      "url": "http://dbpedia.org/resource/Portugal"
    },
    "v_type": "Country"
  }]}]
}

Example 3. Given a TagClass and a Country, find all the Forums created in the given Country, containing at least one Post with Tags belonging directly to the given TagClass. The location of a Forum is identified by the location of the Forum’s moderator.

USE GRAPH ldbc_snb

DROP QUERY bi_4

CREATE QUERY bi_4(string tcName, string cName) for graph ldbc_snb syntax v2 {
  SetAccum<vertex<Post>> @postSet;
  SumAccum<int> @personId, @postCount;

  ForumSet =
    SELECT f
    FROM Forum:f -(HAS_MODERATOR>)- Person:a -(IS_LOCATED_IN>.IS_PART_OF>)- Country:c,
         :f -(CONTAINER_OF>)- Post:p -(HAS_TAG>.HAS_TYPE>)- TagClass:tc
    WHERE c.name == cName and tc.name == tcName
    ACCUM f.@personId = a.id, f.@postSet += p
    POST-ACCUM f.@postCount = f.@postSet.size(), f.@postSet.clear()
    ORDER BY f.@postCount DESC, f.id ASC
    LIMIT 3;

  PRINT ForumSet[ForumSet.id, ForumSet.title, ForumSet.creationDate,
                 ForumSet.@personId, ForumSet.@postCount];
}

INSTALL QUERY bi_4

RUN QUERY bi_4("MusicalArtist", "Burma")

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"ForumSet": [
    {
      "v_id": "81903",
      "attributes": {
        "ForumSet.title": "Wall of Donald Steele-Perkins",
        "ForumSet.@personId": 5226,
        "ForumSet.id": 81903,
        "ForumSet.@postCount": 65,
        "ForumSet.creationDate": "2010-02-15 06:48:04"
      },
      "v_type": "Forum"
    },
    {
      "v_id": "137438953686",
      "attributes": {
        "ForumSet.title": "Wall of Eric Law-Yone",
        "ForumSet.@personId": 2199023262994,
        "ForumSet.id": 137438953686,
        "ForumSet.@postCount": 65,
        "ForumSet.creationDate": "2010-04-25 22:10:32"
      },
      "v_type": "Forum"
    },
    {
      "v_id": "687194810508",
      "attributes": {
        "ForumSet.title": "Wall of Hector Hugh Michie",
        "ForumSet.@personId": 10995116283784,
        "ForumSet.id": 687194810508,
        "ForumSet.@postCount": 39,
        "ForumSet.creationDate": "2010-12-19 15:33:30"
      },
      "v_type": "Forum"
    }
  ]}]
}

Example 4. For a given country, count all the distinct triples of Persons such that:

a is a friend of b.
b is a friend of c
c is a friend of a.

Distinct means that a given set of 3 vertices appears only once in the results. KNOWS is an undirected relationship, so it doesn't matter in what order we list the 3 vertices.


USE GRAPH ldbc_snb

CREATE QUERY bi_17(string cName) FOR GRAPH ldbc_snb SYNTAX v2 {
  TYPEDEF TUPLE <uint a, uint b, uint c> triplet;
  SetAccum<triplet> @@tripletSet;
  SumAccum<int> @@tripletCount;

  C =
    SELECT c
    FROM Country:c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p1,
         :c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p2,
         :c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p3,
         :p1 -(KNOWS)- :p2 -(KNOWS)- :p3 -(KNOWS)- :p1
    WHERE c.name == cName AND p1.id < p2.id AND p2.id < p3.id
    ACCUM @@tripletSet += triplet(p1.id, p2.id, p3.id);

  @@tripletCount = @@tripletSet.size();
  @@tripletSet.clear();
  PRINT @@tripletCount;
}


INSTALL QUERY bi_17

RUN QUERY bi_17("Spain")

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@tripletCount": 242}]
}
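The ordering condition p1.id < p2.id AND p2.id < p3.id is what keeps each triangle from being reported once per permutation of its vertices. The Python sketch below applies the same idea to a small undirected edge list. It is an illustration only: it omits the country filter, and count_triangles is a name chosen for this example.

```python
def count_triangles(edges):
    """Count distinct triangles in an undirected graph given as (id, id) pairs."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triplets = set()
    for p1, p1_friends in neighbors.items():
        for p2 in p1_friends:
            if p1 >= p2:
                continue  # enforce p1 < p2
            for p3 in neighbors[p2]:
                # enforce p2 < p3, and close the triangle back to p1
                if p2 < p3 and p3 in p1_friends:
                    triplets.add((p1, p2, p3))
    return len(triplets)


# One triangle (1, 2, 3) plus a dangling edge to vertex 4.
print(count_triangles([(1, 2), (2, 3), (3, 1), (3, 4)]))
```

Without the ordering condition, the single triangle in this example would be found six times, once for each ordering of its three vertices.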

More Examples. We translated the LDBC-SNB BI and IC queries using CPM and shared the translation on GitHub. Please refer to the query translation here. Most of the queries are installed as functions; you can find sample parameters for the functions here.

Source Vertex Set Flexibility

As mentioned when we first described one-hop patterns, the source (leftmost) vertex set can be a vertex type, an alternation of types, or even omitted.

Example 1. Find Viktor Akhiezer's favorite messages' creators whose last name begins with letter S. Count them.


USE GRAPH ldbc_snb

#start from a vertex type "Person"
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT t
        FROM Person:s -(LIKES>:e1)- :msg -(HAS_CREATOR>)- Person:t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" 
              AND t.lastName LIKE "S%"
        POST-ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 8}]
}

Example 2. Same query as Example 1, but without vertex types on the pattern's end points. The GSQL compiler can infer the types of :s and :t.

USE GRAPH ldbc_snb

#both end points of the pattern do not have vertex types.
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT t
        FROM :s -(LIKES>:e1)- :msg -(HAS_CREATOR>)- :t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" AND t.lastName LIKE "S%"
        POST-ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 8}]
}
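The idea behind this kind of type inference can be sketched with a lookup against the edge schema. The Python example below is a simplified model and makes no claim about how the GSQL compiler is implemented. The EDGE_SCHEMA dictionary is an illustrative subset written for this example (LIKES from Person to Comment or Post, HAS_CREATOR from Comment or Post to Person), and infer_path_types is a hypothetical helper that handles forward edges only.

```python
# Illustrative edge schema: edge type -> (allowed source types, allowed target types).
EDGE_SCHEMA = {
    "LIKES": ({"Person"}, {"Comment", "Post"}),
    "HAS_CREATOR": ({"Comment", "Post"}, {"Person"}),
}


def infer_path_types(edge_types):
    """Infer candidate vertex types along a path of forward (>) edges."""
    # The first vertex can be any valid source of the first edge.
    candidates = [set(EDGE_SCHEMA[edge_types[0]][0])]
    for position, edge in enumerate(edge_types):
        sources, targets = EDGE_SCHEMA[edge]
        # The vertex before this edge must also be a valid source of it.
        candidates[position] &= sources
        # The vertex after this edge can be any valid target of it.
        candidates.append(set(targets))
    return candidates


# :s -(LIKES>)- :msg -(HAS_CREATOR>)- :t
print(infer_path_types(["LIKES", "HAS_CREATOR"]))
```

For the pattern in Example 2, this yields Person for :s, Comment or Post for :msg, and Person for :t.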

Example 3. Count the LIKES edges.

USE GRAPH ldbc_snb

# a pattern starts without any information.
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT msg
        FROM  -(LIKES>:e1)- :msg
        ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 2190095}]
}

Per Clause (Beta)

Introduction

Pattern matching produces a virtual match table, and the ACCUM clause acts like a FOREACH loop, executing the clause's statement once for each row of the match table.

For example, consider the following query which counts the number of paths in a simple 2-hop pattern:

SumAccum<int> @@cnt; 

S = SELECT t 
    FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
    ACCUM @@cnt += 1;

Suppose the query produces the following match table.

(s, edge1, m , edge2, t)//match table schema
v1, e1, v3, e2, v2 //match 1 
v1, e3, v4, e4, v2 //match 2 
v5, e5, v6, e6, v7 //match 3
v8, e7, v9, e8, v7 //match 4

By default, the ACCUM clause will execute the @@cnt += 1statement 4 times, for each row in the match table. The result will be @@cnt = 4.

For the same query, what if the user wants to

count the number of distinct path endings in the match table? For this case, we would want to iterate on the alias t.
count the number of distinct (start, end) pairs in the match table? For that case, we would want to iterate on distinct pairs of the aliases (s, t).

To provide users with this added flexibility and finer control over ACCUM iteration, TigerGraph 3.0 adds the PER clause to pattern matching (V2) syntax.

Syntax

selectBlock := SELECT alias 
               FROM pattern
               [sampleClause]
               [whereClause]
               [[perClause] accumClause]
               [postAccumClause]*
               [havingClause]
               [orderClause]
               [limitClause]
                    
perClause := PER (vertex_alias_1, vertex_alias_2, ...)

Examples. Below are multiple examples of the PER Clause using the same FROM clause.

S1 = SELECT s 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s) 
     ACCUM @@cnt += 1;
    
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (t)
     ACCUM @@cnt += 1;

S3 = SELECT m 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (m)
     ACCUM @@cnt += 1;
    
S4 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s, t)
     ACCUM @@cnt += 1;
    
S5 = SELECT t 
     FROM S:s - (E1:edge1) - M:m - (E2:edge2) - T:t
     PER (s, m, t)
     ACCUM @@cnt += 1;

Semantics

Suppose s, m, and t are vertex aliases in a pattern. Below are some interpretations of the PER Clause based on the graph element bindings found in the match table.

PER (s) ACCUM means that per each distinct s vertex, execute the ACCUM clause once.
PER (s,t) ACCUM means that per each distinct (s, t) pair, execute the ACCUM clause once.
PER (s,m,t) ACCUM means that per each distinct (s, m, t) tuple, execute the ACCUM clause once.

Examples to show PER clause semantics.

//match table
(s, edge1, m , edge2, t)//schema
v1, e1, v3, e2, v2 //match 1 
v1, e3, v4, e4, v2 //match 2 
v5, e5, v6, e6, v7 //match 3
v8, e7, v9, e8, v7 //match 4

//since we have v1, v5, and v8 three distinct vertices bind to s, 
//we execute ACCUM clause 3 times.  
S1 = SELECT s 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s) 
     ACCUM @@cnt += 1;
  
//since two distinct vertices (v2 and v7) bind to t,
//we execute the ACCUM clause twice.
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (t)
     ACCUM @@cnt += 1;
     
//since four distinct vertices (v3, v4, v6, and v9) bind to m,
//we execute the ACCUM clause 4 times.
S3 = SELECT m
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (m)
     ACCUM @@cnt += 1;
     
//since three distinct vertex pairs, (v1, v2), (v5, v7), and (v8, v7),
//bind to (s,t), we execute the ACCUM clause 3 times.
S4 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, t)
     ACCUM @@cnt += 1;

    
//since four distinct vertex groups, (v1, v3, v2), (v1, v4, v2), (v5, v6, v7), and
//(v8, v9, v7), bind to (s,m,t), we execute the ACCUM clause 4 times.
S5 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m, t)
     ACCUM @@cnt += 1;
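
The counts in the comments above can be reproduced outside GSQL. The following is a minimal Python sketch, not TigerGraph code: it assumes the match table can be modeled as a list of rows and that PER behaves like grouping on the listed aliases. The names SCHEMA, MATCH_TABLE, and accum_executions are hypothetical and exist only for this illustration.

```python
# Match table from the example above; each row is one matched path.
SCHEMA = ("s", "edge1", "m", "edge2", "t")
MATCH_TABLE = [
    ("v1", "e1", "v3", "e2", "v2"),  # match 1
    ("v1", "e3", "v4", "e4", "v2"),  # match 2
    ("v5", "e5", "v6", "e6", "v7"),  # match 3
    ("v8", "e7", "v9", "e8", "v7"),  # match 4
]


def accum_executions(per_aliases=None):
    """Return how many times ACCUM would run under this simplified model.

    Without PER, ACCUM runs once per row. With PER, it runs once per
    distinct combination of the listed vertex aliases.
    """
    if per_aliases is None:
        return len(MATCH_TABLE)
    positions = [SCHEMA.index(alias) for alias in per_aliases]
    groups = {tuple(row[i] for i in positions) for row in MATCH_TABLE}
    return len(groups)


print("no PER:", accum_executions())
for per in [("s",), ("t",), ("m",), ("s", "t"), ("s", "m", "t")]:
    print("PER", per, "->", accum_executions(per))
```

Under these assumptions the sketch reports 4 executions without PER, and 3, 2, 4, 3, and 4 executions for S1 through S5, matching the comments in the GSQL examples.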

If a SELECT query block uses the PER clause, then the vertex aliases used in its SELECT, ACCUM, and POST-ACCUM clauses must be among the aliases that appear in the PER clause.

Below are some illegal cases.

//semantic error. SELECT t, but t doesn't appear in PER clause. 
S1 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m)
     ACCUM @@cnt += 1;
     
//semantic error. ACCUM t.@cnt, but t doesn't appear in PER clause.
S2 = SELECT t 
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s, m)
     ACCUM t.@cnt += 1;
     
//semantic error. POST-ACCUM t.@cnt, but t doesn't appear in PER clause.
S3 = SELECT s
     FROM S:s - (E1:edge1) - M:m -(E2:edge2) - T:t
     PER (s)
     ACCUM s.@cnt += 1
     POST-ACCUM t.@cnt =1;
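
The alias restriction amounts to a subset test. The Python sketch below illustrates the rule only; it does not show how the GSQL compiler implements the check. The helper name missing_from_per and its list-based inputs are assumptions made for this example.

```python
def missing_from_per(per_aliases, used_aliases):
    """Return the aliases used in SELECT, ACCUM, or POST-ACCUM that the
    PER clause does not list. An empty result means the block is legal
    under the rule described above."""
    return sorted(set(used_aliases) - set(per_aliases))


# Illegal case S1: SELECT t with PER (s, m).
print(missing_from_per(["s", "m"], ["t"]))
# Illegal case S3: ACCUM uses s, POST-ACCUM uses t, with PER (s).
print(missing_from_per(["s"], ["s", "t"]))
# Legal case: SELECT t with PER (s, t).
print(missing_from_per(["s", "t"], ["t"]))
```

The first two calls return the offending alias t, and the last call returns an empty list.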

PER Clause Examples

Example 1. Count the number of countries that have a city with a resident who likes a post.

//Example 1. 
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {
  SumAccum<int> @@cnt;

 R =   SELECT c
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       PER    (c)
       ACCUM  @@cnt +=1;

 PRINT @@cnt;
}

//results
Using graph 'ldbc_snb'
The query AA is dropped.
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 111}]
}

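Example 2. Count the number of posts that are liked by a resident of a city that is part of a country.

//Example 2. 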
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {
  SumAccum<int> @@cnt;

 R =   SELECT p
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       PER    (p)
       ACCUM  @@cnt +=1;

 PRINT @@cnt;
}

//result
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 70668}]
}

Example 3. For each country in ("Dominican_Republic","Angola", "Cambodia"), find the number of posts that are liked by a person living in that country.

//Example 3
USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2{

 MapAccum<string, SumAccum<int>> @@postPerCountry;

 R =   SELECT p
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       WHERE  c.name in  ("Dominican_Republic","Angola", "Cambodia")
       PER    (c, p)
       ACCUM  @@postPerCountry += (c.name -> 1);

 PRINT @@postPerCountry;
}

//results
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@postPerCountry": {
    "Dominican_Republic": 395,
    "Angola": 12,
    "Cambodia": 4002
  }}]
}
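
To see why PER (c, p) matters here, consider a country in which two residents like the same post. The Python sketch below uses made-up toy rows, not LDBC data, and assumes the per-path model from the introduction, where each path contributes one row. The function name posts_per_country is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy match rows: (country, person, post). Not LDBC data.
ROWS = [
    ("Angola", "person1", "post1"),
    ("Angola", "person2", "post1"),  # same post, reached via another person
    ("Angola", "person1", "post2"),
    ("Cambodia", "person3", "post1"),
]


def posts_per_country(rows, per_country_post=True):
    """Model '@@postPerCountry += (c.name -> 1)'.

    With per_country_post=True, add 1 once per distinct (country, post)
    pair, as PER (c, p) would. Otherwise add 1 once per row (per path).
    """
    counts = defaultdict(int)
    seen = set()
    for country, person, post in rows:
        key = (country, post) if per_country_post else (country, person, post)
        if key in seen:
            continue
        seen.add(key)
        counts[country] += 1
    return dict(counts)


print(posts_per_country(ROWS, per_country_post=True))
print(posts_per_country(ROWS, per_country_post=False))
```

In this toy data, grouping by (country, post) counts 2 posts for Angola, while counting every path would report 3 because post1 is reached through two different persons.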

Example 4. For each country in ("Dominican_Republic","Angola", "Cambodia"), find the number of posts that are liked by a person living in that country. Use local accumulators this time.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2{

 SumAccum<int> @postCnt;

 R =   SELECT c
       FROM   Country:c -(<IS_PART_OF.<IS_LOCATED_IN.LIKES>)- Post:p
       WHERE  c.name in  ("Dominican_Republic","Angola", "Cambodia")
       PER    (c, p) //per (country, post), add 1 to c.@postCnt
       ACCUM  c.@postCnt += 1;

 PRINT R;
}

//results
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"R": [
    {
      "v_id": "2",
      "attributes": {
        "@postCnt": 12,
        "name": "Angola",
        "id": 2,
        "url": "http://dbpedia.org/resource/Angola"
      },
      "v_type": "Country"
    },
    {
      "v_id": "67",
      "attributes": {
        "@postCnt": 4002,
        "name": "Cambodia",
        "id": 67,
        "url": "http://dbpedia.org/resource/Cambodia"
      },
      "v_type": "Country"
    },
    {
      "v_id": "11",
      "attributes": {
        "@postCnt": 395,
        "name": "Dominican_Republic",
        "id": 11,
        "url": "http://dbpedia.org/resource/Dominican_Republic"
      },
      "v_type": "Country"
    }
  ]}]
}

Performance and Best Practices

The PER clause not only lets users control the semantics of the ACCUM clause; it can also improve the performance of a pattern match query, because the query engine uses the PER clause to optimize query execution.

To get the best performance, we recommend three guidelines for writing efficient queries.

Use PER (target) If Possible

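PER (target) is generally faster than PER (source). In the example below, query q2 is faster than q1. The only difference between the two queries is that q2's FROM pattern is q1's FROM pattern reversed, so the PER alias c is the target (rightmost) vertex in q2 and the source (leftmost) vertex in q1.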

USE GRAPH ldbc_snb

# not recommended, since it does per (src).
CREATE QUERY q1 () SYNTAX v2 {

  SumAccum<int> @@cnt ;

  T = SELECT c
      FROM Comment:c - (<LIKES) - Person:ps - (IS_LOCATED_IN>) - City:city
      WHERE year(c.creationDate) >= 2006
      PER (c)
      ACCUM @@cnt += 1;

  PRINT @@cnt;
}

# recommended, since it does per (tgt)
CREATE QUERY q2 () SYNTAX v2 {

  SumAccum<int> @@cnt ;

  T = SELECT c
      FROM City:city - (<IS_LOCATED_IN) - Person:ps - (LIKES>) - Comment:c
      WHERE year(c.creationDate) >= 2006
      PER (c)
      ACCUM @@cnt += 1;

  PRINT @@cnt;
}

Write Patterns with Smallest Expected Vertex Set on the Left

USE GRAPH ldbc_snb

# not recommended, since the pattern starts from a large cardinality vertex type
# (Person), and ends at a small cardinality vertex type (Tag). 
CREATE QUERY q3 () SYNTAX v2 {

  SumAccum<int> @cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM s.@cnt += 1;

  PRINT V.size();
}

# recommended, start from small cardinality end (Tag), and use per tgt
CREATE QUERY q4 () SYNTAX v2 {

  SumAccum<int> @cnt;

  V = SELECT s
      FROM Tag:t-(<HAS_TAG)-Forum:f -(CONTAINER_OF>)-Post:p  - (<LIKES)- Person:s
      PER (s)
      ACCUM s.@cnt += 1;

  PRINT V.size();
}

Specify Complete Type Information

USE GRAPH ldbc_snb

# not recommended: we do not put Forum before :f, even though we know the type.
CREATE QUERY q5 () SYNTAX v2 {

  SumAccum<int> @@person_cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM @@person_cnt += 1;

  PRINT @@person_cnt;
}


#recommended: we put Forum as the type info. 
CREATE QUERY q6 () SYNTAX v2 {

  SumAccum<int> @@person_cnt;

  V = SELECT s
      FROM Person:s- (LIKES>)-Post:p - (<CONTAINER_OF)-Forum:f - (HAS_TAG>) - :t
      PER (s)
      ACCUM @@person_cnt += 1;

  PRINT @@person_cnt;
}

LDBC Benchmark Queries

Data Modification

Pattern Matching GSQL supports Insert, Update, and Delete operations. The syntax is identical to that in classic GSQL (v1), though the full range of data modification operations is not yet supported.

In general, data modification can be at two levels in GSQL:

Top level. The statement does not need to be within any other statement.
Within a SELECT query statement. The FROM-WHERE clauses define a match table, and the data modification is performed based on the vertex and edge information in the match table. The GSQL specification calls these within-SELECT statements DML-sub statements.

Insert, Update, and Delete currently work in compiled mode only (i.e., you must run INSTALL QUERY before RUN QUERY). Data modification in interpreted mode is not yet available.
SELECT queries with data modification may only have one POST-ACCUM clause.

Insert vertices and edges

Pattern matching Insert is supported at both the top-level and within-SELECT levels, using the same syntax as in classic GSQL. You can insert vertices and edges.

For a top-level statement, use the INSERT INTO statement.
Inside an ACCUM or POST-ACCUM clause, use the DML-sub INSERT statement.

USE GRAPH ldbc_snb

#find the authors, whose lastName starts with S, of the 2012 messages that Viktor likes.
INTERPRET QUERY() SYNTAX V2 {

  R  =  SELECT t
        FROM Person:s -(LIKES>)- :msg -(HAS_CREATOR>)- Person:t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
         AND t.lastName LIKE "S%" AND year(msg.creationDate) == 2012;

  PRINT R[R.id, R.firstName, R.lastName];
}

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"R": [
    {
      "v_id": "8796093025410",
      "attributes": {
        "R.id": 8796093025410,
        "R.firstName": "Priyanka",
        "R.lastName": "Singh"
      },
      "v_type": "Person"
    },
    {
      "v_id": "2199023260091",
      "attributes": {
        "R.id": 2199023260091,
        "R.firstName": "Janne",
        "R.lastName": "Seppala"
      },
      "v_type": "Person"
    },
    {
      "v_id": "15393162796846",
      "attributes": {
        "R.id": 15393162796846,
        "R.firstName": "Mario",
        "R.lastName": "Santos"
      },
      "v_type": "Person"
    }
  ]}]
}

# create a Person node named Tiger Woods,
# and connect this Person with Viktor's favorite authors found above
CREATE QUERY InsertEdgeAndVertex () SYNTAX v2{

  #add a celebrity person node using INSERT INTO statement.
  INSERT INTO Person VALUES (100000000,"Tiger", "Woods", "m", _, _,_,_,_,_); 
  
  R  =  SELECT t
        FROM Person:s -(LIKES>)- :msg -(HAS_CREATOR>)- Person:t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
         AND t.lastName LIKE "S%" AND year(msg.creationDate) == 2012
        PER (s, t)
        ACCUM
           #add edges connecting "tiger" and t with a 6/1/2020 time stamp
          INSERT INTO KNOWS VALUES(100000000, t, to_datetime("2020-06-01"));
        
  PRINT R [R.id, R.firstName, R.lastName];
}

INSTALL QUERY InsertEdgeAndVertex
RUN QUERY InsertEdgeAndVertex()

You can verify the result by calling the built-in REST endpoints.

Check the inserted vertex.

Linux Shell

#check the inserted vertex
curl -X GET "http://localhost:9000/graph/ldbc_snb/vertices/Person/100000000" | jq .

#result
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 1
  },
  "error": false,
  "message": "",
  "results": [
    {
      "v_id": "100000000",
      "v_type": "Person",
      "attributes": {
        "id": 100000000,
        "firstName": "Tiger",
        "lastName": "Woods",
        "gender": "m",
        "birthday": "1970-01-01 00:00:00",
        "creationDate": "1970-01-01 00:00:00",
        "locationIP": "",
        "browserUsed": "",
        "speaks": [],
        "email": []
      }
    }
  ]
}

Check the inserted edges.

Linux Shell

#check the inserted edges using tiger's id (100,000,000)
curl -X GET "http://localhost:9000/graph/ldbc_snb/edges/Person/100000000/KNOWS" | jq .
#result 
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 0
  },
  "error": false,
  "message": "",
  "results": [
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "8796093025410",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-06-01 00:00:00"
      }
    },
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "2199023260091",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-06-01 00:00:00"
      }
    },
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "15393162796846",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-06-01 00:00:00"
      }
    }
  ]
}
#note: you can use the vertex lookup API to verify the three connected authors, e.g.,
curl -X GET "http://localhost:9000/graph/ldbc_snb/vertices/Person/8796093025410" | jq .

Update data

Query-body level UPDATE statements are not yet supported in syntax v2.

USE GRAPH ldbc_snb

DROP QUERY UpdateKnowsTS

CREATE QUERY UpdateKnowsTS () SYNTAX v2 {

  # update the vertex tiger's attributes
  # creationDate and languages spoken in POST-ACCUM
  R = SELECT p
      FROM Person:p
      WHERE p.firstName == "Tiger" AND p.lastName == "Woods"
      POST-ACCUM
            # update simple base type attribute
            p.creationDate = to_datetime("2020-6-1"),
            # update  collection-type attribute
            p.speaks = ("english", "golf");

  #DML-sub level, update KNOWS edge attribute "creationDate" in ACCUM
  R  =  SELECT t
        FROM Person:s-(KNOWS:e) -:t
        WHERE s.firstName == "Tiger" and s.lastName == "Woods"
        #update the KNOWS edge time stamp
        ACCUM e.creationDate = to_datetime("2020-10-01");
}

INSTALL QUERY UpdateKnowsTS
RUN QUERY UpdateKnowsTS()

To verify the update, we can use REST calls.

Check Tiger Woods' creationDate and the languages he speaks.

Linux Shell

curl -X GET "http://localhost:9000/graph/ldbc_snb/vertices/Person/100000000" | jq .
#result
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 0
  },
  "error": false,
  "message": "",
  "results": [
    {
      "v_id": "100000000",
      "v_type": "Person",
      "attributes": {
        "id": 100000000,
        "firstName": "Tiger",
        "lastName": "Woods",
        "gender": "m",
        "birthday": "1970-01-01 00:00:00",
        "creationDate": "2020-06-01 00:00:00",
        "locationIP": "",
        "browserUsed": "",
        "speaks": [
          "english",
          "golf"
        ],
        "email": []
      }
    }
  ]
}

Check the KNOWS edges whose source is Tiger Woods.

Linux Shell

curl -X GET "http://localhost:9000/graph/ldbc_snb/edges/Person/100000000/KNOWS" | jq .

#result
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 0
  },
  "error": false,
  "message": "",
  "results": [
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "8796093025410",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-10-01 00:00:00"
      }
    },
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "2199023260091",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-10-01 00:00:00"
      }
    },
    {
      "e_type": "KNOWS",
      "directed": false,
      "from_id": "100000000",
      "from_type": "Person",
      "to_id": "15393162796846",
      "to_type": "Person",
      "attributes": {
        "creationDate": "2020-10-01 00:00:00"
      }
    }
  ]
}

Delete vertices and edges

You can use the DELETE() function to delete edges and vertices in ACCUM and POST-ACCUM clauses.

Top-level DELETE statements are not yet supported in SYNTAX v2.
Edges can only be deleted in the ACCUM clause.
For best performance, vertices should be deleted in the POST-ACCUM clause.
To perform within-SELECT deletes, the FROM pattern can only be a single hop, fixed length pattern.

Example 3. Delete vertex Tiger Woods and its KNOWS edges.

USE GRAPH ldbc_snb

DROP QUERY  DeleteEdgeAndVertex

CREATE QUERY DeleteEdgeAndVertex () SYNTAX v2{

  R  =  SELECT t
        FROM Person:s -(KNOWS:e)- Person:t
        WHERE s.firstName == "Tiger" AND s.lastName == "Woods"
        ACCUM
           //delete edges
           DELETE(e)
        POST-ACCUM DELETE(s); //delete src vertex


  PRINT  R [R.id, R.firstName, R.lastName];
}

INSTALL QUERY DeleteEdgeAndVertex
RUN QUERY DeleteEdgeAndVertex()

To verify the result, you can use built-in REST calls.

curl -X GET "http://localhost:9000/graph/ldbc_snb/vertices/Person/100000000" | jq .
#vertex results
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 0
  },
  "error": true,
  "message": "The input vertex id '100000000' is not a valid vertex id for vertex type = Person.",
  "code": "601"
}

curl -X GET "http://localhost:9000/graph/ldbc_snb/edges/Person/100000000/KNOWS" | jq .
#edge results
{
  "version": {
    "edition": "enterprise",
    "api": "v2",
    "schema": 0
  },
  "error": true,
  "message": "The input source_vertex_id '100000000' is not a valid vertex id for vertex type = Person.",
  "code": "601"
}

Conjunctive Pattern Matching (Beta)

What is Conjunctive Pattern Matching?

In general, a conjunctive pattern matching (CPM) query block has multiple patterns in its FROM clause. Its structure is illustrated below.

# Conjunctive Pattern Matching Syntax

SelectBlock := SELECT alias 
               FROM pattern
               [sampleClause]
               [whereClause]
               [ [perClause] accumClause]
               [postAccumClause]*
                    ...
                    
pattern := vertexPattern | edgePattern | (pathPattern ["," pathPattern]*)
# vertexPattern and edgePattern are from classic GSQL

We elaborate on each clause below.

SELECT Clause

The SELECT clause selects only one vertex alias from all the patterns in the FROM clause.

FROM Clause - Conjunctive Matching

For example, consider this CPM:

FROM X:x - (E1:e1) - Y:y - (E2>:e2) - Z:z,
     Z:z - (E3:e3) - U:u - (E4>:e4) - V:v

The first pattern's variables are x, e1, y, e2, and z; the second pattern's variables are z, e3, u, e4, and v. Considering the two patterns independently would yield the following match table schemas:

#first match table
(x, e1, y, e2, z)
#second match table
(z, e3, u, e4, v)

Natural Join of Match Tables

#natural join result; the shared vertex variable z appears once.
(x, e1, y, e2, z, e3, u, e4, v)
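
A natural join keeps only the row combinations that agree on every shared variable, and each shared variable appears once in the result. The Python sketch below illustrates this on made-up toy rows; it is not how TigerGraph evaluates conjunctive patterns, and the function natural_join and all row values are hypothetical.

```python
def natural_join(schema_a, rows_a, schema_b, rows_b):
    """Join two match tables on the variables their schemas share.

    Note: if the schemas share no variable, this sketch degenerates to a
    cross product; it does not reject such input.
    """
    shared = [v for v in schema_a if v in schema_b]
    out_schema = tuple(schema_a) + tuple(v for v in schema_b if v not in shared)
    joined = []
    for row_a in rows_a:
        a = dict(zip(schema_a, row_a))
        for row_b in rows_b:
            b = dict(zip(schema_b, row_b))
            # Keep the pair only if it agrees on every shared variable.
            if all(a[v] == b[v] for v in shared):
                merged = {**a, **b}
                joined.append(tuple(merged[v] for v in out_schema))
    return out_schema, joined


first_schema = ("x", "e1", "y", "e2", "z")
first_rows = [("x1", "a1", "y1", "b1", "z1"), ("x2", "a2", "y2", "b2", "z2")]
second_schema = ("z", "e3", "u", "e4", "v")
second_rows = [("z1", "c1", "u1", "d1", "v1"), ("z3", "c3", "u3", "d3", "v3")]

schema, rows = natural_join(first_schema, first_rows, second_schema, second_rows)
print(schema)
print(rows)
```

Only the rows that bind z to z1 in both tables survive, so the joined table has a single row with the nine-column schema shown above.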

Valid Conjunctive Patterns

The match table of the conjunctive pattern match is the natural join of all the patterns' match tables. By design, a row in the CPM match table must simultaneously satisfy all the match tables.

For example, below we show two valid CPM inputs and one invalid CPM input.

# a valid CPM, since the two patterns naturally join on :tgt
SELECT
FROM Person:p - (KNOWS) - :tgt, 
     Post:s -(<LIKES) - :tgt 

# a valid CPM, since the two patterns naturally join on :f
SELECT
FROM Person:p - (KNOWS) - :f - (LIKES>) - Post:tgt, 
     :f - (LIKES>) - Comment:c 

# an invalid CPM, since the two patterns do not share any vertex variables.
# they cannot be naturally joined. 
# One pattern has (p, tgt); the other has (s, t). 
SELECT
FROM Person:p - (KNOWS) - :tgt, 
     Post:s - (<LIKES) - Person:t
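
The validity condition can be checked mechanically by looking at the vertex variables of each pattern. The Python sketch below generalizes the two-pattern rule above to any number of patterns by requiring that the patterns be connected through shared variables; that generalization is an assumption of this example, not a statement from the GSQL specification. The helper is_joinable is hypothetical.

```python
def is_joinable(patterns):
    """patterns: a list of sets, each holding one pattern's vertex aliases.

    Returns True if every pattern can be reached from the first one by
    repeatedly following shared vertex aliases.
    """
    if not patterns:
        return False
    remaining = [set(p) for p in patterns]
    connected = remaining.pop(0)
    progress = True
    while remaining and progress:
        progress = False
        for p in list(remaining):
            if connected & p:  # shares at least one alias
                connected |= p
                remaining.remove(p)
                progress = True
    return not remaining


print(is_joinable([{"p", "tgt"}, {"s", "tgt"}]))      # first valid case
print(is_joinable([{"p", "f", "tgt"}, {"f", "c"}]))   # second valid case
print(is_joinable([{"p", "tgt"}, {"s", "t"}]))        # the invalid case
```

The two valid inputs return True and the invalid input returns False, because (p, tgt) and (s, t) have no alias in common.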

WHERE Clause

FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3,
     X1:x1-(E3:e3)-X4:x4
WHERE x2.age > x4.age AND e1.timestamp < e3.timestamp AND x1.gender == x4.gender

ACCUM Clause

You can ACCUM to any vertex variable in a CPM block.

The ACCUM clause by default will execute as many times as the row (match) count of the CPM match table; each execution uses one row from the match table.

ACCUM To The Three Vertex Variables of A CPM Pattern

#accum to x1, x2 and x4. 
FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, 
     X1:x1-(E3:e3)-X4:x4
ACCUM x1.@cnt +=1, x2.@cnt += x3.quantity, x4.@cnt += x3.quantity

POST-ACCUM Clause

POST-ACCUM to a global accumulator @@cnt, using three CPM Vertex Variables


FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, 
     X1:x1-(E3:e3)-X4:x4
ACCUM x1.@cnt +=1, x2.@cnt += x3.quantity, x4.@cnt += x3.quantity
POST-ACCUM @@cnt += x1.@cnt
POST-ACCUM @@cnt += x2.@cnt
POST-ACCUM @@cnt += x3.@cnt;

Examples

Example 1. Find the messages that Viktor Akhiezer liked more than 100 days after their creation and whose author's last name begins with the letter S. Output the forums that contain these messages.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT f
        FROM Person:s - (LIKES>:e1) - :msg - (HAS_CREATOR>) - Person:t, 
             Forum:f - (CONTAINER_OF>:e2) - :msg
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" 
              AND t.lastName LIKE "S%" 
              AND e1.creationDate >DATETIME_ADD(msg.creationDate, INTERVAL 100 DAY);

  PRINT F;
}

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"F": [{
    "v_id": "962072688797",
    "attributes": {
      "id": 962072688797,
      "title": "Album 12 of Mario Santos",
      "creationDate": "2011-04-12 09:36:50"
    },
    "v_type": "Forum"
  }]}]
}

Example 2. Find the authors whose last name begins with S and who wrote posts that Viktor Akhiezer liked. For each of these authors, find the countries where the companies they work at are located, and report those countries.

USE GRAPH ldbc_snb

INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  C  =  SELECT ctry
        FROM Person:s - (LIKES>:e1) - Post:msg - (HAS_CREATOR>) - Person:t,
             :t - (WORK_AT>:e2) - Company:c, 
             :c - (IS_LOCATED_IN>) - Country:ctry
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
              AND t.lastName LIKE "S%" ;

  PRINT C;
}

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"C": [{
    "v_id": "93",
    "attributes": {
      "name": "Portugal",
      "id": 93,
      "url": "http://dbpedia.org/resource/Portugal"
    },
    "v_type": "Country"
  }]}]
}

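Example 3. Given a tag class and a country, find the forums that are moderated by a person located in that country and that contain posts with a tag of that class. Report the top 3 forums by the number of such posts.

USE GRAPH ldbc_snb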

DROP QUERY bi_4

CREATE QUERY bi_4(string tcName, string cName) for graph ldbc_snb syntax v2 {
  SetAccum<vertex<Post>> @postSet;
  SumAccum<int> @personId, @postCount;

  ForumSet =
    SELECT f
    FROM Forum:f -(HAS_MODERATOR>)- Person:a -(IS_LOCATED_IN>.IS_PART_OF>)- Country:c,
         :f -(CONTAINER_OF>)- Post:p -(HAS_TAG>.HAS_TYPE>)- TagClass:tc
    WHERE c.name == cName and tc.name == tcName
    ACCUM f.@personId = a.id, f.@postSet += p
    POST-ACCUM f.@postCount = f.@postSet.size(), f.@postSet.clear()
    ORDER BY f.@postCount DESC, f.id ASC
    LIMIT 3;

  PRINT ForumSet[ForumSet.id, ForumSet.title, ForumSet.creationDate,
                 ForumSet.@personId, ForumSet.@postCount];
}

INSTALL QUERY bi_4

RUN QUERY bi_4("MusicalArtist", "Burma")

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"ForumSet": [
    {
      "v_id": "81903",
      "attributes": {
        "ForumSet.title": "Wall of Donald Steele-Perkins",
        "ForumSet.@personId": 5226,
        "ForumSet.id": 81903,
        "ForumSet.@postCount": 65,
        "ForumSet.creationDate": "2010-02-15 06:48:04"
      },
      "v_type": "Forum"
    },
    {
      "v_id": "137438953686",
      "attributes": {
        "ForumSet.title": "Wall of Eric Law-Yone",
        "ForumSet.@personId": 2199023262994,
        "ForumSet.id": 137438953686,
        "ForumSet.@postCount": 65,
        "ForumSet.creationDate": "2010-04-25 22:10:32"
      },
      "v_type": "Forum"
    },
    {
      "v_id": "687194810508",
      "attributes": {
        "ForumSet.title": "Wall of Hector Hugh Michie",
        "ForumSet.@personId": 10995116283784,
        "ForumSet.id": 687194810508,
        "ForumSet.@postCount": 39,
        "ForumSet.creationDate": "2010-12-19 15:33:30"
      },
      "v_type": "Forum"
    }
  ]}]
}

Example 4. For a given country, count all the distinct triples of Persons (a, b, c) located in that country such that:

a is a friend of b.
b is a friend of c.
c is a friend of a.


USE GRAPH ldbc_snb

CREATE QUERY bi_17(string cName) FOR GRAPH ldbc_snb SYNTAX v2 {
  TYPEDEF TUPLE <uint a, uint b, uint c> triplet;
  SetAccum<triplet> @@tripletSet;
  SumAccum<int> @@tripletCount;

  C =
    SELECT c
    FROM Country:c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p1,
         :c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p2,
         :c -(<IS_PART_OF.<IS_LOCATED_IN)- Person:p3,
         :p1 -(KNOWS)- :p2 -(KNOWS)- :p3 -(KNOWS)- :p1
    WHERE c.name == cName AND p1.id < p2.id AND p2.id < p3.id
    ACCUM @@tripletSet += triplet(p1.id, p2.id, p3.id);

  @@tripletCount = @@tripletSet.size();
  @@tripletSet.clear();
  PRINT @@tripletCount;
}


INSTALL QUERY bi_17

RUN QUERY bi_17("Spain")

#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@tripletCount": 242}]
}
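
The conditions p1.id < p2.id AND p2.id < p3.id in bi_17 ensure that each triangle is matched in only one of its six possible orderings. The Python sketch below illustrates that idea on a made-up undirected friendship graph; it ignores the country filter, and the function count_triangles and the edge list are hypothetical.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected graph given as (a, b) id pairs.

    Enumerating sorted id triples (p1 < p2 < p3) visits each candidate
    triangle exactly once, mirroring the id-ordering filter in bi_17.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    count = 0
    for p1, p2, p3 in combinations(sorted(adjacency), 3):
        if p2 in adjacency[p1] and p3 in adjacency[p2] and p1 in adjacency[p3]:
            count += 1
    return count


# Toy KNOWS edges: triangles are (1, 2, 3) and (2, 3, 4).
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
print(count_triangles(toy_edges))
```

Without the ordering constraint, the same triangle would be found once per permutation of its three vertices, which is why bi_17 also stores the triples in a set before counting.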

Source Vertex Set Flexibility

As mentioned when we first described pattern matching, in One-hop patterns, the source (leftmost) vertex set can be a vertex type, an alternation of types, or even omitted.

Example 1. Find Viktor Akhiezer's favorite messages' creators whose last name begins with letter S. Count them.


USE GRAPH ldbc_snb

#start from a vertex type "Person"
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT t
        FROM Person:s -(LIKES>:e1)- :msg -(HAS_CREATOR>)- Person:t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" 
              AND t.lastName LIKE "S%"
        POST-ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 8}]
}

Example 2. Same query as Example 1, but the pattern's end points have no vertex types. The GSQL compiler can infer the type of :s.

USE GRAPH ldbc_snb

#both end points of the pattern do not have vertex types.
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT t
        FROM :s -(LIKES>:e1)- :msg -(HAS_CREATOR>)- :t
        WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" AND t.lastName LIKE "S%"
        POST-ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 8}]
}

Example 3. Count the LIKES edges.

USE GRAPH ldbc_snb

# the pattern starts without any source vertex information.
INTERPRET QUERY () SYNTAX v2 {

  SumAccum<int> @@cnt;

  F  =  SELECT msg
        FROM  -(LIKES>:e1)- :msg
        ACCUM @@cnt+=1;

  PRINT  @@cnt;

}
#result
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 2190095}]
}