GSQL 102 - Pattern Matching

Get Set

Introduction

In this tutorial, we will show you how to write and run Pattern Matching queries. Pattern Matching is available in TigerGraph 2.4+.

We assume you have finished GSQL 101. If not, please complete GSQL 101 first.

What is a Graph Pattern?

A pattern is a traversal trace on the graph schema. For repetitive traversal on the schema, we can use regular-expression-style notation to represent the repeating step(s). A pattern can be a linear trace or a non-linear trace (tree, cycle, etc.). For example, imagine a simple schema consisting of a Person vertex type and a Friendship edge type. A pattern could be a trace on this simple schema,

Person - (Friendship) - Person - (Friendship) - Person

or, using *2 to denote the two consecutive Friendship edges,

Person - (Friendship*2) - Person

What is Pattern Matching?

Pattern matching is the process of finding subgraphs in a data graph that conform to a given query pattern.

Prepare Your TigerGraph Environment

We're assuming you are running Developer Edition as the sole user with full privileges. If you are on a multiuser Enterprise Edition, consult with your DB administrator. You need to have Designer or Admin privilege on an empty graph. There are also links to download files at various points in the tutorial. Most are small, but the graph data file is 1GB when uncompressed.

First, let's check that you can access GSQL, and that your version is 2.4 or higher.

Open a Linux shell.
Type gsql as shown below. A GSQL shell prompt should appear.
Type version in the GSQL shell. It should show 2.4 or higher, as below. If not, please download and install the latest developer version from https://www.tigergraph.com/download/
Linux Shell
```
$ gsql 
GSQL > version
GSQL version: 2.4
```
If the GSQL shell does not launch, try resetting the system with "gadmin start". This will take some time to launch each service if they have not been started yet. If you need further help, please see the TigerGraph Knowledge Base and FAQs.
You need to start from an empty data catalog. If necessary, run "drop all" to clear the catalog first.
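
The version check can also be scripted. The sketch below is a hedged illustration, not part of the TigerGraph tooling: version_ge and parse_gsql_version are hypothetical helper names, GNU sort with the -V option is assumed, and the version text is assumed to have the "GSQL version: X.Y" form shown above.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: extract "X.Y" from a line such as "GSQL version: 2.4".
parse_gsql_version() {
  sed -n 's/^GSQL version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Demo with the version line shown above; replace the echo with the
# version text reported by your own GSQL shell.
installed=$(echo "GSQL version: 2.4" | parse_gsql_version)
if version_ge "$installed" "2.4"; then
  echo "GSQL $installed supports pattern matching"
else
  echo "GSQL $installed is too old; 2.4 or higher is required"
fi
```

Version sort is used instead of a plain numeric comparison so that a release such as 2.10 is correctly treated as newer than 2.4.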

Cheatsheet

The following general use commands were introduced in GSQL 101.

The % prefix indicates Linux shell commands. You need TigerGraph admin privilege to run most gadmin commands.
The GSQL> prefix indicates GSQL shell commands.

Define the Schema

Data Set

We will use the LDBC Social Network Benchmark (LDBC SNB) data set. This data set models a Twitter-like social forum. It comes with a data generator, which allows you to generate data at different scale factors. Scale factor 1 generates roughly 1GB of raw data, scale factor 10 generates roughly 10GB of raw data, etc.

Figure 1 shows the schema (from the LDBC SNB specification). It models the activities and relationships of social forum participants. For example, a forum Member can publish Posts on a Forum, and other Members of the Forum can make a Comment on the Post or on someone else's Comment. A Person's home location is a hierarchy (Continent>Country>City), and a person can be affiliated with a University or a Company. Tags can be used to classify a Forum and a Person's interests. Tags can belong to a TagClass. The relationships between entities are modeled as directed edges. For example, Person connects to Tag by the hasInterest edge. Forum connects to Person by two different edges, hasMember and hasModerator.

The LDBC SNB schema uses inheritance to model certain entity type relationships:

Message is the superclass of Post and Comment.
Place is the superclass of City, Country, and Continent.
Organization is the superclass of University and Company.

We do not use the superclasses in our graph model. When there is an edge type connecting an entity to a superclass, we instead create an edge type from the entity to each of the subclasses of the superclass. For example, Message has an isLocatedIn relationship to Country. Since Message has two subclasses, Post and Comment, we create two edge types to Country:

Post_IS_LOCATED_IN_Country
Comment_IS_LOCATED_IN_Country

Schema Naming Conventions

Vertex Type

For each entity in Figure 1 (the rectangular boxes), we create a vertex type with the entity's name.

Person is a person who participates in a forum.
Forum is a place where persons discuss topics.
City, Country, and Continent are geographic locations of other entities.
Company and University are organizations related to a person's affiliation.
Comment and Post are the interaction messages created by persons in a forum.
Tag is a topic or a concept.
TagClass is a class or a category. TagClass can form a hierarchy of tags.

Edge Type

For each relationship in Figure 1, we create an edge type whose name consists of the source entity name, the edge name (all capitalized), and the target entity name. The three parts are connected by underscores.

SourceEntityName_EDGENAME_TargetEntityName

For example,

Person_KNOWS_Person: Person is both the source and the target entity name, and Knows is the edge name.
Person_LIKES_Comment: Person is the source entity name, Comment is the target entity name, and Likes is the edge name.

When the edge name has two or more words, we separate words by an underscore as well. For example:

Tag_HAS_TYPE_TagClass: Tag is the source entity name, TagClass is the target entity name, and hasType is the edge name (which is written as HAS_TYPE).
Forum_HAS_MODERATOR_Person: Forum is the source entity name, Person is the target entity name, and hasModerator is the edge name (which is written as HAS_MODERATOR).
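
The naming rule can be expressed as a small shell function. This is only an illustrative sketch: edge_type_name is a hypothetical helper, not part of GSQL or of the tutorial's downloadable scripts, and it assumes the edge name is given in the camelCase form used by the LDBC SNB schema.

```shell
# Hypothetical helper: build an edge type name from LDBC SNB names.
# Usage: edge_type_name <SourceEntity> <camelCaseEdgeName> <TargetEntity>
edge_type_name() {
  # Insert "_" before each interior capital letter, then upper-case the result.
  edge=$(printf '%s' "$2" | sed 's/\([a-z0-9]\)\([A-Z]\)/\1_\2/g' | tr '[:lower:]' '[:upper:]')
  printf '%s_%s_%s\n' "$1" "$edge" "$3"
}

edge_type_name Person knows Person          # Person_KNOWS_Person
edge_type_name Tag hasType TagClass         # Tag_HAS_TYPE_TagClass
edge_type_name Forum hasModerator Person    # Forum_HAS_MODERATOR_Person

# A relationship on a superclass becomes one edge type per subclass.
for sub in Post Comment; do
  edge_type_name "$sub" isLocatedIn Country
done
```

The final loop reproduces the superclass rule described earlier: Message isLocatedIn Country yields Post_IS_LOCATED_IN_Country and Comment_IS_LOCATED_IN_Country.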

GSQL Schema DDL

The GSQL script below can be downloaded from this link.

GSQL script

//clear the current catalog. 
// It may take a while since it restarts the subsystem services. 
DROP ALL

//vertex types
CREATE VERTEX Comment (PRIMARY_ID id UINT, id UINT, creationDate DATETIME, locationIP STRING, browserUsed STRING, content STRING, length UINT)
CREATE VERTEX Post (PRIMARY_ID id UINT, id UINT, imageFile STRING, creationDate DATETIME, locationIP STRING, browserUsed STRING, lang STRING, content STRING, length UINT) 
CREATE VERTEX Company (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX University (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX City (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX Country (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX Continent (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX Forum (PRIMARY_ID id UINT, id UINT, title STRING, creationDate DATETIME)
CREATE VERTEX Person (PRIMARY_ID id UINT, id UINT, firstName STRING, lastName STRING, gender STRING, birthday DATETIME, creationDate DATETIME, locationIP STRING, browserUsed STRING, speaks set<STRING>, email set<STRING>)
CREATE VERTEX Tag (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)
CREATE VERTEX TagClass (PRIMARY_ID id UINT, id UINT, name STRING, url STRING)

//edge types
CREATE DIRECTED EDGE Forum_CONTAINER_OF_Post (FROM Forum, TO Post) WITH REVERSE_EDGE="Forum_CONTAINER_OF_Post_REVERSE"
CREATE DIRECTED EDGE Comment_HAS_CREATOR_Person (FROM Comment, TO Person) WITH REVERSE_EDGE="Comment_HAS_CREATOR_Person_REVERSE"
CREATE DIRECTED EDGE Post_HAS_CREATOR_Person (FROM Post, TO Person) WITH REVERSE_EDGE="Post_HAS_CREATOR_Person_REVERSE"
CREATE DIRECTED EDGE Person_HAS_INTEREST_Tag (FROM Person, TO Tag) WITH REVERSE_EDGE="Person_HAS_INTEREST_Tag_REVERSE"
CREATE DIRECTED EDGE Forum_HAS_MEMBER_Person (FROM Forum, TO Person, joinDate DATETIME) WITH REVERSE_EDGE="Forum_HAS_MEMBER_Person_REVERSE"
CREATE DIRECTED EDGE Forum_HAS_MODERATOR_Person (FROM Forum, TO Person) WITH REVERSE_EDGE="Forum_HAS_MODERATOR_Person_REVERSE"
CREATE DIRECTED EDGE Comment_HAS_TAG_Tag (FROM Comment, TO Tag) WITH REVERSE_EDGE="Comment_HAS_TAG_Tag_REVERSE"
CREATE DIRECTED EDGE Post_HAS_TAG_Tag (FROM Post, TO Tag) WITH REVERSE_EDGE="Post_HAS_TAG_Tag_REVERSE"
CREATE DIRECTED EDGE Forum_HAS_TAG_Tag (FROM Forum, TO Tag) WITH REVERSE_EDGE="Forum_HAS_TAG_Tag_REVERSE"
CREATE DIRECTED EDGE Tag_HAS_TYPE_TagClass (FROM Tag, TO TagClass) WITH REVERSE_EDGE="Tag_HAS_TYPE_TagClass_REVERSE"
CREATE DIRECTED EDGE Company_IS_LOCATED_IN_Country (FROM Company, TO Country) WITH REVERSE_EDGE="Company_IS_LOCATED_IN_Country_REVERSE"
CREATE DIRECTED EDGE Comment_IS_LOCATED_IN_Country (FROM Comment, TO Country) WITH REVERSE_EDGE="Comment_IS_LOCATED_IN_Country_REVERSE"
CREATE DIRECTED EDGE Post_IS_LOCATED_IN_Country (FROM Post, TO Country) WITH REVERSE_EDGE="Post_IS_LOCATED_IN_Country_REVERSE"
CREATE DIRECTED EDGE Person_IS_LOCATED_IN_City (FROM Person, TO City) WITH REVERSE_EDGE="Person_IS_LOCATED_IN_City_REVERSE"
CREATE DIRECTED EDGE University_IS_LOCATED_IN_City (FROM University, TO City) WITH REVERSE_EDGE="University_IS_LOCATED_IN_City_REVERSE"
CREATE DIRECTED EDGE City_IS_PART_OF_Country (FROM City, TO Country) WITH REVERSE_EDGE="City_IS_PART_OF_Country_REVERSE"
CREATE DIRECTED EDGE Country_IS_PART_OF_Continent (FROM Country, TO Continent) WITH REVERSE_EDGE="Country_IS_PART_OF_Continent_REVERSE"
CREATE DIRECTED EDGE TagClass_IS_SUBCLASS_OF_TagClass (FROM TagClass, TO TagClass) WITH REVERSE_EDGE="TagClass_IS_SUBCLASS_OF_TagClass_REVERSE"
CREATE DIRECTED EDGE Person_KNOWS_Person (FROM Person, TO Person, creationDate DATETIME) WITH REVERSE_EDGE="Person_KNOWS_Person_REVERSE"
CREATE DIRECTED EDGE Person_LIKES_Comment (FROM Person, TO Comment, creationDate DATETIME) WITH REVERSE_EDGE="Person_LIKES_Comment_REVERSE"
CREATE DIRECTED EDGE Person_LIKES_Post (FROM Person, TO Post, creationDate DATETIME) WITH REVERSE_EDGE="Person_LIKES_Post_REVERSE"
CREATE DIRECTED EDGE Comment_REPLY_OF_Comment (FROM Comment, TO Comment) WITH REVERSE_EDGE="Comment_REPLY_OF_Comment_REVERSE"
CREATE DIRECTED EDGE Comment_REPLY_OF_Post (FROM Comment, TO Post) WITH REVERSE_EDGE="Comment_REPLY_OF_Post_REVERSE"
CREATE DIRECTED EDGE Person_STUDY_AT_University (FROM Person, TO University, classYear INT) WITH REVERSE_EDGE="Person_STUDY_AT_University_REVERSE"
CREATE DIRECTED EDGE Person_WORK_AT_Company (FROM Person, TO Company, workFrom INT) WITH REVERSE_EDGE="Person_WORK_AT_Company_REVERSE"

//LDBC SNB graph schema 
CREATE GRAPH ldbc_snb (*)

Load Data

Define the Loading Job

Below, we use the GSQL loading language to define a loading job script, which encodes all the mappings from the source CSV files produced by the LDBC SNB benchmark data generator to our schema.

You can download the loading script below from here.

GSQL Loading Script

USE GRAPH ldbc_snb
CREATE LOADING JOB load_ldbc_snb FOR GRAPH ldbc_snb {
  // define vertex source files
  DEFINE FILENAME v_comment_file;
  DEFINE FILENAME v_post_file;
  DEFINE FILENAME v_organisation_file;
  DEFINE FILENAME v_place_file;
  DEFINE FILENAME v_forum_file;
  DEFINE FILENAME v_person_file;
  DEFINE FILENAME v_tag_file;
  DEFINE FILENAME v_tagclass_file;
  
  // define edge source files
  DEFINE FILENAME forum_containerOf_post_file;
  DEFINE FILENAME comment_hasCreator_person_file;
  DEFINE FILENAME post_hasCreator_person_file;
  DEFINE FILENAME person_hasInterest_tag_file;
  DEFINE FILENAME forum_hasMember_person_file;
  DEFINE FILENAME forum_hasModerator_person_file;
  DEFINE FILENAME comment_hasTag_tag_file;
  DEFINE FILENAME post_hasTag_tag_file;
  DEFINE FILENAME forum_hasTag_tag_file;
  DEFINE FILENAME tag_hasType_tagclass_file;
  DEFINE FILENAME organisation_isLocatedIn_place_file;
  DEFINE FILENAME comment_isLocatedIn_place_file;
  DEFINE FILENAME post_isLocatedIn_place_file;
  DEFINE FILENAME person_isLocatedIn_place_file;
  DEFINE FILENAME place_isPartOf_place_file;
  DEFINE FILENAME tagclass_isSubclassOf_tagclass_file;
  DEFINE FILENAME person_knows_person_file;
  DEFINE FILENAME person_likes_comment_file;
  DEFINE FILENAME person_likes_post_file;
  DEFINE FILENAME comment_replyOf_comment_file;
  DEFINE FILENAME comment_replyOf_post_file;
  DEFINE FILENAME person_studyAt_organisation_file;
  DEFINE FILENAME person_workAt_organisation_file;

  // load vertex
  LOAD v_comment_file 
    TO VERTEX Comment VALUES ($0, $0, $1, $2, $3, $4, $5) USING header="true", separator="|";
  LOAD v_post_file
    TO VERTEX Post VALUES ($0, $0, $1, $2, $3, $4, $5, $6, $7) USING header="true", separator="|";
  LOAD v_organisation_file
    TO VERTEX Company VALUES ($0, $0, $2, $3) WHERE $1=="company",
    TO VERTEX University VALUES ($0, $0, $2, $3) WHERE $1=="university" USING header="true", separator="|";
  LOAD v_place_file
    TO VERTEX City VALUES ($0, $0, $1, $2) WHERE $3=="city",
    TO VERTEX Country VALUES ($0, $0, $1, $2) WHERE $3=="country",
    TO VERTEX Continent VALUES ($0, $0, $1, $2) WHERE $3=="continent" USING header="true", separator="|";
  LOAD v_forum_file
    TO VERTEX Forum VALUES ($0, $0, $1, $2) USING header="true", separator="|";
  LOAD v_person_file
    TO VERTEX Person VALUES ($0, $0, $1, $2, $3, $4, $5, $6, $7, SPLIT($8,";"), SPLIT($9,";")) USING header="true", separator="|";
  LOAD v_tag_file
    TO VERTEX Tag VALUES ($0, $0, $1, $2) USING header="true", separator="|";
  LOAD v_tagclass_file
    TO VERTEX TagClass VALUES ($0, $0, $1, $2) USING header="true", separator="|";

  // load edge
  LOAD forum_containerOf_post_file
    TO EDGE Forum_CONTAINER_OF_Post VALUES ($0, $1) USING header="true", separator="|";
  LOAD comment_hasCreator_person_file
    TO EDGE Comment_HAS_CREATOR_Person VALUES ($0, $1) USING header="true", separator="|";
  LOAD post_hasCreator_person_file
    TO EDGE Post_HAS_CREATOR_Person VALUES ($0, $1) USING header="true", separator="|";
  LOAD person_hasInterest_tag_file
    TO EDGE Person_HAS_INTEREST_Tag VALUES ($0, $1) USING header="true", separator="|";
  LOAD forum_hasMember_person_file
    TO EDGE Forum_HAS_MEMBER_Person VALUES ($0, $1, $2) USING header="true", separator="|";
  LOAD forum_hasModerator_person_file
    TO EDGE Forum_HAS_MODERATOR_Person VALUES ($0, $1) USING header="true", separator="|";
  LOAD comment_hasTag_tag_file
    TO EDGE Comment_HAS_TAG_Tag VALUES ($0, $1) USING header="true", separator="|";
  LOAD post_hasTag_tag_file
    TO EDGE Post_HAS_TAG_Tag VALUES ($0, $1) USING header="true", separator="|";
  LOAD forum_hasTag_tag_file
    TO EDGE Forum_HAS_TAG_Tag VALUES ($0, $1) USING header="true", separator="|";
  LOAD tag_hasType_tagclass_file
    TO EDGE Tag_HAS_TYPE_TagClass VALUES ($0, $1) USING header="true", separator="|";
  LOAD organisation_isLocatedIn_place_file
    TO EDGE Company_IS_LOCATED_IN_Country VALUES ($0, $1) WHERE to_int($1) < 111, 
    TO EDGE University_IS_LOCATED_IN_City VALUES ($0, $1) WHERE to_int($1) > 110 USING header="true", separator="|";
  LOAD comment_isLocatedIn_place_file
    TO EDGE Comment_IS_LOCATED_IN_Country VALUES ($0, $1) USING header="true", separator="|";
  LOAD post_isLocatedIn_place_file
    TO EDGE Post_IS_LOCATED_IN_Country VALUES ($0, $1) USING header="true", separator="|";
  LOAD person_isLocatedIn_place_file
    TO EDGE Person_IS_LOCATED_IN_City VALUES ($0, $1) USING header="true", separator="|";
  LOAD place_isPartOf_place_file
    TO EDGE Country_IS_PART_OF_Continent VALUES ($0, $1) WHERE to_int($0) < 111,
    TO EDGE City_IS_PART_OF_Country VALUES ($0, $1) WHERE to_int($0) > 110 USING header="true", separator="|";
  LOAD tagclass_isSubclassOf_tagclass_file
    TO EDGE TagClass_IS_SUBCLASS_OF_TagClass VALUES ($0, $1) USING header="true", separator="|";
  LOAD person_knows_person_file
    TO EDGE Person_KNOWS_Person VALUES ($0, $1, $2) USING header="true", separator="|";
  LOAD person_likes_comment_file
    TO EDGE Person_LIKES_Comment VALUES ($0, $1, $2) USING header="true", separator="|";
  LOAD person_likes_post_file
    TO EDGE Person_LIKES_Post VALUES ($0, $1, $2) USING header="true", separator="|";
  LOAD comment_replyOf_comment_file
    TO EDGE Comment_REPLY_OF_Comment VALUES ($0, $1) USING header="true", separator="|";
  LOAD comment_replyOf_post_file
    TO EDGE Comment_REPLY_OF_Post VALUES ($0, $1) USING header="true", separator="|";
  LOAD person_studyAt_organisation_file
    TO EDGE Person_STUDY_AT_University VALUES ($0, $1, $2) USING header="true", separator="|";
  LOAD person_workAt_organisation_file
    TO EDGE Person_WORK_AT_Company VALUES ($0, $1, $2) USING header="true", separator="|";
}

Prepare The Raw Data

We have generated a scale-factor 1 data set (approximately 1GB). You can download it from https://s3-us-west-1.amazonaws.com/tigergraph-benchmark-dataset/LDBC/SF-1/ldbc_snb_data-sf1.tar.gz

Linux Bash

wget https://s3-us-west-1.amazonaws.com/tigergraph-benchmark-dataset/LDBC/SF-1/ldbc_snb_data-sf1.tar.gz

After downloading the raw file, run the tar command below to decompress it.

Linux Bash

tar -xzf  ldbc_snb_data-sf1.tar.gz

After decompressing the file, you will see a folder named "ldbc_snb_data". Inside it are two subfolders:

social_network
substitution_parameters

The raw data is under the social_network folder.
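
Before loading, it can be worth confirming that the archive extracted completely. The sketch below is a hedged illustration: missing_csvs is a hypothetical helper, the default path is an assumption about where you extracted the archive, and the 31 file names are those listed in the sample loading output later in this tutorial.

```shell
# Hypothetical helper: print the names of expected CSV files that are
# missing from the directory given as $1.
missing_csvs() {
  for name in \
      comment post organisation place forum person tag tagclass \
      forum_containerOf_post comment_hasCreator_person post_hasCreator_person \
      person_hasInterest_tag forum_hasMember_person forum_hasModerator_person \
      comment_hasTag_tag post_hasTag_tag forum_hasTag_tag tag_hasType_tagclass \
      organisation_isLocatedIn_place comment_isLocatedIn_place \
      post_isLocatedIn_place person_isLocatedIn_place place_isPartOf_place \
      tagclass_isSubclassOf_tagclass person_knows_person person_likes_comment \
      person_likes_post comment_replyOf_comment comment_replyOf_post \
      person_studyAt_organisation person_workAt_organisation
  do
    [ -f "$1/${name}_0_0.csv" ] || echo "${name}_0_0.csv"
  done
}

# Assumed location; adjust to wherever you extracted the archive.
data_dir="./ldbc_snb_data/social_network"
missing=$(missing_csvs "$data_dir")
if [ -z "$missing" ]; then
  echo "All 31 expected CSV files found in $data_dir"
else
  echo "Missing files in $data_dir:"
  echo "$missing"
fi
```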

Run The Loading Job

Download the setup_schema.gsql file, and run the script from the shell command line to set up the schema and the loading job.

Linux Bash

gsql setup_schema.gsql

Set the environment variable LDBC_SNB_DATA_DIR to point to the raw data folder that you extracted in the previous section. In the example below, the raw data is in /home/tigergraph/ldbc_snb_data/social_network. Note that the folder must be named social_network.

Linux Bash

#change the directory to your raw file directory
export LDBC_SNB_DATA_DIR=/home/tigergraph/ldbc_snb_data/social_network/

#start all TigerGraph services
gadmin start

#setup schema and loading job
gsql setup_schema.gsql

Download the loading job script and invoke it on the command line.

Linux Bash

./load_data.sh

Sample Loading Progress Output

tigergraph/gsql102$ ./load_data.sh
[Tip: Use "CTRL + C" to stop displaying the loading status update, then use "SHOW LOADING STATUS jobid" to track the loading progress again]
[Tip: Manage loading jobs with "ABORT/RESUME LOADING JOB jobid"]
Starting the following job, i.e.
  JobName: load_ldbc_snb, jobid: ldbc_snb.load_ldbc_snb.file.m1.1558053156447
  Loading log: '/mnt/data/tigergraph/logs/restpp/restpp_loader_logs/ldbc_snb/ldbc_snb.load_ldbc_snb.file.m1.1558053156447.log'

Job "ldbc_snb.load_ldbc_snb.file.m1.1558053156447" loading status
[FINISHED] m1 ( Finished: 31 / Total: 31 )
  [LOADED]
  +----------------------------------------------------------------------------------------------------------------------------------+
  |                                                                              FILENAME |   LOADED LINES |   AVG SPEED |   DURATION|
  |                       /mnt/data/download/ldbc_snb_data/social_network/comment_0_0.csv |        2052170 |    281 kl/s |     7.28 s|
  |     /mnt/data/download/ldbc_snb_data/social_network/comment_hasCreator_person_0_0.csv |        2052170 |    251 kl/s |     8.17 s|
  |            /mnt/data/download/ldbc_snb_data/social_network/comment_hasTag_tag_0_0.csv |        2698394 |    422 kl/s |     6.38 s|
  |     /mnt/data/download/ldbc_snb_data/social_network/comment_isLocatedIn_place_0_0.csv |        2052170 |    291 kl/s |     7.04 s|
  |       /mnt/data/download/ldbc_snb_data/social_network/comment_replyOf_comment_0_0.csv |        1040750 |    253 kl/s |     4.11 s|
  |          /mnt/data/download/ldbc_snb_data/social_network/comment_replyOf_post_0_0.csv |        1011421 |    248 kl/s |     4.07 s|
  |                         /mnt/data/download/ldbc_snb_data/social_network/forum_0_0.csv |          90493 |     87 kl/s |     1.03 s|
  |        /mnt/data/download/ldbc_snb_data/social_network/forum_containerOf_post_0_0.csv |        1003606 |    240 kl/s |     4.18 s|
  |        /mnt/data/download/ldbc_snb_data/social_network/forum_hasMember_person_0_0.csv |        1611870 |    431 kl/s |     3.74 s|
  |     /mnt/data/download/ldbc_snb_data/social_network/forum_hasModerator_person_0_0.csv |          90493 |     89 kl/s |     1.01 s|
  |              /mnt/data/download/ldbc_snb_data/social_network/forum_hasTag_tag_0_0.csv |         309767 |    297 kl/s |     1.04 s|
  |                  /mnt/data/download/ldbc_snb_data/social_network/organisation_0_0.csv |           7956 |      7 kl/s |     1.00 s|
  |/mnt/data/download/ldbc_snb_data/social_network/organisation_isLocatedIn_place_0_0.csv |           7956 |      7 kl/s |     1.00 s|
  |                        /mnt/data/download/ldbc_snb_data/social_network/person_0_0.csv |           9893 |      9 kl/s |     1.05 s|
  |        /mnt/data/download/ldbc_snb_data/social_network/person_hasInterest_tag_0_0.csv |         229167 |    223 kl/s |     1.03 s|
  |      /mnt/data/download/ldbc_snb_data/social_network/person_isLocatedIn_place_0_0.csv |           9893 |      9 kl/s |     1.00 s|
  |           /mnt/data/download/ldbc_snb_data/social_network/person_knows_person_0_0.csv |         180624 |    169 kl/s |     1.06 s|
  |          /mnt/data/download/ldbc_snb_data/social_network/person_likes_comment_0_0.csv |        1438419 |    449 kl/s |     3.20 s|
  |             /mnt/data/download/ldbc_snb_data/social_network/person_likes_post_0_0.csv |         751678 |    331 kl/s |     2.27 s|
  |   /mnt/data/download/ldbc_snb_data/social_network/person_studyAt_organisation_0_0.csv |           7950 |      7 kl/s |     1.00 s|
  |    /mnt/data/download/ldbc_snb_data/social_network/person_workAt_organisation_0_0.csv |          21655 |     21 kl/s |     1.00 s|
  |                         /mnt/data/download/ldbc_snb_data/social_network/place_0_0.csv |           1461 |      1 kl/s |     1.00 s|
  |          /mnt/data/download/ldbc_snb_data/social_network/place_isPartOf_place_0_0.csv |           1455 |      1 kl/s |     1.00 s|
  |                          /mnt/data/download/ldbc_snb_data/social_network/post_0_0.csv |        1003606 |    195 kl/s |     5.14 s|
  |        /mnt/data/download/ldbc_snb_data/social_network/post_hasCreator_person_0_0.csv |        1003606 |    320 kl/s |     3.13 s|
  |               /mnt/data/download/ldbc_snb_data/social_network/post_hasTag_tag_0_0.csv |         713259 |    341 kl/s |     2.09 s|
  |        /mnt/data/download/ldbc_snb_data/social_network/post_isLocatedIn_place_0_0.csv |        1003606 |    327 kl/s |     3.07 s|
  |                           /mnt/data/download/ldbc_snb_data/social_network/tag_0_0.csv |          16081 |     16 kl/s |     1.00 s|
  |          /mnt/data/download/ldbc_snb_data/social_network/tag_hasType_tagclass_0_0.csv |          16081 |     16 kl/s |     1.00 s|
  |                      /mnt/data/download/ldbc_snb_data/social_network/tagclass_0_0.csv |             72 |      71 l/s |     1.00 s|
  |/mnt/data/download/ldbc_snb_data/social_network/tagclass_isSubclassOf_tagclass_0_0.csv |             71 |      70 l/s |     1.00 s|
  +----------------------------------------------------------------------------------------------------------------------------------+
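
The downloadable load_data.sh is not reproduced in this tutorial. The sketch below is not that script; it is a hedged illustration of how the DEFINE FILENAME variables of load_ldbc_snb could be bound to the CSV files listed above. build_using_clause is a hypothetical helper, and the mapping assumes each variable corresponds to a file named <name>_0_0.csv, as the sample output suggests.

```shell
# Hypothetical helper: print the USING clause that binds every
# DEFINE FILENAME variable to a CSV file in the directory given as $1.
build_using_clause() {
  clause=""
  for var in \
      v_comment_file v_post_file v_organisation_file v_place_file \
      v_forum_file v_person_file v_tag_file v_tagclass_file \
      forum_containerOf_post_file comment_hasCreator_person_file \
      post_hasCreator_person_file person_hasInterest_tag_file \
      forum_hasMember_person_file forum_hasModerator_person_file \
      comment_hasTag_tag_file post_hasTag_tag_file forum_hasTag_tag_file \
      tag_hasType_tagclass_file organisation_isLocatedIn_place_file \
      comment_isLocatedIn_place_file post_isLocatedIn_place_file \
      person_isLocatedIn_place_file place_isPartOf_place_file \
      tagclass_isSubclassOf_tagclass_file person_knows_person_file \
      person_likes_comment_file person_likes_post_file \
      comment_replyOf_comment_file comment_replyOf_post_file \
      person_studyAt_organisation_file person_workAt_organisation_file
  do
    base=${var#v_}        # vertex variables carry a v_ prefix; drop it
    base=${base%_file}    # drop the _file suffix
    clause="${clause}${clause:+, }${var}=\"$1/${base}_0_0.csv\""
  done
  printf '%s\n' "$clause"
}

data_dir="${LDBC_SNB_DATA_DIR:-/home/tigergraph/ldbc_snb_data/social_network}"
data_dir=${data_dir%/}    # avoid a double slash if the path ends with /
job="RUN LOADING JOB load_ldbc_snb USING $(build_using_clause "$data_dir")"

# Run the job only when the gsql client is on the PATH.
if command -v gsql >/dev/null 2>&1; then
  gsql -g ldbc_snb "$job"
else
  echo "gsql not found; the command would be:"
  echo "$job"
fi
```

Deriving the file names from the variable names keeps the 31 bindings in one list, so a typo shows up as a missing file rather than a silently skipped load step.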

After loading, you can check the graph's size using one of the options of the administrator tool, gadmin. From a Linux shell, enter the command

gadmin status graph -v

Linux shell

gadmin status graph -v
verbose is ON
=== graph ===
[m1     ][GRAPH][MSG ] Graph was loaded (/mnt/data/tigergraph/gstore/0/part/): partition size is 437.20MiB, IDS size is 102.30MiB, SchemaVersion: 0, VertexCount: 3181724, NumOfSkippedVertices: 0, NumOfDeletedVertices: 0, EdgeCount: 34512076
[m1     ][GRAPH][INIT] True
[INFO   ][GRAPH][MSG ] Above vertex and edge counts are for internal use which show approximate topology size of the local graph partition. Use DML to get the correct graph topology information
[SUMMARY][GRAPH] graph is ready

You should see VertexCount: 3,181,724 and EdgeCount 34,512,076.
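
If you want to script this check, the counts can be pulled out of the status line with sed. This is a minimal sketch under stated assumptions: graph_count is a hypothetical helper, and the output format is assumed to match the sample above. As the gadmin output itself notes, these counts are approximate internal figures, so treat a mismatch as a hint to investigate rather than as proof of a failed load.

```shell
# Hypothetical helper: read `gadmin status graph -v` output on stdin and
# print the number that follows the key given as $1.
graph_count() {
  sed -n "s/.*$1: \([0-9][0-9]*\).*/\1/p" | head -n 1
}

# Sample line copied from the output above. In practice, pipe in the real
# output instead: gadmin status graph -v | graph_count VertexCount
sample='[m1     ][GRAPH][MSG ] Graph was loaded (/mnt/data/tigergraph/gstore/0/part/): partition size is 437.20MiB, IDS size is 102.30MiB, SchemaVersion: 0, VertexCount: 3181724, NumOfSkippedVertices: 0, NumOfDeletedVertices: 0, EdgeCount: 34512076'

vertices=$(printf '%s\n' "$sample" | graph_count VertexCount)
edges=$(printf '%s\n' "$sample" | graph_count EdgeCount)
if [ "$vertices" = "3181724" ] && [ "$edges" = "34512076" ]; then
  echo "Graph size matches: $vertices vertices, $edges edges"
else
  echo "Unexpected graph size: $vertices vertices, $edges edges"
fi
```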

Basic Pattern Concepts

Introduction

Pattern matching by nature is declarative. It enables users to focus on specifying what they want from a query without worrying about the underlying query processing.

A pattern usually appears in the FROM clause, the most fundamental part of the query structure. The pattern specifies sets of vertex types and how they are connected by edge types. A pattern can be refined further with conditions in the WHERE clause. In this tutorial, we'll focus on the linear pattern.

Currently, pattern matching may only be used in read-only queries.

1-Hop Pattern

The easiest way to understand patterns is to start with a simple 1-Hop pattern. Even a single hop has several options. After we've tackled single hops, then we'll see how to add repetition to make variable length patterns and how to connect single hops to form bigger patterns.

In classic GSQL queries, described in GSQL 101, we used the punctuation -( )-> in the FROM clause to indicate a 1-hop query, where the arrow specifies the vertex flow from left to right, and ( ) encloses the edge types.

Person:p -(LIKES:e)-> Message:m          /* Classic GSQL example */

In pattern matching, we use the punctuation -( )- to denote a 1-hop pattern, where the edge types are enclosed in the parentheses ( ) and the hyphens - symbolize connection without specifying direction. Instead, directionality is explicitly stated for each edge type.

For an undirected edge E, no added decoration: E
For a directed edge E from left to right, use a suffix: E>
For a directed edge E from right to left, use a prefix: <E

For example, in the LDBC SNB schema, there are two directed relationships between Person and Message: person LIKES message, and message HAS_CREATOR person. Despite the fact that these relationships are in opposite directions, we can include both of them in the same pattern very concisely:

Person:p -((LIKES>|<HAS_CREATOR):e)- Message:m         /* Pattern example */

Edge Type Wildcards

The underscore _ is a wildcard meaning any edge type. Arrowheads are still used to indicate direction, e.g., _> or <_ or _. Empty parentheses () mean any edge, directed or undirected.

Examples of 1-Hop Patterns

FROM X:x - (E1:e1) - Y:y
- E1 is an undirected edge. x and y bind to the end points of E1. e1 is the alias of E1.
FROM X:x - (E2>:e2) - Y:y
- Right directed edge, x binds to the source of E2, y binds to the target of E2.
FROM X:x - (<E3:e3) - Y:y
- Left directed edge, y binds to the source of E3, x binds to the target of E3.
FROM X:x - (_:e) - Y:y
- Any undirected edge between a member of X and a member of Y.
FROM X:x - (_>:e) - Y:y
- Any right directed edge with source in X and target in Y.
FROM X:x - (<_:e) - Y:y
- Any left directed edge with source in Y and target in X.
FROM X:x - ((<_|_):e) - Y:y
- Any left directed or any undirected. "|" means OR, and parentheses enclose the group of edge descriptors. e is the alias for the edge pattern (<_|_).
FROM X:x - ((E1|E2>|<E3):e) - Y:y
- Any one of the three edge patterns.
FROM X:x - () - Y:y
- Any edge (directed or undirected).
- Same as (<_|_>|_).

How To Enter Pattern Match Syntax Mode

To use the pattern match syntax, you need to either set a session parameter or specify it in the query. There are currently two syntax versions for queries:

"v1" is the classic syntax, traversing one hop per SELECT statement. This is the default mode.
"v2" enhances the v1 syntax with pattern matching.

syntax_version Session Parameter

You can use the SET command to assign a value to the syntax_version session parameter: v1 for classic syntax; v2 for pattern matching. If the parameter is never set, the classic v1 syntax is enabled. Once set, the selection remains valid for the duration of the GSQL client session, or until it is changed with another SET command.

GSQL: Set Syntax Version By A Session Parameter

SET syntax_version="v2"

Query-Level SYNTAX option

You can also select the syntax by using the new SYNTAX option in the CREATE QUERY statement: v1 for classic syntax (default); v2 for pattern matching. The Query-Level SYNTAX option overrides the syntax_version session parameter.

CHANGE ADVISORY

The punctuation used with the SYNTAX keyword was streamlined, from

CREATE QUERY <query_name><parameters> FOR GRAPH <graph_name> SYNTAX ("v2") # original version, TigerGraph 2.4.0

to

CREATE QUERY <query_name><parameters> FOR GRAPH <graph_name> SYNTAX v2 # final version, since TigerGraph 2.4.1

GSQL: Set Syntax Version By Specifying The Version After Graph Name In The Query

CREATE QUERY test10 (string str ) FOR GRAPH ldbc_snb SYNTAX v2
{ 
  ...
}

Running Anonymous Queries Without Installing

In this tutorial, we will use the new Interpreted Mode for GSQL, also introduced in TigerGraph 2.4. Interpreted mode lets us skip the INSTALL step and run a query as soon as we create it, for a more interactive experience. These one-step interpreted queries are unnamed (anonymous) and parameterless, just like SQL.

To send an anonymous query to the interpret engine, replace the keyword CREATE with INTERPRET. Remember, no parameters:

INTERPRET QUERY () FOR GRAPH graph_name SYNTAX v2 { <query body> }

Recommendation: Increase the query timeout threshold.

Interpreted queries may run slower than installed queries, so we recommend increasing the query timeout threshold:

GSQL: Set Longer Timeout

# set query timeout to 1 minute
# 1 unit is 1 millisecond
SET query_timeout = 60000
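
Putting these pieces together, the mixed-direction pattern from the 1-Hop Pattern section can be written against the actual LDBC SNB edge type names and run as an anonymous query. The sketch below is a hedged illustration, not an official example: the file name mixed_direction.gsql is an arbitrary choice, the query-level SYNTAX v2 option is used instead of the session parameter, and the query itself (posts that one person either likes or created) is made up for this illustration.

```shell
# Write the query to a file; the file name is an arbitrary choice.
cat > mixed_direction.gsql <<'EOF'
USE GRAPH ldbc_snb
SET query_timeout = 60000

INTERPRET QUERY () FOR GRAPH ldbc_snb SYNTAX v2 {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern: posts that the person likes OR has created.
   posts = SELECT m
           FROM Seed:p - ((Person_LIKES_Post>|<Post_HAS_CREATOR_Person):e) - Post:m
           WHERE p.firstName == "Viktor" AND p.lastName == "Akhiezer"
           ORDER BY m.creationDate DESC
           LIMIT 3;

   PRINT posts[posts.creationDate];
}
EOF

# Run it only when the gsql client is on the PATH.
if command -v gsql >/dev/null 2>&1; then
  gsql mixed_direction.gsql
else
  echo "gsql not found; query saved to mixed_direction.gsql"
fi
```

Person_LIKES_Post points from Person to Post, so it takes the > suffix; Post_HAS_CREATOR_Person points from Post to Person, so with Person on the left it takes the < prefix.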

Examples of 1-Hop Fixed Length Query

Example 1. Find persons who know the person named "Viktor Akhiezer" and return the top 3 oldest such persons.

Example 1. Left Directed Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern. 
   friends = SELECT p
             FROM Seed:s - (<Person_KNOWS_Person:e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 3;
             
    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example1.gsql and invoke this script file in Linux.

Linux Bash

gsql example1.gsql

Output of Example 1

{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "developer",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "friends.birthday": "1980-05-13 00:00:00",
        "friends.lastName": "Cajes",
        "friends.firstName": "Gregorio"
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "friends.birthday": "1980-04-24 00:00:00",
        "friends.lastName": "Glosca",
        "friends.firstName": "Abdul-Malik"
      },
      "v_type": "Person"
    },
    {
      "v_id": "6597069776731",
      "attributes": {
        "friends.birthday": "1981-02-25 00:00:00",
        "friends.lastName": "Carlsson",
        "friends.firstName": "Sven"
      },
      "v_type": "Person"
    }
  ]}]
}

Example 2. Do the same as Example 1, but use a right-directed edge pattern.

Example 2. Right Directed Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT s
             FROM Seed:s - (Person_KNOWS_Person>:e) - Person:p
             WHERE p.firstName == "Viktor" AND p.lastName == "Akhiezer"
             ORDER BY s.birthday ASC
             LIMIT 3;

    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in Linux.

Linux Bash

gsql example2.gsql

The output should be the same as example1's output.
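
To compare the two results without reading the JSON by eye, you can extract and sort the vertex IDs. This is a minimal sketch: result_ids is a hypothetical helper, and it assumes the JSON layout shown in the output of Example 1. The IDs are sorted before comparison because the printed order of a vertex set need not follow the ORDER BY clause (the Example 1 output above is not listed in birthday order).

```shell
# Hypothetical helper: print the sorted v_id values found in a GSQL JSON
# result read from stdin.
result_ids() {
  grep -o '"v_id": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/' | sort
}

# Typical use (requires a running TigerGraph instance):
#   gsql example1.gsql | result_ids > ids1.txt
#   gsql example2.gsql | result_ids > ids2.txt
#   cmp -s ids1.txt ids2.txt && echo "same result set"

# Self-contained demo using the three v_id lines from the Example 1 output.
printf '%s\n' \
  '"v_id": "10995116279461",' \
  '"v_id": "4398046517846",' \
  '"v_id": "6597069776731",' | result_ids
```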

Example 3. Find Viktor Akhiezer's total number of comments, total number of posts, and total number of persons he knows. A Person can reach Comments, Posts and other Persons via a directed edge.

Example 3. Right Directed Any Edge Pattern.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   Result = SELECT s
            FROM Seed:s - (_>:e) - :tgt
            WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
            ACCUM CASE WHEN tgt.type == "Comment" THEN
                           s.@commentCnt += 1
                       WHEN tgt.type == "Post" THEN
                           s.@postCnt += 1
                       WHEN tgt.type == "Person" THEN
                           s.@personCnt += 1
                   END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in Linux.

Linux Bash

gsql example3.gsql

Output of Example 3.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"Result": [{
    "v_id": "28587302323577",
    "attributes": {
      "Result.@personCnt": 25,
      "Result.@commentCnt": 152,
      "Result.@postCnt": 96
    },
    "v_type": "Person"
  }]}]
}

Example 4. Do the same as Example 3, but use a left-directed edge pattern.

Note below (line 10) that the Seed is now {Person.*, Comment.*, Post.* }, the three types of entities that are targets of edges from a Person.

In the current version, the vertex set on the left side of the pattern must be defined in a previous statement (e.g., a seed statement), the same requirement as in v1 syntax FROM clauses. For instance, in the example below, the current version of pattern matching would not permit FROM _:s - (<_:e) - Person:tgt

Example 4. Left Directed Any Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons, comments, and posts
   Seed = {Person.*, Comment.*, Post.*};
   #1-hop pattern.
   Result = SELECT tgt
            FROM Seed:s - (<_:e) - Person:tgt
            WHERE tgt.firstName == "Viktor" AND tgt.lastName == "Akhiezer"
            ACCUM  CASE WHEN s.type == "Comment" THEN
                          tgt.@commentCnt += 1
                         WHEN s.type == "Post" THEN
                          tgt.@postCnt += 1
                        WHEN s.type == "Person" THEN
                          tgt.@personCnt += 1
                    END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example4.gsql, and invoke this script file from the Linux command line. The output should be the same as in Example 3.

Example 5. Find the two oldest persons who either know "Viktor Akhiezer" or are known by "Viktor Akhiezer". KNOWS is a directed relationship, so we need to include both directions in the pattern.

Example 5. Disjunctive 1-hop edge pattern.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT p
             FROM Seed:s - ((<Person_KNOWS_Person|Person_KNOWS_Person>):e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 2;

   PRINT friends;
}

You can copy the above GSQL script to a file named example5.gsql, and invoke this script file in Linux:

Linux Bash

gsql example5.gsql

Output of Example 5.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "birthday": "1980-05-13 00:00:00",
        "firstName": "Gregorio",
        "lastName": "Cajes",
        "gender": "male",
        "speaks": [
          "en",
          "tl"
        ],
        "browserUsed": "Firefox",
        "locationIP": "110.55.251.62",
        "id": 10995116279461,
        "creationDate": "2010-12-16 18:12:57",
        "email": ["Gregorio10995116279461@gmail.com"],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "birthday": "1980-04-24 00:00:00",
        "firstName": "Abdul-Malik",
        "lastName": "Glosca",
        "gender": "male",
        "speaks": [
          "ar",
          "en"
        ],
        "browserUsed": "Chrome",
        "locationIP": "109.200.168.137",
        "id": 4398046517846,
        "creationDate": "2010-05-21 00:07:05",
        "email": [
          "Abdul-Malik4398046517846@gmail.com",
          "Abdul-Malik4398046517846@gmx.com",
          "Abdul-Malik4398046517846@land.ru"
        ],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    }
  ]}]
}

Example 6. Find the total comments or posts created by "Viktor Akhiezer". Again, we include two types of edges, but in this case, we count them together.

Example 6. Disjunctive 1-hop edge pattern.

USE GRAPH ldbc_snb
#pattern match syntax version is v2
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @@cnt = 0;
   Seed = {Person.*};

   friends = SELECT t
             FROM Seed:s-((<Comment_HAS_CREATOR_Person|<Post_HAS_CREATOR_Person):e1)-(Comment|Post):t
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ACCUM  @@cnt += 1 ;

  PRINT @@cnt;
}

You can copy the above GSQL script to a file named example6.gsql, and invoke this script file in Linux:

Linux Bash

gsql example6.gsql

Output of Example 6.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 89}]
}

Repeating a 1-Hop Pattern

A common pattern is the two-step "Friend of a Friend". Or, how many entities might receive a message if it is passed up to three times? Do you have any known chain of connections to a celebrity?

GSQL pattern matching makes it easy to express such variable-length patterns which repeat a single-hop. Everything else stays the same as introduced in the previous section, except we append an asterisk (or Kleene star for you regular expressionists) and an optional min..max range to an edge pattern.

(E*) means edge type E repeats any number of times (including zero!)
(E*1..3) means edge type E occurs one to three times.

Below are more illustrative examples:

1-hop star pattern — repetition of an edge pattern 0 or more times
1. FROM X:x - (E1*) - Y:y
2. FROM X:x - (E2>*) - Y:y
3. FROM X:x - (<E3*) - Y:y
4. FROM X:x - (_*) - Y:y
  - Any undirected edge can be chosen at each repetition.
5. FROM X:x - (_>*) - Y:y
  - Any right-directed edge can be chosen at each repetition.
6. FROM X:x - (<_*) - Y:y
  - Any left-directed edge can be chosen at each repetition.
7. FROM X:x - ((E1|E2>|<E3)*) - Y:y
  - Either E1, E2> or <E3 can be chosen at each repetition.
1-hop star pattern with bounds
1. FROM X:x - (E1*2..) - Y:y
  - Lower bounds only. There is a chain of at least 2 E1 edges.
2. FROM X:x - (E2>*..3) - Y:y
  - Upper bounds only. There is a chain of between 0 and 3 E2 edges.
3. FROM X:x - (<E3*3..5) - Y:y
  - Both Lower and Upper bounds. There is a chain of 3 to 5 E3 edges.
4. FROM X:x - ((E1|E2>|<E3)*3) - Y:y
  - Exact bound. There is a chain of exactly 3 edges, where each edge is either E1, E2>, or <E3.
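The bound notation above can be summarized as a (minimum, maximum) pair of hop counts. The following Python sketch is only a model of that notation, not GSQL and not part of TigerGraph; the helper name parse_bounds is hypothetical, and treating a bare ".." with no numbers as an error is an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper that mirrors the repetition suffixes listed above:
#   *      -> 0 or more      *2..   -> at least 2
#   *..3   -> 0 to 3         *3..5  -> 3 to 5       *3 -> exactly 3
_BOUND = re.compile(r"\*(?:(\d+)|(\d*)\.\.(\d*))?")

def parse_bounds(suffix: str) -> Tuple[int, Optional[int]]:
    """Return (min_hops, max_hops); max_hops is None when unbounded."""
    m = _BOUND.fullmatch(suffix)
    if m is None:
        raise ValueError(f"not a repetition suffix: {suffix!r}")
    exact, lo, hi = m.groups()
    if exact is not None:            # "*3": exact bound
        return int(exact), int(exact)
    if lo is None and hi is None:    # bare "*": no bounds at all
        return 0, None
    if lo == "" and hi == "":        # "*..": assumed invalid in this sketch
        raise ValueError(f"empty range: {suffix!r}")
    return (int(lo) if lo else 0, int(hi) if hi else None)

for s in ["*", "*2..", "*..3", "*3..5", "*3"]:
    print(s, parse_bounds(s))
```

A missing lower bound defaults to 0, which is why (E2>*..3) also matches the zero-hop case where both ends of the pattern are the same vertex.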

Remarks

No alias allowed for edge with Kleene star. An edge alias may not be used when a Kleene star is used. The reason is that when there are a variable number of edges, we cannot associate or bind the alias to a specific edge in the pattern.
Shortest path semantics. When an edge is repeated with a Kleene star, only the shortest matching occurrences are selected. See the example below:

In Figure 2, for Pattern 1 - (E>*) - 4, any of the following paths reach 4 from 1.

1->2->3->4
1->2->3->5->6->2->3->4
any path that goes through the cycle 2->3->5->6->2 two or more times and jumps out at 3.

The first path is shorter than the rest; it is considered the only match.
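To make the shortest-path rule concrete, here is a small Python sketch (a model of the semantics, not GSQL). The edge list is an assumption reconstructed from the paths listed above, since the figure itself is not reproduced in the text; shortest_hops is a hypothetical helper name.

```python
from collections import deque

# Directed E> edges of Figure 2, reconstructed from the paths listed above
# (1->2->3->4 and the cycle 2->3->5->6->2). This edge list is an assumption.
EDGES = {1: [2], 2: [3], 3: [4, 5], 5: [6], 6: [2]}

def shortest_hops(source, target):
    """Breadth-first search: return the minimum hop count, or None."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        vertex, hops = queue.popleft()
        if vertex == target:
            return hops
        for nxt in EDGES.get(vertex, []):
            if nxt not in seen:      # each vertex is reached first by a shortest path
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

print(shortest_hops(1, 4))  # 3, i.e. the path 1->2->3->4
```

Because breadth-first search never revisits a vertex, the longer walks around the cycle are never explored, which is the practical benefit of restricting matches to shortest paths.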

Examples of Variable Hop Queries

In this tutorial, we will use the new Interpreted Mode for GSQL, introduced in TigerGraph 2.4. Interpreted mode lets us skip the INSTALL step, and even run a query as soon as we create it, to offer a more interactive experience. These one-step interpreted queries are unnamed (anonymous) and parameterless, just like SQL.

Example 1. Find the direct or indirect superclass (including the self class) of the TagClass whose name is "TennisPlayer".

Example 1. Directed Edge Pattern Unconstrained Repetition

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*)-TagClass:t
               WHERE s.name == "TennisPlayer";

    PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example1.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example1.gsql

Note below that the starting vertex s, whose name is TennisPlayer, is also a match, using a path with zero hops.

Output of Example 1

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "0",
      "attributes": {
        "name": "Thing",
        "id": 0,
        "url": "http://www.w3.org/2002/07/owl#Thing"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "59",
      "attributes": {
        "name": "TennisPlayer",
        "id": 59,
        "url": "http://dbpedia.org/ontology/TennisPlayer"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "239",
      "attributes": {
        "name": "Agent",
        "id": 239,
        "url": "http://dbpedia.org/ontology/Agent"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 2. Find the immediate superclass of the TagClass whose name is "TennisPlayer". (This is equivalent to a 1-hop non-repeating pattern.)

Example 2. Exactly 1 Repetition of A Directed Edge

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1)-TagClass:t
               WHERE s.name == "TennisPlayer";

    PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example2.gsql

Output of Example 2

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [{
    "v_id": "149",
    "attributes": {
      "name": "Athlete",
      "id": 149,
      "url": "http://dbpedia.org/ontology/Athlete"
    },
    "v_type": "TagClass"
  }]}]
}

Example 3. Find the direct and indirect superclasses that are 1 to 2 hops away from the TagClass whose name is "TennisPlayer".

Example 3. 1 to 2 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1..2)-TagClass:t
               WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example3.gsql

Output of Example 3

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 4. Find the superclasses within 2 hops of the TagClass whose name is "TennisPlayer".

Example 4. Up-to 2 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*..2)-TagClass:t
               WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example4.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example4.gsql

Output of Example 4

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "59",
      "attributes": {
        "name": "TennisPlayer",
        "id": 59,
        "url": "http://dbpedia.org/ontology/TennisPlayer"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 5. Find the superclasses at least one hop from the TagClass whose name is "TennisPlayer".

Example 5. At Least 1 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =SELECT t
             FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1..)-TagClass:t
             WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example5.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example5.gsql

Output of Example 5

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "0",
      "attributes": {
        "name": "Thing",
        "id": 0,
        "url": "http://www.w3.org/2002/07/owl#Thing"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "239",
      "attributes": {
        "name": "Agent",
        "id": 239,
        "url": "http://dbpedia.org/ontology/Agent"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 6. Find the 3 most recent comments that are liked or created by Viktor Akhiezer, and the total number of comments related to (created or liked by) Viktor Akhiezer.

Example 6. Disjunctive 1-Repetition Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
  SumAccum<int> @@commentCnt = 0;
   
  #start with all persons.
  Seed = {Person.*};

  # find top 3 latest comments that are liked or created by Viktor Akhiezer
  # and the total number of comments related to Viktor Akhiezer
  Top3Comments = SELECT p
    FROM Seed:s - ((<Comment_HAS_CREATOR_Person|Person_LIKES_Comment>)*1) - Comment:p
    WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
    ACCUM @@commentCnt += 1
    ORDER BY p.creationDate DESC
    LIMIT 3;

  PRINT Top3Comments;
  # total number of comments related to Viktor Akhiezer
  PRINT  @@commentCnt;
}

You can copy the above GSQL script to a file named example6.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example6.gsql

Output of Example 6

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [
    {"Top3Comments": [
      {
        "v_id": "2061584720640",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 4,
          "locationIP": "194.62.64.117",
          "id": 2061584720640,
          "creationDate": "2012-09-06 06:46:31",
          "content": "fine"
        },
        "v_type": "Comment"
      },
      {
        "v_id": "2061586872389",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 90,
          "locationIP": "31.216.177.175",
          "id": 2061586872389,
          "creationDate": "2012-08-28 14:54:46",
          "content": "About Hector Berlioz, his compositions Symphonie fantastique and GraAbout Who Knew, the gu"
        },
        "v_type": "Comment"
      },
      {
        "v_id": "2061590804929",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 83,
          "locationIP": "194.62.64.117",
          "id": 2061590804929,
          "creationDate": "2012-09-04 16:16:56",
          "content": "About Muttiah Muralitharan, mit by nine degrees, five degrees being thAbout Steve M"
        },
        "v_type": "Comment"
      }
    ]},
    {"@@commentCnt": 152}
  ]
}

Multiple Hop Patterns

Multiple Hop Pattern Shortest Path Semantics

Repeating the same hop is useful sometimes, but the real power of pattern matching comes from expressing multi-hop patterns, with specific characteristics for each hop. For example, the well-known product recommendation phrase "People who bought this product also bought this other product", is expressed by the following 2-hop pattern:

FROM This_Product:p -(<Bought:b1)- Customer:c -(Bought>:b2)- Product:p2
WHERE p2 != p

As you see, a 2-hop pattern is a simple concatenation and merging of two 1-hop patterns where the two patterns share a common endpoint. Below, Y:y is the connecting end point.

2-hop pattern

FROM X:x - (E1:e1) - Y:y - (E2>:e2) - Z:z

Similarly, a 3-hop pattern concatenates three 1-hop patterns in sequence, each pair of adjacent hops sharing one end point. Below, Y:y and Z:z are the connecting end points.

3-hop pattern

FROM X:x - (E2>:e2) -Y:y - (<E3:e3) - Z:z - (E4:e4) - U:u

In general, we can connect n 1-hop patterns into an n-hop pattern. The database will search the graph topology to find subgraphs that match this n-hop pattern.

Unnamed Intermediate Vertex Set

A multi-hop pattern has two endpoint vertex sets and one or more intermediate vertex sets. If the query does not need to express any conditions for an intermediate vertex set, then the vertex set can be omitted and the two surrounding edge sets can be joined with a simple "." For example, in the 2-hop pattern example above, if we did not need to specify that the intermediate vertex type is Y, nor need to refer to that vertex set in any of the query's other clauses (such as WHERE or ACCUM), then the pattern can be reduced as follows:

FROM X:x - (E1:e1.E2>:e2) - Z:z

Shortest paths Only for Variable Length Patterns

If a pattern has a Kleene star to repeat an edge, GSQL pattern matching selects only the shortest paths which match the pattern. If we did not apply this restriction, computer science theory tells us that the computation time could be unbounded or extreme (NP, to be technical). If we instead matched ALL paths regardless of length when a Kleene star is used without an upper bound, there could be an infinite number of matches, if there are loops in the graph. Even without loops or with an upper bound, the number of paths to check grows exponentially with the number of hops.

For the pattern 1 - (_*) - 5 in Figure 3 above, you can see the following:

There are TWO shortest paths: 1-2-3-4-5 and 1-2-6-4-5
- These have 4 hops, so we can stop searching after 4 hops. This makes the task tractable.
If we search for ALL paths which do not repeat any vertices:
- There are THREE non-repeated-vertex paths: 1-2-3-4-5, 1-2-6-4-5, and 1-2-9-10-11-12-4-5
- The actual number of matches is small, but the number of paths to consider can grow exponentially.
If we search for ALL paths which do not repeat any edges:
- There are FOUR non-repeated-edge paths: 1-2-3-4-5, 1-2-6-4-5, 1-2-9-10-11-12-4-5, and 1-2-3-7-8-3-4-5
- The actual number of matches is small, but the number of paths to consider can grow exponentially.
If we search for ALL paths with no restrictions:
- There are infinite matches, because we can go around the 3-7-8-3 cycle any number of times.
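The counts above can be reproduced with a short enumeration. The Python sketch below is a model only, not GSQL. The directed edge list is an assumption reconstructed from the listed paths (the figure is not reproduced in the text), and enumerate_paths is a hypothetical helper.

```python
# Assumed directed edge list for Figure 3, reconstructed from the paths above.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (6, 4),
         (2, 9), (9, 10), (10, 11), (11, 12), (12, 4),
         (3, 7), (7, 8), (8, 3)]

def enumerate_paths(source, target, no_repeat):
    """Depth-first enumeration of paths that repeat no "vertex" or no "edge"."""
    found = []

    def walk(path, used_edges):
        here = path[-1]
        if here == target:
            found.append(tuple(path))
            return
        for edge in EDGES:
            if edge[0] != here:
                continue
            if no_repeat == "vertex" and edge[1] in path:
                continue
            if no_repeat == "edge" and edge in used_edges:
                continue
            walk(path + [edge[1]], used_edges | {edge})

    walk([source], frozenset())
    return found

simple = enumerate_paths(1, 5, "vertex")   # non-repeated-vertex paths
trails = enumerate_paths(1, 5, "edge")     # non-repeated-edge paths
fewest = min(len(p) for p in simple)
shortest = [p for p in simple if len(p) == fewest]

print(len(shortest), len(simple), len(trails))  # 2 3 4
```

The enumeration has to follow every branch before it can report a count, whereas the shortest-path search can stop once the 4-hop level is finished.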

Additional Details about Pattern Matching

Each vertex set or edge set in a pattern (except edges with Kleene stars) can have an alias variable associated with it. When the query runs and finds matches, it associates, or binds, each alias to the matching vertices or edges in the graph.

TigerGraph 2.4 has certain restrictions on how accumulators and aliases can be used. Some or all of these restrictions will be lifted in future releases. We elaborate on them below.

SELECT Clause

The SELECT clause specifies the output vertex set of a SELECT statement. For a multiple-hop pattern, we can only select one of the two endpoints of the pattern. None of the intermediate aliases can be selected. The example below shows the two possible choices for the given pattern:

SELECT Clause Can Select End Points Only

#select starting end point x
SELECT x
FROM X:x-(E2>:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u;

#select ending end point u
SELECT u
FROM X:x-(E2>:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u;

FROM Clause

For a multiple-hop pattern, if you don't need to refer to the intermediate vertex points, you can just use "." to connect the edge patterns, giving a more succinct representation. For example, below we remove y and z, and connect E2>, <E3 and E4 using the period symbol. Note that you cannot have an alias for a multi-hop sequence like E2>.<E3.E4.

Omitting

#select starting end point x
SELECT x
FROM X:x-(E2>:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u;

#if we don't need to access y, z, we can write
SELECT u
FROM X:x-(E2>.<E3.E4)-U:u;

WHERE Clause

In a multiple-hop pattern, the WHERE clause is a conjunction of local hop predicates (conditions), with the exception that the starting end point can appear in the last hop local predicate.

Consider a pattern

X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3-(E3:e3)-X4:x4

A local hop predicate is a predicate that refers only to the aliases of a single hop (x_i, e_i, x_(i+1)).
The last hop's local predicate can also refer to the pattern's starting end point, i.e., it may use (x1, x3, e3, x4).
A WHERE clause is a conjunction (AND) of local hop predicates.
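The rules above can be checked mechanically. The following Python sketch is a model of the rule as stated here, not of the actual GSQL compiler; the helper names hop_alias_sets and is_local are hypothetical.

```python
def hop_alias_sets(vertex_aliases, edge_aliases):
    """Build, for each hop, the set of aliases its local predicate may use."""
    hops = []
    for i, edge in enumerate(edge_aliases):
        hops.append({vertex_aliases[i], edge, vertex_aliases[i + 1]})
    # The last hop may additionally refer to the starting end point.
    hops[-1].add(vertex_aliases[0])
    return hops

def is_local(predicate_aliases, hops):
    """A predicate is local if all of its aliases fit inside one hop's set."""
    return any(set(predicate_aliases) <= hop for hop in hops)

# Pattern X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3-(E3:e3)-X4:x4
HOPS = hop_alias_sets(["x1", "x2", "x3", "x4"], ["e1", "e2", "e3"])

print(is_local({"x1", "x2"}, HOPS))        # True: first hop
print(is_local({"x1", "x3", "x4"}, HOPS))  # True: last hop plus starting end point
print(is_local({"x2", "x4"}, HOPS))        # False: spans two hops
```

A WHERE clause is then acceptable under this model when every conjunct joined by AND passes is_local.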

WHERE Clause Supports "AND" of Local Predicates

# (x, e2, y) belong to the 1st hop
# (y, e3, z) belong to the 2nd hop
# (x, z, e4, u) may appear in the last hop's local predicate
SELECT x
FROM X:x-(E2>:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u
WHERE x.age > y.age AND y.name != z.name AND (x.salary + z.salary) < u.salary

A Kleene star breaks the local hop predicate. When a hop's edge has a Kleene star, we cannot compose a local predicate using that hop's aliases.

Kleene Star Breaks Local Predicate

# (x, y) belong to the 1st hop, which has a Kleene star (*),
# so the predicate x.age > y.age causes a semantic error
SELECT x
FROM X:x-(E2>*:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u
WHERE x.age > y.age AND y.name != z.name AND (x.salary + z.salary) < u.salary

ACCUM and POST-ACCUM Clauses

In the current version of GSQL, only certain parts of a pattern (and their corresponding aliases) are available in ACCUM and POST-ACCUM clauses. Refer to the example pattern and its highlighted parts in Figure 4 below.

Accumulators can only be attached to the pattern's endpoints (x and u in the figure). The accumulation statements may only access data from either the left endpoint (x) or the rightmost hop (z, e4, and u).
In the POST-ACCUM clause, only the pattern's endpoints can be accessed.
For queries in Distributed mode, accumulators may only be on the right endpoint (x in the figure).

The example below shows a valid ACCUM clause:

ACCUM To The Two End Points of A Pattern

# (z, e4, u) are the aliases of the last hop
# (x, u) are the end points of the pattern
SELECT x
FROM X:x-(E2>:e2)-Y:y-(<E3:e3)-Z:z-(E4:e4)-U:u
ACCUM x.@cnt += z.id, u.@cnt += e4.id

Examples of Multiple Hop Pattern Match

Example 1. Find the 3rd superclass of the TagClass whose name is "TennisPlayer".

Example 1. Succinct Representation Of Multiple-hop Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
    FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>.TagClass_IS_SUBCLASS_OF_TagClass>.TagClass_IS_SUBCLASS_OF_TagClass>)-TagClass:t
    WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example1.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example1.gsql

Output of Example 1

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [{
    "v_id": "239",
    "attributes": {
      "name": "Agent",
      "id": 239,
      "url": "http://dbpedia.org/ontology/Agent"
    },
    "v_type": "TagClass"
  }]}]
}

Example 2. Find the continents in which the 3 most recent messages of Jan 2011 were created.

Example 2. Disjunction In A Succinct Representation Of Multiple-hop Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  SumAccum<String> @continentName;

  msg = {Comment.*, Post.*};

  accMsgContinent =
    SELECT s
    FROM msg:s-((Post_IS_LOCATED_IN_Country>|Comment_IS_LOCATED_IN_Country>).Country_IS_PART_OF_Continent>)-Continent:t
    WHERE  year(s.creationDate) == 2011 AND month(s.creationDate) == 1
    ACCUM s.@continentName = t.name
    ORDER BY s.creationDate DESC
    LIMIT 3;

  PRINT  accMsgContinent;
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example2.gsql

Output of Example 2

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"accMsgContinent": [
    {
      "v_id": "824640012997",
      "attributes": {
        "browserUsed": "Firefox",
        "length": 7,
        "locationIP": "27.112.21.246",
        "@continentName": "Asia",
        "id": 824640012997,
        "creationDate": "2011-01-31 23:54:28",
        "content": "no way!"
      },
      "v_type": "Comment"
    },
    {
      "v_id": "824636727408",
      "attributes": {
        "browserUsed": "Firefox",
        "length": 3,
        "locationIP": "31.2.225.17",
        "@continentName": "Europe",
        "id": 824636727408,
        "creationDate": "2011-01-31 23:57:46",
        "content": "thx"
      },
      "v_type": "Comment"
    },
    {
      "v_id": "824634837528",
      "attributes": {
        "imageFile": "",
        "browserUsed": "Internet Explorer",
        "length": 115,
        "locationIP": "87.251.6.121",
        "@continentName": "Asia",
        "id": 824634837528,
        "creationDate": "2011-01-31 23:58:03",
        "lang": "tk",
        "content": "About Adolf Hitler, iews. His writings and methods were often adapted to need and circumstance, although there were"
      },
      "v_type": "Post"
    }
  ]}]
}

Example 3. Find Viktor Akhiezer's favorite authors of 2012 whose last names begin with the character 'S', and how many LIKES Viktor gave to each author's posts or comments.

Example 3. Multiple-hop Pattern With Accumulator Applied To All Matched Paths

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
  SumAccum<int> @likesCnt;
  PersonAll = {Person.*};

  FavoriteAuthors =  SELECT t
               FROM PersonAll:s-((Person_LIKES_Comment>|Person_LIKES_Post>)) -:msg - ((Comment_HAS_CREATOR_Person>|Post_HAS_CREATOR_Person>))-Person:t
               WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer" AND t.lastName LIKE "S%" AND year(msg.creationDate) == 2012
               ACCUM t.@likesCnt +=1;

  PRINT  FavoriteAuthors[FavoriteAuthors.firstName, FavoriteAuthors.lastName, FavoriteAuthors.@likesCnt];
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example3.gsql

Output of Example 3

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"FavoriteAuthors": [
    {
      "v_id": "8796093025410",
      "attributes": {
        "FavoriteAuthors.firstName": "Priyanka",
        "FavoriteAuthors.lastName": "Singh",
        "FavoriteAuthors.@likesCnt": 1
      },
      "v_type": "Person"
    },
    {
      "v_id": "2199023260091",
      "attributes": {
        "FavoriteAuthors.firstName": "Janne",
        "FavoriteAuthors.lastName": "Seppala",
        "FavoriteAuthors.@likesCnt": 1
      },
      "v_type": "Person"
    },
    {
      "v_id": "15393162796846",
      "attributes": {
        "FavoriteAuthors.firstName": "Mario",
        "FavoriteAuthors.lastName": "Santos",
        "FavoriteAuthors.@likesCnt": 1
      },
      "v_type": "Person"
    }
  ]}]
}

Application And Benchmark Queries

We have demonstrated the basic pattern match syntax. You should have mastered the basics by this point. The next step is to see more examples and practice more.

A Recommendation Application

In this example, we want to recommend some messages (comments or posts) to the person Viktor Akhiezer.

How do we do this?

One way is to find Others who like the same messages that Viktor likes, and then recommend the messages that Others like but Viktor has not seen. The pattern is roughly as follows:

Viktor - (Likes>) - Message - (<Likes) - Others
Others - (Likes>) - NewMessage
Recommend NewMessage to Viktor

However, this granularity is too fine: applying collaborative filtering at the level of individual messages overfits the data. Intuitively, two persons are similar when the messages they like fall into the same category (tag). This is a more sensible and more common situation than two persons liking exactly the same set of messages. One way to avoid the overfitting is therefore to go one level up: instead of using common messages as the basis for similarity, we use the tags of those messages. Person A and Person B are similar if they like messages that have the same tag. This scheme addresses the overfitting problem. In pattern match vocabulary, we have

Viktor - (Likes>) - Message - (Has>) - Tag - (<Has) - Message - (<Likes) - Others
Others - (Likes>) - NewMessage
Recommend NewMessage to Viktor
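The tag-level scheme can be tried out on a toy data set before reading the GSQL. The Python sketch below is an illustration only: the people, messages and tags are invented, recommend is a hypothetical helper, and counting one unit per matched path (liked message, tag, other message, other person) is an assumption about how the accumulation behaves. The log(1 + count) similarity and the summed rank mirror the RecommendMessage query in this section.

```python
import math
from collections import defaultdict

# Toy data, entirely hypothetical: who likes which message, and message tags.
LIKES = {
    "viktor": {"m1", "m2"},
    "ann": {"m3", "m4"},
    "bob": {"m5"},
    "cy": {"m4"},
}
TAGS = {
    "m1": {"music"}, "m2": {"sports"}, "m3": {"music", "sports"},
    "m4": {"music"}, "m5": {"cooking"},
}

def recommend(person, top_k=2):
    liked = LIKES[person]
    # Count paths: liked message -> tag <- other message <- other person.
    tags_in_common = defaultdict(int)
    for msg in liked:
        for tag in TAGS[msg]:
            for other, other_msgs in LIKES.items():
                for other_msg in other_msgs:
                    if tag in TAGS[other_msg]:
                        tags_in_common[other] += 1
    similarity = {p: math.log(1 + n) for p, n in tags_in_common.items()}
    # Rank messages that similar persons like and that `person` has not liked.
    rank = defaultdict(float)
    for other, score in similarity.items():
        for msg in LIKES[other]:
            if msg not in liked:
                rank[msg] += score
    return sorted(rank.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

print(recommend("viktor"))  # m4 ranks above m3; m5 is never reached
```

Message m4 outranks m3 because two similar persons (ann and cy) like it, so their similarity scores add up, while m5 shares no tag with anything Viktor likes and is never reached.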

GSQL. RecommendMessage Application.

This time, we create the query first, and then interpret it by calling the query name with parameters. If we are satisfied with the query, we can run "install query queryName" to install the compiled query, which gives the best performance.

GSQL Recommendation Algorithm

use graph ldbc_snb
set syntax_version="v2"
set query_timeout=60000

DROP QUERY RecommendMessage

CREATE QUERY RecommendMessage (String fn, String ln) FOR GRAPH ldbc_snb {

  SumAccum<int> @TagInCommon;
  SumAccum<float> @SimilarityScore;
  SumAccum<float> @Rank;
  OrAccum @Liked = false;

  Seed = {Person.*};

  #mark messages liked by Viktor
    MessageLiked =
       SELECT msg
       FROM Seed:s-((Person_LIKES_Comment>|Person_LIKES_Post>))-:msg
       WHERE s.firstName == fn AND s.lastName == ln
       ACCUM msg.@Liked = true;

   #calculate a log similarity score for each person who shares the same interests at the Tag level.
    Others   =
       SELECT p
       FROM Seed:s-((Person_LIKES_Comment>|Person_LIKES_Post>).(Comment_HAS_TAG_Tag>|Post_HAS_TAG_Tag>))- Tag:tg
        - ((<Comment_HAS_TAG_Tag|<Post_HAS_TAG_Tag).(<Person_LIKES_Comment|<Person_LIKES_Post))- :p
       WHERE s.firstName == fn AND s.lastName == ln
       ACCUM p.@TagInCommon +=1
       POST-ACCUM p.@SimilarityScore = log (1 + p.@TagInCommon);

   #recommend new messages to Viktor that he has not liked.
    RecommendedMessage =
             SELECT msg
             FROM Others:o-((Person_LIKES_Comment>|Person_LIKES_Post>)) - :msg
             WHERE  msg.@Liked == false
             ACCUM msg.@Rank +=o.@SimilarityScore
             ORDER BY msg.@Rank DESC
             LIMIT 2;

  PRINT   RecommendedMessage[RecommendedMessage.content, RecommendedMessage.@Rank];
}


INTERPRET QUERY RecommendMessage ("Viktor", "Akhiezer")
#try the second person with just parameter change.
INTERPRET QUERY RecommendMessage ("Adriaan", "Jong")

You can copy the above GSQL script to a file named app1.gsql and invoke this script file from the Linux command line.

Linux Bash

gsql app1.gsql

Output of App1

Using graph 'ldbc_snb'
The query RecommendMessage is dropped.
The query RecommendMessage has been added!
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"RecommendedMessage": [
    {
      "v_id": "549760294602",
      "attributes": {
        "RecommendedMessage.@Rank": 4855.49219,
        "RecommendedMessage.content": "About Indira Gandhi, Gandhi established closer relatAbout Mick Jagger, eer of the band. In 1989, he waAbout Ho Chi Minh, ce Unit and ECA International, About Ottoman Empire,  After t"
      },
      "v_type": "Post"
    },
    {
      "v_id": "549760292109",
      "attributes": {
        "RecommendedMessage.@Rank": 4828.7251,
        "RecommendedMessage.content": "About Ho Chi Minh, nam, as an anti-communist state, fought against the communisAbout Shiny Happy People, sale in the U."
      },
      "v_type": "Post"
    }
  ]}]
}

Install the query

When you are satisfied with your query in GSQL interpreted mode, you can install it as a generic service, which runs much faster. Because we used the "CREATE QUERY .." syntax, the query has already been added to the catalog; we only need to set the syntax version and install it.

GSQL Prepare Install Query

#before installing the query, set the syntax version
SET syntax_version="v2"
USE GRAPH ldbc_snb

#install query 
INSTALL QUERY RecommendMessage

GSQL Run the Installed Query


GSQL > install query RecommendMessage
Start installing queries, about 1 minute ...
RecommendMessage query: curl -X GET 'http://127.0.0.1:9000/query/ldbc_snb/RecommendMessage?fn=VALUE&ln=VALUE'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.

[========================================================================================================] 100% (1/1)
GSQL > run query RecommendMessage("Viktor", "Akhiezer")
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"RecommendedMessage": [
  {
    "v_id": "549760294602",
    "attributes": {
        "RecommendedMessage.@Rank": 4855.49219,
        "RecommendedMessage.content": "About Indira Gandhi, Gandhi established closer relatAbout Mick Jagger, eer of the band. In 1989, he waAbout Ho Chi Minh, ce Unit and ECA International, About Ottoman Empire,  After t"
    },
      "v_type": "Post"
  },
  {
    "v_id": "549760292109",
    "attributes": {
        "RecommendedMessage.@Rank": 4828.7251,
        "RecommendedMessage.content": "About Ho Chi Minh, nam, as an anti-communist state, fought against the communisAbout Shiny Happy People, sale in the U."
    },
      "v_type": "Post"
  }
  ]}]
}
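
The curl command printed by INSTALL QUERY can also be issued from a script. The following Python sketch assumes only the URL shape shown in that output. The helper names `query_url` and `run_query` are hypothetical, and the host, port, and token handling would need to match your own installation. Only the URL is built here; `run_query` requires a running TigerGraph server and is not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def query_url(graph, query, params, host="127.0.0.1", port=9000):
    """Build a URL in the shape printed by INSTALL QUERY."""
    return f"http://{host}:{port}/query/{graph}/{query}?{urlencode(params)}"


def run_query(url, token=None, timeout=60):
    """Send a GET request and decode the JSON response (needs a running server)."""
    # Add the bearer token header only when authentication is enabled.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    with urlopen(Request(url, headers=headers), timeout=timeout) as response:
        return json.load(response)


url = query_url("ldbc_snb", "RecommendMessage", {"fn": "Viktor", "ln": "Akhiezer"})
print(url)
```

Using `urlencode` rather than string concatenation keeps parameter values with spaces or special characters correctly escaped.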

Linux Bash: Shutdown The System

#when you are not using the TigerGraph system on your laptop,
# you can stop it to save resources:
gadmin stop
#when you need to start it again, use 
gadmin start

The query above uses a logarithmic similarity score, log(1 + the number of tag-level matches). We can also use cosine similarity, computed from the messages that the two persons like.

GSQL Recommendation Algorithm 2

USE GRAPH ldbc_snb
SET syntax_version="v2"
SET query_timeout=60000

DROP QUERY RecommendMessage

CREATE QUERY RecommendMessage (String fn, String ln) FOR GRAPH ldbc_snb {

  SumAccum<int> @MsgInCommon = 0;
  SumAccum<int> @MsgCnt = 0 ;
  SumAccum<int> @@InputPersonMsgCnt = 0;
  SumAccum<float> @SimilarityScore;
  SumAccum<float> @Rank;
  SumAccum<float> @TagCnt = 0;
  OrAccum @Liked = false;
  float sqrtOfInputPersonMsgCnt;

  Seed = {Person.*};

   #mark messages liked by input user
    InputPerson =
       SELECT s
       FROM Seed:s-((Person_LIKES_Comment>|Person_LIKES_Post>))-:msg
       WHERE s.firstName == fn AND s.lastName == ln
       ACCUM msg.@Liked = true, @@InputPersonMsgCnt += 1;

    sqrtOfInputPersonMsgCnt = sqrt(@@InputPersonMsgCnt);

   #find common msg between input user and other persons
    Others   =
       SELECT p
       FROM InputPerson:s-((Person_LIKES_Comment>|Person_LIKES_Post>))-:msg
                        -((<Person_LIKES_Comment|<Person_LIKES_Post))-:p
       ACCUM p.@MsgInCommon += 1;

    #calculate cosine similarity score.
    #(A . B)/(sqrt(Sum(A_i^2)) * sqrt(Sum(B_i^2)))
    Others  =
        SELECT o
        FROM Others:o-((Person_LIKES_Comment>|Person_LIKES_Post>))-:msg
        ACCUM o.@MsgCnt += 1
        POST-ACCUM o.@SimilarityScore = o.@MsgInCommon/(sqrtOfInputPersonMsgCnt * sqrt(o.@MsgCnt));

   #recommend new messages to the input user that he or she has not liked.
    RecommendedMessage =
             SELECT msg
             FROM Others:o-((Person_LIKES_Comment>|Person_LIKES_Post>)) - :msg
             WHERE  msg.@Liked == false
             ACCUM msg.@Rank +=o.@SimilarityScore
             ORDER BY msg.@Rank DESC
             LIMIT 2;

  PRINT   RecommendedMessage[RecommendedMessage.content, RecommendedMessage.@Rank];
}

INTERPRET query RecommendMessage("Viktor", "Akhiezer")
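
The cosine score computed in the query above can be checked by hand on small sets. In this Python sketch each person is represented by the set of messages they like (a 0/1 vector), so the dot product is the number of messages in common and each vector length is the square root of the set size. The function name and the sample sets are assumptions for illustration.

```python
import math


def cosine_similarity(likes_a, likes_b):
    """Cosine similarity of two like-sets viewed as 0/1 vectors."""
    if not likes_a or not likes_b:
        return 0.0  # an empty set has zero length; define the score as 0
    common = len(likes_a & likes_b)
    return common / (math.sqrt(len(likes_a)) * math.sqrt(len(likes_b)))


viktor = {"m1", "m2", "m3", "m4"}
other = {"m3", "m4", "m5", "m6"}
print(cosine_similarity(viktor, other))  # 2 / (2 * 2) = 0.5
```

Dividing by both set sizes keeps a person who likes a very large number of messages from dominating the ranking merely through volume.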

LDBC SNB Benchmark Queries

We have made available all LDBC SNB benchmark queries translated into GSQL pattern matching syntax. The benchmark queries are described in Sections 4 and 5 of the LDBC Social Network Benchmark Reference.

You can find our GSQL pattern match translations of these queries on Github: https://github.com/tigergraph/ecosys/tree/master/tools/ldbc_benchmark/tigergraph/queries_pattern_match

Note we use CREATE/INSTALL/RUN QUERY instead of INTERPRET QUERY so that these queries can be optimized and installed as REST services. There are three sets of queries:

Also, you may want to use the GraphStudio UI to help you visualize and explore the graph and to try your hand at writing your own queries.

Last but not least, join the GSQL community forum to discuss and get help from fellow GSQL users and GSQL developers.

Basic Pattern Concepts

Introduction

Pattern matching by nature is declarative. It enables users to focus on specifying what they want from a query without worrying about the underlying query processing.

Currently, pattern matching may only be used in read-only queries.

1-Hop Pattern

Person:p -(LIKES:e)-> Message:m          /* Classic GSQL example */

For an undirected edge E, no added decoration: E
For a directed edge E from left to right, use a suffix: E>
For a directed edge E from right to left, use a prefix: <E

Person:p -((LIKES>|<HAS_CREATOR):e)- Message:m         /* Pattern example */

Edge Type Wildcards

The underscore _ is a wildcard meaning any edge type. Arrowheads are still used to indicate direction, e.g., _> or <_ or _. The empty parentheses () mean any edge, directed or undirected.

Examples of 1-Hop Patterns

FROM X:x - (E1:e1) - Y:y
- E1 is an undirected edge. x and y bind to the end points of E1. e1 is the alias of E1.
FROM X:x - (E2>:e2) - Y:y
- Right directed edge, x binds to the source of E2, y binds to the target of E2.
FROM X:x - (<E3:e3) - Y:y
- Left directed edge, y binds to the source of E3, x binds to the target of E3.
FROM X:x - (_:e) - Y:y
- Any undirected edge between a member of X and a member of Y.
FROM X:x - (_>:e) - Y:y
- Any right directed edge with source in X and target in Y.
FROM X:x - (<_:e) - Y:y
- Any left directed edge with source in Y and target in X.
FROM X:x - ((<_|_):e) - Y:y
- Any left directed or any undirected. "|" means OR, and parentheses enclose the group of edge descriptors. e is the alias for the edge pattern (<_|_).
FROM X:x - ((E1|E2>|<E3):e) - Y:y
- Any one of the three edge patterns.
FROM X:x - () - Y:y
- Any edge (directed or undirected).
- Same as (<_|_>|_)
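
The direction rules in the list above can be mimicked on a tiny edge list. This Python sketch is only an illustration of the semantics, not how GSQL evaluates patterns; the toy graph and the helper names `parse` and `match_one_hop` are assumptions.

```python
# Hypothetical toy graph: (source, edge_type, target, is_directed)
EDGES = [
    ("a", "E1", "b", False),
    ("a", "E2", "c", True),
    ("d", "E3", "a", True),
]


def parse(descriptor):
    """Split one edge descriptor such as 'E2>' or '<E3' into (type, direction)."""
    if descriptor.endswith(">"):
        return descriptor[:-1], "right"
    if descriptor.startswith("<"):
        return descriptor[1:], "left"
    return descriptor, "undirected"


def match_one_hop(x, descriptors):
    """Return every y such that x - (d) - y holds for at least one descriptor d."""
    matches = set()
    for descriptor in descriptors:
        etype, direction = parse(descriptor)
        for src, t, dst, directed in EDGES:
            if etype != "_" and etype != t:
                continue  # "_" is the wildcard for any edge type
            if direction == "right" and directed and src == x:
                matches.add(dst)
            elif direction == "left" and directed and dst == x:
                matches.add(src)
            elif direction == "undirected" and not directed:
                if src == x:
                    matches.add(dst)
                elif dst == x:
                    matches.add(src)
    return matches


print(sorted(match_one_hop("a", ["E2>"])))            # ['c']
print(sorted(match_one_hop("a", ["<_"])))             # ['d']
print(sorted(match_one_hop("a", ["<_", "_>", "_"])))  # ['b', 'c', 'd']
```

The last call corresponds to the empty parentheses (), which accept any edge regardless of direction.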

How To Enter Pattern Match Syntax Mode

To use the pattern match syntax, you need to either set a session parameter or specify it in the query. There are currently two syntax versions for queries:

"v1" is the classic syntax, traversing one hop per SELECT statement. This is the default mode.
"v2" enhances the v1 syntax with pattern matching.

syntax_version Session Parameter

GSQL: Set Syntax Version By A Session Parameter

SET syntax_version="v2"

Query-Level SYNTAX option

CHANGE ADVISORY

The punctuation used with the SYNTAX keyword was streamlined, from

CREATE QUERY <query_name><parameters> FOR GRAPH <graph_name> SYNTAX ("v2") # original version, TigerGraph 2.4.0

to

CREATE QUERY <query_name><parameters> FOR GRAPH <graph_name> SYNTAX v2 # final version, since TigerGraph 2.4.1

GSQL: Set Syntax Version By Specifying The Version After Graph Name In The Query

CREATE QUERY test10 (string str ) FOR GRAPH ldbc_snb SYNTAX v2
{ 
  ...
}

Running Anonymous Queries Without Installing

To send an anonymous query to the interpret engine, replace the keyword CREATE with INTERPRET. Remember that an anonymous query has no name and takes no parameters:

INTERPRET QUERY () FOR GRAPH graph_name SYNTAX v2 { <query body> }

Recommendation: Increase the query timeout threshold.

Interpreted queries may run slower than installed queries, so we recommend increasing the query timeout threshold:

GSQL: Set Longer Timeout

# set the query timeout to 1 minute
# 1 unit is 1 millisecond
SET query_timeout = 60000

Examples of 1-Hop Fixed Length Query

Example 1. Find persons who know the person named "Viktor Akhiezer" and return the top 3 oldest such persons.

Example 1. Left Directed Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern. 
   friends = SELECT p
             FROM Seed:s - (<Person_KNOWS_Person:e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 3;
             
    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example1.gsql and invoke this script file in Linux.

Linux Bash

gsql example1.gsql

Output of Example 1

{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "developer",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "friends.birthday": "1980-05-13 00:00:00",
        "friends.lastName": "Cajes",
        "friends.firstName": "Gregorio"
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "friends.birthday": "1980-04-24 00:00:00",
        "friends.lastName": "Glosca",
        "friends.firstName": "Abdul-Malik"
      },
      "v_type": "Person"
    },
    {
      "v_id": "6597069776731",
      "attributes": {
        "friends.birthday": "1981-02-25 00:00:00",
        "friends.lastName": "Carlsson",
        "friends.firstName": "Sven"
      },
      "v_type": "Person"
    }
  ]}]
}

Example 2. Do the same as Example 1, but use a right-directed edge pattern.

Example 2. Right Directed Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT s
             FROM Seed:s - (Person_KNOWS_Person>:e) - Person:p
             WHERE p.firstName == "Viktor" AND p.lastName == "Akhiezer"
             ORDER BY s.birthday ASC
             LIMIT 3;

    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in Linux.

Linux Bash

gsql example2.gsql

The output should be the same as example1's output.

Example 3. Find Viktor Akhiezer's total number of comments, total number of posts, and total number of persons he knows. A Person can reach Comments, Posts and other Persons via a directed edge.

Example 3. Right Directed Any Edge Pattern.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   Result = SELECT s
            FROM Seed:s - (_>:e) - :tgt
            WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
            ACCUM CASE WHEN tgt.type == "Comment" THEN
                           s.@commentCnt += 1
                       WHEN tgt.type == "Post" THEN
                           s.@postCnt += 1
                       WHEN tgt.type == "Person" THEN
                           s.@personCnt += 1
                   END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in Linux.

Linux Bash

gsql example3.gsql

Output of Example 3.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"Result": [{
    "v_id": "28587302323577",
    "attributes": {
      "Result.@personCnt": 25,
      "Result.@commentCnt": 152,
      "Result.@postCnt": 96
    },
    "v_type": "Person"
  }]}]
}

Example 4. Do the same as Example 3, but use a left-directed edge pattern.

Note below (line 10) that the Seed is now {Person.*, Comment.*, Post.* }, the three types of entities that are targets of edges from a Person.

Example 4. Left Directed Any Edge Pattern

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons, comments, and posts
   Seed = {Person.*, Comment.*, Post.*};
   #1-hop pattern.
   Result = SELECT tgt
            FROM Seed:s - (<_:e) - Person:tgt
            WHERE tgt.firstName == "Viktor" AND tgt.lastName == "Akhiezer"
            ACCUM  CASE WHEN s.type == "Comment" THEN
                          tgt.@commentCnt += 1
                         WHEN s.type == "Post" THEN
                          tgt.@postCnt += 1
                        WHEN s.type == "Person" THEN
                          tgt.@personCnt += 1
                    END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example4.gsql and invoke this script file from the Linux command line. The output should be the same as in Example 3.
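
Examples 3 and 4 describe the same set of edges from opposite ends, which is why their counts agree. The Python sketch below illustrates this on a made-up edge list; the data and the function names `count_right` and `count_left` are assumptions, not part of the tutorial's dataset.

```python
from collections import Counter

# Hypothetical toy data: vertex types and directed edges (source, target).
vertex_type = {
    "viktor": "Person",
    "ann": "Person",
    "c1": "Comment",
    "c2": "Comment",
    "p1": "Post",
}
edges = [
    ("viktor", "c1"),
    ("viktor", "c2"),
    ("viktor", "p1"),
    ("viktor", "ann"),
    ("ann", "c1"),
]


def count_right(person):
    """Person:s - (_>) - :tgt  -> follow outgoing edges from the person."""
    return Counter(vertex_type[dst] for src, dst in edges if src == person)


def count_left(person):
    """Seed:s - (<_) - Person:tgt  -> start at every endpoint and look back."""
    counts = Counter()
    for start in vertex_type:  # Seed holds all Persons, Comments, and Posts
        for src, dst in edges:
            if dst == start and src == person:
                counts[vertex_type[start]] += 1
    return counts


print(count_right("viktor"))
print(count_left("viktor") == count_right("viktor"))  # True
```

The left-directed version has to seed from every vertex type that can sit at the far end of the edge, which mirrors the larger Seed set used in Example 4.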

Example 5. Disjunctive 1-hop edge pattern.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT p
             FROM Seed:s - ((<Person_KNOWS_Person|Person_KNOWS_Person>):e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 2;

   PRINT friends;
}

You can copy the above GSQL script to a file named example5.gsql, and invoke this script file in Linux:

Linux Bash

gsql example5.gsql

Output of Example 5.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "birthday": "1980-05-13 00:00:00",
        "firstName": "Gregorio",
        "lastName": "Cajes",
        "gender": "male",
        "speaks": [
          "en",
          "tl"
        ],
        "browserUsed": "Firefox",
        "locationIP": "110.55.251.62",
        "id": 10995116279461,
        "creationDate": "2010-12-16 18:12:57",
        "email": ["Gregorio10995116279461@gmail.com"],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "birthday": "1980-04-24 00:00:00",
        "firstName": "Abdul-Malik",
        "lastName": "Glosca",
        "gender": "male",
        "speaks": [
          "ar",
          "en"
        ],
        "browserUsed": "Chrome",
        "locationIP": "109.200.168.137",
        "id": 4398046517846,
        "creationDate": "2010-05-21 00:07:05",
        "email": [
          "Abdul-Malik4398046517846@gmail.com",
          "Abdul-Malik4398046517846@gmx.com",
          "Abdul-Malik4398046517846@land.ru"
        ],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    }
  ]}]
}

Example 6. Find the total comments or posts created by "Viktor Akhiezer". Again, we include two types of edges, but in this case, we count them together.

Example 6. Disjunctive 1-hop edge pattern.

USE GRAPH ldbc_snb
#pattern match syntax version is v2
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @@cnt = 0;
   Seed = {Person.*};

   friends = SELECT t
             FROM Seed:s-((<Comment_HAS_CREATOR_Person|<Post_HAS_CREATOR_Person):e1)-(Comment|Post):t
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ACCUM  @@cnt += 1 ;

  PRINT @@cnt;
}

You can copy the above GSQL script to a file named example6.gsql, and invoke this script file in Linux:

Linux Bash

gsql example6.gsql

Output of Example 6.

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 89}]
}

Repeating a 1-Hop Pattern

A common pattern is the two-step "Friend of a Friend". Or, how many entities might receive a message if it is passed up to three times? Do you have any known chain of connections to a celebrity?

(E*) means edge type E repeats any number of times (including zero!)
(E*1..3) means edge type E occurs one to three times.

Below are more illustrative examples:

1-hop star pattern — repetition of an edge pattern 0 or more times
1. FROM X:x - (E1*) - Y:y
2. FROM X:x - (E2>*) - Y:y
3. FROM X:x - (<E3*) - Y:y
4. FROM X:x - (_*) - Y:y
  - Any undirected edge can be chosen at each repetition.
5. FROM X:x - (_>*) - Y:y
  - Any right-directed edge can be chosen at each repetition.
6. FROM X:x - (<_*) - Y:y
  - Any left-directed edge can be chosen at each repetition.
7. FROM X:x - ((E1|E2>|<E3)*) - Y:y
  - Either E1, E2> or <E3 can be chosen at each repetition.
1-hop star pattern with bounds
1. FROM X:x - (E1*2..) - Y:y
  - Lower bounds only. There is a chain of at least 2 E1 edges.
2. FROM X:x - (E2>*..3) - Y:y
  - Upper bounds only. There is a chain of between 0 and 3 E2 edges.
3. FROM X:x - (<E3*3..5) - Y:y
  - Both Lower and Upper bounds. There is a chain of 3 to 5 E3 edges.
4. FROM X:x - ((E1|E2>|<E3)*3) - Y:y
  - Exact bound. There is a chain of exactly 3 edges, where each edge is either E1, E2>, or <E3.
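
The bound notations listed above can be summarized as a (minimum, maximum) pair of hop counts. The following Python sketch parses the suffix after the edge descriptor into such a pair, with None standing for "no upper bound". The function name `parse_bounds` is hypothetical, and the parser is an illustration of the notation rather than the GSQL grammar.

```python
def parse_bounds(spec):
    """Turn a repetition suffix such as '*', '*3', '*2..', '*..3', '*3..5'
    into a (min_hops, max_hops) pair; max_hops is None when unbounded."""
    if not spec.startswith("*"):
        raise ValueError(f"not a repetition suffix: {spec!r}")
    body = spec[1:]
    if body == "":
        return 0, None          # plain star: zero or more repetitions
    if ".." in body:
        lo, hi = body.split("..", 1)
        if lo == "" and hi == "":
            raise ValueError("at least one bound is required around '..'")
        return (int(lo) if lo else 0, int(hi) if hi else None)
    n = int(body)
    return n, n                 # exact bound


for suffix in ["*", "*2..", "*..3", "*3..5", "*3"]:
    print(suffix, parse_bounds(suffix))
```

A missing lower bound defaults to 0 and a missing upper bound is unlimited, matching the descriptions in the list.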

Remarks

No alias allowed for edge with Kleene star. An edge alias may not be used when a Kleene star is used. The reason is that when there are a variable number of edges, we cannot associate or bind the alias to a specific edge in the pattern.

Shortest path semantics. When an edge is repeated with a Kleene star, only the shortest matching occurrences are selected. See the example below:

In Figure 2, for Pattern 1 - (E>*) - 4, any of the following paths reach 4 from 1.

1->2->3->4
1->2->3->5->6->2->3->4
any path that goes through the cycle 2->3->5->6->2 two or more times and jumps out at 3.

The first path is shorter than the rest; it is considered the only match.
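
The selection of the shortest path can be reproduced with a breadth-first search. In this Python sketch the edge list is inferred from the paths listed above, assuming the figure contains no other edges; the function name `shortest_path` is an assumption for illustration.

```python
from collections import deque

# Directed edges inferred from the listed paths (assumption: no others exist).
EDGES = {1: [2], 2: [3], 3: [4, 5], 4: [], 5: [6], 6: [2]}


def shortest_path(start, goal):
    """Breadth-first search; the first path that reaches the goal is a shortest one."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in EDGES.get(node, []):
            if nxt not in visited:  # never re-enter a vertex, so the cycle is skipped
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path(1, 4))  # [1, 2, 3, 4]
```

Because each vertex is visited at most once, the search terminates even though the graph contains the cycle 2->3->5->6->2.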

Examples of Variable Hop Queries

Example 1. Find the direct and indirect superclasses (including the class itself) of the TagClass whose name is "TennisPlayer".

Example 1. Directed Edge Pattern Unconstrained Repetition

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*)-TagClass:t
               WHERE s.name == "TennisPlayer";

    PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example1.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example1.gsql

Note below that the starting vertex s, whose name is TennisPlayer, is also a match, using a path with zero hops.

Output of Example 1

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "0",
      "attributes": {
        "name": "Thing",
        "id": 0,
        "url": "http://www.w3.org/2002/07/owl#Thing"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "59",
      "attributes": {
        "name": "TennisPlayer",
        "id": 59,
        "url": "http://dbpedia.org/ontology/TennisPlayer"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "239",
      "attributes": {
        "name": "Agent",
        "id": 239,
        "url": "http://dbpedia.org/ontology/Agent"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 2. Find the immediate superclass of the TagClass whose name is "TennisPlayer". (This is equivalent to a 1-hop non-repeating pattern.)

Example 2. Exactly 1 Repetition of A Directed Edge

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1)-TagClass:t
               WHERE s.name == "TennisPlayer";

    PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example2.gsql

Output of Example 2

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [{
    "v_id": "149",
    "attributes": {
      "name": "Athlete",
      "id": 149,
      "url": "http://dbpedia.org/ontology/Athlete"
    },
    "v_type": "TagClass"
  }]}]
}

Example 3. Find the direct and indirect superclasses that are 1 to 2 hops from the TagClass whose name is "TennisPlayer".

Example 3. 1 to 2 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1..2)-TagClass:t
               WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example3.gsql

Output of Example 3

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 4. Find the superclasses within 2 hops of the TagClass whose name is "TennisPlayer".

Example 4. Up-to 2 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =  SELECT t
               FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*..2)-TagClass:t
               WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example4.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example4.gsql

Output of Example 4

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "59",
      "attributes": {
        "name": "TennisPlayer",
        "id": 59,
        "url": "http://dbpedia.org/ontology/TennisPlayer"
      },
      "v_type": "TagClass"
    }
  ]}]
}

Example 5. Find the superclasses at least one hop from the TagClass whose name is "TennisPlayer".

Example 5. At Least 1 Repetition Of A Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {

  TagClass1 = {TagClass.*};

  TagClass2 =SELECT t
             FROM TagClass1:s - (TagClass_IS_SUBCLASS_OF_TagClass>*1..)-TagClass:t
             WHERE s.name == "TennisPlayer";

  PRINT  TagClass2;
}

You can copy the above GSQL script to a file named example5.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example5.gsql

Output of Example 5

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"TagClass2": [
    {
      "v_id": "211",
      "attributes": {
        "name": "Person",
        "id": 211,
        "url": "http://dbpedia.org/ontology/Person"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "0",
      "attributes": {
        "name": "Thing",
        "id": 0,
        "url": "http://www.w3.org/2002/07/owl#Thing"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "149",
      "attributes": {
        "name": "Athlete",
        "id": 149,
        "url": "http://dbpedia.org/ontology/Athlete"
      },
      "v_type": "TagClass"
    },
    {
      "v_id": "239",
      "attributes": {
        "name": "Agent",
        "id": 239,
        "url": "http://dbpedia.org/ontology/Agent"
      },
      "v_type": "TagClass"
    }
  ]}]
}
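
Examples 1 through 5 differ only in their repetition bounds. The Python sketch below reproduces their result sets on a hand-written superclass chain. The first two hops follow the example outputs; the order of Agent and Thing above Person is an assumption made for this sketch. Selecting vertices by shortest hop distance is a simplification that is exact here because the chain has only one path to each vertex.

```python
from collections import deque

# Superclass edges (child -> parents). The Agent/Thing order is assumed.
SUPERCLASS = {
    "TennisPlayer": ["Athlete"],
    "Athlete": ["Person"],
    "Person": ["Agent"],
    "Agent": ["Thing"],
    "Thing": [],
}


def hop_distances(start):
    """Shortest number of superclass hops from start to every reachable class."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in SUPERCLASS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def match(start, lo=0, hi=None):
    """Classes whose hop distance from start lies in [lo, hi]; hi=None is unbounded."""
    return sorted(
        name
        for name, d in hop_distances(start).items()
        if d >= lo and (hi is None or d <= hi)
    )


print(match("TennisPlayer"))           # bounds of (E>*)
print(match("TennisPlayer", 1, 1))     # bounds of (E>*1)
print(match("TennisPlayer", 1, 2))     # bounds of (E>*1..2)
print(match("TennisPlayer", 0, 2))     # bounds of (E>*..2)
print(match("TennisPlayer", 1, None))  # bounds of (E>*1..)
```

Note that a lower bound of 0 is what admits TennisPlayer itself, through a path with zero hops.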

Example 6. Find the 3 most recent comments that are liked or created by Viktor Akhiezer, and the total number of comments related to (created or liked by) Viktor Akhiezer.

Example 6. Disjunctive 1-Repetition Directed Edge.

USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
  SumAccum<int> @@commentCnt = 0;
   
  #start with all persons.
  Seed = {Person.*};

  # find the top 3 latest comments that are liked or created by Viktor Akhiezer
  # and the total number of comments related to Viktor Akhiezer
  Top3Comments = SELECT p
    FROM Seed:s - ((<Comment_HAS_CREATOR_Person|Person_LIKES_Comment>)*1) - Comment:p
    WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
    ACCUM @@commentCnt += 1
    ORDER BY p.creationDate DESC
    LIMIT 3;

  PRINT Top3Comments;
  # total number of comments related to Viktor Akhiezer
  PRINT  @@commentCnt;
}

You can copy the above GSQL script to a file named example6.gsql, and invoke this script file in a Linux shell.

Linux Bash

gsql example6.gsql

Output of Example 6

Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [
    {"Top3Comments": [
      {
        "v_id": "2061584720640",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 4,
          "locationIP": "194.62.64.117",
          "id": 2061584720640,
          "creationDate": "2012-09-06 06:46:31",
          "content": "fine"
        },
        "v_type": "Comment"
      },
      {
        "v_id": "2061586872389",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 90,
          "locationIP": "31.216.177.175",
          "id": 2061586872389,
          "creationDate": "2012-08-28 14:54:46",
          "content": "About Hector Berlioz, his compositions Symphonie fantastique and GraAbout Who Knew, the gu"
        },
        "v_type": "Comment"
      },
      {
        "v_id": "2061590804929",
        "attributes": {
          "browserUsed": "Chrome",
          "length": 83,
          "locationIP": "194.62.64.117",
          "id": 2061590804929,
          "creationDate": "2012-09-04 16:16:56",
          "content": "About Muttiah Muralitharan, mit by nine degrees, five degrees being thAbout Steve M"
        },
        "v_type": "Comment"
      }
    ]},
    {"@@commentCnt": 152}
  ]
}