One-Hop Patterns

Introduction

Pattern matching is declarative by nature. It lets users focus on specifying what they want from a query without worrying about how the query is processed.

A pattern usually appears in the FROM clause, the most fundamental part of the query structure. The pattern specifies sets of vertex types and how they are connected by edge types. A pattern can be refined further with conditions in the WHERE clause. In this tutorial, we'll focus on the linear pattern.

Currently, pattern matching may only be used in read-only queries.

1-Hop Pattern

The easiest way to understand patterns is to start with a simple 1-Hop pattern. Even a single hop has several options. After we've tackled single hops, we'll see how to add repetition to make variable-length patterns and how to connect single hops to form bigger patterns.

In classic GSQL queries, described in GSQL 101, we used the punctuation -( )-> in the FROM clause to indicate a 1-hop query, where the arrow specifies the vertex flow from left to right, and ( ) encloses the edge types.

Person:p -(LIKES:e)-> Message:m          /* Classic GSQL example */

In pattern matching, we use the punctuation -( )- to denote a 1-hop pattern, where the parentheses () enclose the edge type(s) and the hyphens - symbolize connection without specifying direction. Instead, directionality is explicitly stated for each edge type.

  • For an undirected edge E, no added decoration: E

  • For a directed edge E from left to right, use a suffix: E>

  • For a directed edge E from right to left, use a prefix: <E

For example, in the LDBC SNB schema, there are two directed relationships between Person and Message: person LIKES message, and message HAS_CREATOR person. Despite the fact that these relationships are in opposite directions, we can include both of them in the same pattern very concisely:

Person:p -((LIKES>|<HAS_CREATOR):e)- Message:m         /* Pattern example */

Edge Type Wildcards

The underscore _ is a wildcard meaning any edge type. Arrowheads are still used to indicate direction, e.g., _> or <_ or _. Empty parentheses () mean any edge, directed or undirected.

Examples of 1-Hop Patterns

  1. FROM X:x - (E1:e1) - Y:y

    • E1 is an undirected edge. x and y bind to the end points of E1. e1 is the alias of E1.

  2. FROM X:x - (E2>:e2) - Y:y

    • Right-directed edge. x binds to the source of E2, and y binds to the target of E2.

  3. FROM X:x - (<E3:e3) - Y:y

    • Left-directed edge. y binds to the source of E3, and x binds to the target of E3.

  4. FROM X:x - (_:e) - Y:y

    • Any undirected edge between a member of X and a member of Y.

  5. FROM X:x - (_>:e) - Y:y

    • Any right directed edge with source in X and target in Y.

  6. FROM X:x - (<_:e) - Y:y

    • Any left directed edge with source in Y and target in X.

  7. FROM X:x - ((<_|_):e) - Y:y

    • Any left directed or any undirected. "|" means OR, and parentheses enclose the group of edge descriptors. e is the alias for the edge pattern (<_|_).

  8. FROM X:x - ((E1|E2>|<E3):e) - Y:y

    • Any one of the three edge patterns.

  9. FROM X:x - () - Y:y

    • Any edge (directed or undirected).

    • Same as (<_|_>|_)
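As a quick check of the last equivalence, the sketch below counts the edges matched by () and by (<_|_>|_) around one person. It is an illustration, not part of the original examples: the accumulator names @@cntAny and @@cntUnion and the set names A and B are made up here, and the statement is wrapped in an anonymous interpreted query with SYNTAX v2, both of which are explained in the sections that follow. It has not been run against a live graph. If the two patterns are equivalent as stated, the two printed counts should match.

```gsql
USE GRAPH ldbc_snb

INTERPRET QUERY () FOR GRAPH ldbc_snb SYNTAX v2 {
   # Hypothetical counters, one per pattern.
   SumAccum<int> @@cntAny = 0;
   SumAccum<int> @@cntUnion = 0;

   Seed = {Person.*};

   # Empty parentheses: any edge, directed or undirected.
   A = SELECT s
       FROM Seed:s - () - :t
       WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
       ACCUM @@cntAny += 1;

   # Explicit disjunction of the three wildcard forms.
   B = SELECT s
       FROM Seed:s - ((<_|_>|_):e) - :t
       WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
       ACCUM @@cntUnion += 1;

   PRINT @@cntAny, @@cntUnion;
}
```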

How To Enter Pattern Match Syntax Mode

To use the pattern match syntax, you need to either set a session parameter or specify it in the query. There are currently two syntax versions for queries:

  • "v1" is the classic syntax, traversing one hop per SELECT statement. This is the default mode.

  • "v2" enhances the v1 syntax with pattern matching.

syntax_version Session Parameter

You can use the SET command to assign a value to the syntax_version session parameter: v1 for classic syntax; v2 for pattern matching. If the parameter is never set, the classic v1 syntax is enabled. Once set, the selection remains valid for the duration of the GSQL client session, or until it is changed with another SET command.

GSQL: Set Syntax Version By A Session Parameter
SET syntax_version="v2"

Query-Level SYNTAX option

You can also select the syntax by using the new SYNTAX option in the CREATE QUERY statement: v1 for classic syntax (default); v2 for pattern matching. The Query-Level SYNTAX option overrides the syntax_version session parameter.

GSQL: Set Syntax Version By Specifying The Version After Graph Name In The Query
CREATE QUERY test10 (string str ) FOR GRAPH ldbc_snb SYNTAX v2
{ 
  ...
}
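For a fuller picture, here is a minimal sketch of a named, installed query that uses the query-level SYNTAX option. The query name friendsOf and the parameters fName and lName are hypothetical, chosen only for illustration; the pattern itself is the one used in Example 1 later in this tutorial.

```gsql
USE GRAPH ldbc_snb

# Hypothetical query name and parameter names, for illustration only.
CREATE QUERY friendsOf (STRING fName, STRING lName) FOR GRAPH ldbc_snb SYNTAX v2
{
   Seed = {Person.*};

   # Left-directed pattern: p is the source (the one who knows), s is the target.
   friends = SELECT p
             FROM Seed:s - (<Person_KNOWS_Person:e) - Person:p
             WHERE s.firstName == fName AND s.lastName == lName
             ORDER BY p.birthday ASC
             LIMIT 3;

   PRINT friends[friends.firstName, friends.lastName, friends.birthday];
}

INSTALL QUERY friendsOf
RUN QUERY friendsOf("Viktor", "Akhiezer")
```

Because the syntax version is stated in the CREATE QUERY statement, this query should compile as v2 regardless of the current syntax_version session setting.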

Running Anonymous Queries Without Installing

In this tutorial, we will use the new Interpreted Mode for GSQL, also introduced in TigerGraph 2.4. Interpreted mode lets us skip the INSTALL step and even run a query as soon as we create it, which offers a more interactive experience. These one-step interpreted queries are unnamed (anonymous) and parameterless, just like SQL.

To send an anonymous query to the interpret engine, replace the keyword CREATE with INTERPRET. Remember, no parameters:

INTERPRET QUERY () FOR GRAPH graph_name SYNTAX v2 { <query body> }
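As a concrete instance of this template, the sketch below fills in the graph name and a small query body. It assumes the ldbc_snb graph and the Person_KNOWS_Person edge type used in the examples later in this tutorial; the set name knownByViktor is made up for illustration. Since SYNTAX v2 appears in the statement itself, no SET syntax_version command should be needed.

```gsql
USE GRAPH ldbc_snb

INTERPRET QUERY () FOR GRAPH ldbc_snb SYNTAX v2 {
   # Start with all persons.
   Seed = {Person.*};

   # Right-directed pattern: s is the source, p is the target.
   knownByViktor = SELECT p
                   FROM Seed:s - (Person_KNOWS_Person>:e) - Person:p
                   WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer";

   PRINT knownByViktor[knownByViktor.firstName, knownByViktor.lastName];
}
```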

Recommendation: Increase the query timeout threshold.

Interpreted queries may run slower than installed queries, so we recommend increasing the query timeout threshold:

GSQL: Set Longer Timeout
# Set the query timeout to 1 minute.
# 1 unit is 1 millisecond.
SET query_timeout = 60000

Examples of 1-Hop Fixed Length Query

Example 1. Find the persons who know the person named "Viktor Akhiezer" and return the three oldest of them.

Example 1. Left Directed Edge Pattern
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern. 
   friends = SELECT p
             FROM Seed:s - (<Person_KNOWS_Person:e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 3;
             
    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example1.gsql and invoke this script file in Linux.

Linux Bash
gsql example1.gsql

Output of Example 1
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "developer",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "friends.birthday": "1980-05-13 00:00:00",
        "friends.lastName": "Cajes",
        "friends.firstName": "Gregorio"
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "friends.birthday": "1980-04-24 00:00:00",
        "friends.lastName": "Glosca",
        "friends.firstName": "Abdul-Malik"
      },
      "v_type": "Person"
    },
    {
      "v_id": "6597069776731",
      "attributes": {
        "friends.birthday": "1981-02-25 00:00:00",
        "friends.lastName": "Carlsson",
        "friends.firstName": "Sven"
      },
      "v_type": "Person"
    }
  ]}]
}

Example 2. Do the same as Example 1, but use a right-directed edge pattern.

Example 2. Right Directed Edge Pattern
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT s
             FROM Seed:s - (Person_KNOWS_Person>:e) - Person:p
             WHERE p.firstName == "Viktor" AND p.lastName == "Akhiezer"
             ORDER BY s.birthday ASC
             LIMIT 3;

    PRINT  friends[friends.firstName, friends.lastName, friends.birthday];
}

You can copy the above GSQL script to a file named example2.gsql, and invoke this script file in Linux.

Linux Bash
gsql example2.gsql

The output should be the same as the output of Example 1.

Example 3. Find Viktor Akhiezer's total number of comments, total number of posts, and total number of persons he knows. A Person can reach Comments, Posts and other Persons via a directed edge.

Example 3. Right Directed Any Edge Pattern.
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   Result = SELECT s
            FROM Seed:s - (_>:e) - :tgt
            WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
            ACCUM CASE WHEN tgt.type == "Comment" THEN
                           s.@commentCnt += 1
                       WHEN tgt.type == "Post" THEN
                           s.@postCnt += 1
                       WHEN tgt.type == "Person" THEN
                           s.@personCnt += 1
                   END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example3.gsql, and invoke this script file in Linux.

Linux Bash
gsql example3.gsql

Output of Example 3.
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"Result": [{
    "v_id": "28587302323577",
    "attributes": {
      "Result.@personCnt": 25,
      "Result.@commentCnt": 152,
      "Result.@postCnt": 96
    },
    "v_type": "Person"
  }]}]
}
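When the set of target types is not known in advance, one possible variation is to key the counts by vertex type in a single map instead of writing one CASE branch per type. The sketch below is an assumption-laden alternative, not the tutorial's method: the accumulator name @@cntByType is made up, and the query has not been checked in interpreted mode. Unlike Example 3, it also counts any other vertex types reachable from the person over a right-directed edge.

```gsql
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   # Hypothetical global map: target vertex type -> number of matched edges.
   MapAccum<string, int> @@cntByType;

   # Start with all persons.
   Seed = {Person.*};

   # Same 1-hop pattern as Example 3.
   Result = SELECT s
            FROM Seed:s - (_>:e) - :tgt
            WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
            ACCUM @@cntByType += (tgt.type -> 1);

   PRINT @@cntByType;
}
```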

Example 4. Do the same as Example 3, but use a left-directed edge pattern.

Note in the query below that the Seed is now {Person.*, Comment.*, Post.*}, the three types of entities that are targets of edges from a Person.

In the current version, the vertex set on the left side of the pattern must be defined in a previous statement (e.g., a seed statement), the same requirement as for FROM clauses in v1 syntax. In the example below, the current version of pattern matching would not permit FROM _:s - (<_:e) - Person:tgt

Example 4. Left Directed Any Edge Pattern
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @commentCnt= 0;
   SumAccum<int> @postCnt= 0;
   SumAccum<int> @personCnt= 0;

   #start with all persons, comments, and posts
   Seed = {Person.*, Comment.*, Post.*};
   #1-hop pattern.
   Result = SELECT tgt
            FROM Seed:s - (<_:e) - Person:tgt
            WHERE tgt.firstName == "Viktor" AND tgt.lastName == "Akhiezer"
            ACCUM  CASE WHEN s.type == "Comment" THEN
                          tgt.@commentCnt += 1
                         WHEN s.type == "Post" THEN
                          tgt.@postCnt += 1
                        WHEN s.type == "Person" THEN
                          tgt.@personCnt += 1
                    END;

    PRINT  Result[Result.@commentCnt, Result.@postCnt, Result.@personCnt];
}

You can copy the above GSQL script to a file named example4.gsql, and invoke this script file in Linux. The output should be the same as in Example 3.

Example 5. Find the two oldest persons who either know "Viktor Akhiezer" or are known by "Viktor Akhiezer". KNOWS is a directed relationship, so we need to include both directions in the pattern.

Example 5. Disjunctive 1-hop edge pattern.
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   #start with all persons.
   Seed = {Person.*};
   #1-hop pattern.
   friends = SELECT p
             FROM Seed:s - ((<Person_KNOWS_Person|Person_KNOWS_Person>):e) - Person:p
             WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
             ORDER BY p.birthday ASC
             LIMIT 2;

   PRINT friends;
}

You can copy the above GSQL script to a file named example5.gsql, and invoke this script file in Linux:

Linux Bash
gsql example5.gsql

Output of Example 5.
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"friends": [
    {
      "v_id": "10995116279461",
      "attributes": {
        "birthday": "1980-05-13 00:00:00",
        "firstName": "Gregorio",
        "lastName": "Cajes",
        "gender": "male",
        "speaks": [
          "en",
          "tl"
        ],
        "browserUsed": "Firefox",
        "locationIP": "110.55.251.62",
        "id": 10995116279461,
        "creationDate": "2010-12-16 18:12:57",
        "email": ["Gregorio10995116279461@gmail.com"],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    },
    {
      "v_id": "4398046517846",
      "attributes": {
        "birthday": "1980-04-24 00:00:00",
        "firstName": "Abdul-Malik",
        "lastName": "Glosca",
        "gender": "male",
        "speaks": [
          "ar",
          "en"
        ],
        "browserUsed": "Chrome",
        "locationIP": "109.200.168.137",
        "id": 4398046517846,
        "creationDate": "2010-05-21 00:07:05",
        "email": [
          "Abdul-Malik4398046517846@gmail.com",
          "Abdul-Malik4398046517846@gmx.com",
          "Abdul-Malik4398046517846@land.ru"
        ],
        "@multPropagAcc_1": 0
      },
      "v_type": "Person"
    }
  ]}]
}

Example 6. Find the total number of comments and posts created by "Viktor Akhiezer". Again, we include two types of edges, but in this case, we count them together.

Example 6. Disjunctive 1-hop edge pattern.
USE GRAPH ldbc_snb
#pattern match syntax version is v2
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   SumAccum<int> @@cnt = 0;
   Seed = {Person.*};

   messages = SELECT t
              FROM Seed:s-((<Comment_HAS_CREATOR_Person|<Post_HAS_CREATOR_Person):e1)-(Comment|Post):t
              WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
              ACCUM  @@cnt += 1 ;

  PRINT @@cnt;
}

You can copy the above GSQL script to a file named example6.gsql, and invoke this script file in Linux:

Linux Bash
gsql example6.gsql

Output of Example 6.
Using graph 'ldbc_snb'
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@cnt": 89}]
}
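If separate totals are wanted instead, the same disjunctive pattern can be combined with the CASE expression from Example 3. The following is a minimal sketch under that assumption; the accumulator names @@commentCnt and @@postCnt and the set name messages are chosen here for illustration. The two printed counts should add up to the single total of Example 6.

```gsql
USE GRAPH ldbc_snb
SET syntax_version="v2"

INTERPRET QUERY () FOR GRAPH ldbc_snb {
   # Hypothetical global counters, one per message type.
   SumAccum<int> @@commentCnt = 0;
   SumAccum<int> @@postCnt = 0;
   Seed = {Person.*};

   # Same pattern as Example 6, but the count is split by target type.
   messages = SELECT t
              FROM Seed:s-((<Comment_HAS_CREATOR_Person|<Post_HAS_CREATOR_Person):e1)-(Comment|Post):t
              WHERE s.firstName == "Viktor" AND s.lastName == "Akhiezer"
              ACCUM CASE WHEN t.type == "Comment" THEN
                             @@commentCnt += 1
                         WHEN t.type == "Post" THEN
                             @@postCnt += 1
                    END;

   PRINT @@commentCnt, @@postCnt;
}
```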
