So far, we have described pattern matching as one path pattern in a FROM clause. In this section, we introduce GSQL's capability to match multiplepatterns in one FROM clause. This extension is called Conjunctive Pattern Matching (CPM), because the query asks for the conjunction (logical AND) of the patterns. To get a match, all of the patterns must be satisfied, and the patterns can interrelate. Visually, you can think of patterns formed by a set of intersecting line segments. This feature, introduced as a Beta feature in TigerGraph 3.0, enables you to express complex patterns concisely in a single query block.
In general, a CPM query block consists of multiple patterns in the FROM clause. It has a structure illustrated below.
# Conjunctive Pattern Matching Syntax
SelectBlock :=SELECT alias FROM pattern [sampleClause] [whereClause] [ [perClause] accumClause] [postAccumClause]* ...pattern := vertexPattern | edgePattern | (pathPattern ["," pathPattern])# vertexPattern and edgePattern are from classic GSQL
We elaborate on each of the clause.
SELECT Clause
The SELECT clause selects only one vertex alias from all the patterns in the FROM clause.
FROM Clause - Conjunctive Matching
This is where the conjunctive matching is expressed. The FROM clause consists of a list of path patterns, which are separated by commas. Evaluating each pattern against the underlying graph data produces a match table. If two patterns share a vertex alias, then we form the natural join of the two match tables.
The first pattern's variables are x, e1, y, e2, and z; the second pattern's variables are z, e3, u, e4, and v. Considering the two patterns independently would yield the follwing match table schemas:
#first match table(x, e1, y, e2, z)#second match table(z, e3, u, e4, v)
Natural Join of Match Tables
Natural joining two match tables compares all the shared vertex aliases between the two tables, and the resulting joined table contains all non-shared variables plus one copy of each of the shared vertex variables. Here is the match table for the CPM above:
#natural join result; the shared vertex variable z appears once.(x, e1, y, e2, z, e3, u, e4, v)
Valid Conjunctive Patterns
The match table of the conjunctive pattern match is the natural join of all the patterns' match tables. By design, a row in the CPM match table must simultaneously satisfy all the match tables.
If the match tables of the patterns in a FROM clause can be naturally joined into one match table, then the FROM clause has a valid CPM input. Otherwise, the FROM clause has an invalid pattern input list.
For example, below we show two valid CPM inputs and one invalid CPM input.
# a valid CPM, since the two patterns natrually joinon :tgtSELECTFROM Person:p - (KNOWS) - :tgt, Post:s -(<LIKES) - :tgt # a valid CPM, since the two patterns naturally joinon :fSELECTFROM Person:p - (KNOWS) - :f - (LIKES>) - Post:tgt, :f - (LIKES>) - Comment:c # an invalid CPM, since the two patterns do not share any vertex variables.# they cannot be naturally joined. # One pattern has (p, tgt); the other has (s, t). SELECTFROM Person:p - (KNOWS) - :tgt, Post:s - (<LIKES) - Person:t
WHERE Clause
The predicates in the WHERE clause can use any of the vertex or edge aliases in any of the patterns, including predicates which combine variables from different constituent paths. CPM queries do not have any special restrictions on the WHERE predicate. Distance matters, however, for performance. Conditions that are local, measured both cross-path and within-path, can be resolved earlier and therefore are faster.
In the example below, x2.age > x4.age is a cross-pattern predicate, e1.timestamp < e3.timestamp is a cross-pattern predicate, and x1.gender == x4.gender is a local predicate of the second pattern.
FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, X1:x1-(E3:e3)-X4:x4WHERE x2.age > x4.age AND e1.timestamp < e3.timestamp AND x1.gender == x4.gender
ACCUM Clause
You can ACCUM to any vertex variable in a CPM block.
The ACCUM clause by default will execute as many timesas the row (match) count of the CPM match table; each execution uses one row from the match table.
ACCUM To The Three Vertex Variables of A CPM Pattern
#accum to x1, x2 and x4. FROM X1:x1-(E1:e1)-X2:x2-(E2:e2)-X3:x3, X1:x1-(E3:e3)-X4:x4ACCUM x1.@cnt +=1, x2.@cnt += x3.quantity, x4.@cnt += x3.quantity
POST-ACCUM Clause
POST-ACCUM for CPM behaves the same as POST-ACCUM for single path patterns. That is, each POST-ACCUM clause can refer to one vertex alias, and the clause executes iteratively over that vertex set (e.g. one vertex column in the matching table). Its statements can access the aggregated accumulator result computed in the ACCUM clause. The query can have multiple POST-ACCUM clauses, one for each vertex alias you wish to work on. The multiple POST-ACCUM clauses are processed in parallel; it doesn't matter in what order you write them. (For each binding, the statements within a clause are executed in order.)
For example, below we have three POST-ACCUM clauses. The first one iterates throughx1, and for each x1, we do @@cnt += x1.@cnt. The second and third POST-ACCUMs iterate through x2 and x3 respectively, and accumulates their @cnt accumulator value into @@cnt.
POST-ACCUM to a global accumulator @@cnt, using three CPM Vertex Variables
Example 1. Find Viktor Akhiezer's liked messages (100+ days after their creation) whose author's last name begin with letter S. Output the message's forum.
Example 2. Find any authors who wrote posts that Viktor Akhiezer's liked and whose last name begins with S. Find the country for each of these authors and report on the countries.
Example 3. Given a TagClass and a Country, find all the Forums created in the given Country, containing at least one Post with Tags belonging directly to the given TagClass. The location of a Forum is identified by the location of the Forum’s moderator.
Example 4. For a given country, count all the distinct triples of Persons such that:
a is a friend of b.
b is a friend of c
c is a friend of a.
Distinct means that if a certain 3 vertices appear once in the results, it will not be repeated: it will appear only once. KNOWS is an undirected relationship, so it doesn't matter in what order we list the 3 vertices.
More Examples. We translated LDBC-SNB BI and IC queries using CPM, and shared the translation in github. Please refer to the query translation here. Most of the queries are installed as functions, you can find sample parameter(s) of the functions from here.
Source Vertex Set Flexibility
As mentioned when we first described pattern matching, in One-hop patterns, the source (leftmost) vertex set can be a vertex type, an alternation of types, or even omitted.
Example 1. Find Viktor Akhiezer's favorite messages' creators whose last name begins with letter S. Count them.