Example - A Recommender
The previous sections covered the basic pattern match syntax. In this section, we use that syntax to build two end-to-end solutions.
A Recommendation Application
In this example, we want to recommend some messages (comments or posts) to the person Viktor Akhiezer.
How do we do this?
One way is to find others who like the same messages Viktor likes, then recommend the messages that Others like but Viktor has not seen. The pattern can be sketched out as follows:
- Viktor - (Likes>) - Message - (<Likes) - Others
- Others - (Likes>) - NewMessage
- Recommend NewMessage to Viktor
However, this approach is too granular: matching people on individual messages overfits the collaborative filtering algorithm to message-level data.
Intuitively, two persons are similar to each other when their "liked" messages fall into the same category - here represented by the set of tags attached to each message.
One way to avoid overfitting is therefore to go up one level. Instead of looking at common messages, we look at their tags: we consider Person A and Person B similar if they like messages that carry the same tag. This reduces the overfitting problem. In pattern match vocabulary, we have:
- Viktor - (Likes>) - Message - (Has>) - Tag - (<Has) - Message - (<Likes) - Others
- Others - (Likes>) - NewMessage
- Recommend NewMessage to Viktor
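To see why the tag-level pattern reaches more candidates than the message-level one, here is a minimal plain-Python sketch of both patterns. It is not GSQL, and the people, messages, and tags are toy data invented for illustration. The function names (others_by_message, others_by_tag, new_messages) are hypothetical helpers, and unlike a graph pattern the sketch explicitly excludes the input person from Others.

```python
# Toy data, invented for illustration only (not from the LDBC SNB graph).
likes = {
    "Viktor": {"m1", "m2"},
    "Ana": {"m1", "m3"},   # shares message m1 with Viktor
    "Bo": {"m4", "m5"},    # shares no message, but m4 carries the tag "jazz"
}
tags = {
    "m1": {"jazz"},
    "m2": {"chess"},
    "m3": {"jazz"},
    "m4": {"jazz"},
    "m5": {"cooking"},
}


def others_by_message(person):
    """Person - (Likes>) - Message - (<Likes) - Others."""
    mine = likes[person]
    return {p for p, msgs in likes.items() if p != person and msgs & mine}


def others_by_tag(person):
    """Person - (Likes>) - Message - (Has>) - Tag - (<Has) - Message - (<Likes) - Others."""
    my_tags = set().union(*(tags[m] for m in likes[person]))
    return {
        p
        for p, msgs in likes.items()
        if p != person and any(tags[m] & my_tags for m in msgs)
    }


def new_messages(person, others):
    """Others - (Likes>) - NewMessage, minus what the person already likes."""
    liked_by_others = set().union(*(likes[p] for p in others))
    return liked_by_others - likes[person]


print(sorted(others_by_message("Viktor")))
print(sorted(others_by_tag("Viktor")))
print(sorted(new_messages("Viktor", others_by_tag("Viktor"))))
```

In this toy data, Bo shares no message with Viktor, so the message-level pattern never reaches Bo, while the tag-level pattern does because m4 carries a tag that Viktor's liked messages also carry.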
GSQL recommend_message Application
This time, we create the query first, then interpret it by calling the query name with parameters.
If we are satisfied with this query, we can use INSTALL QUERY queryName to install it, which improves performance.
CREATE QUERY recommend_message (STRING fn, STRING ln) SYNTAX v2{
SumAccum<int> @tag_in_common;
SumAccum<float> @similarity_score;
SumAccum<float> @rank;
OrAccum @Liked = false;
// 1. mark messages liked by Viktor
// 2. calculate a log similarity score for each person who shares the same
// interests at the Tag level.
Others =
SELECT p
FROM Person:s-(Likes>)-:msg - (Has_Tag>.<Has_Tag.<Likes)- :p
WHERE s.first_name == fn AND s.last_name == ln
ACCUM msg.@Liked = true, p.@tag_in_common +=1
POST-ACCUM p.@similarity_score = log (1 + p.@tag_in_common);
// recommend new messages to Viktor that have not been liked by him.
recommended_message =
SELECT msg
FROM Others:o-(Likes>) - :msg
WHERE msg.@Liked == false
ACCUM msg.@rank +=o.@similarity_score
ORDER BY msg.@rank DESC
LIMIT 2;
PRINT recommended_message[recommended_message.content, recommended_message.@rank];
}
INTERPRET QUERY recommend_message ("Viktor", "Akhiezer")
// try a second person by changing only the parameters.
INTERPRET QUERY recommend_message ("Adriaan", "Jong")
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [{"recommended_message": [
{
"v_id": "549760294602",
"attributes": {
"recommended_message.@rank": 4855.48975,
"recommended_message.content": "About Indira Gandhi, Gandhi established closer relatAbout Mick Jagger, eer of the band. In 1989, he waAbout Ho Chi Minh, ce Unit and ECA International, About Ottoman Empire, After t"
},
"v_type": "Post"
},
{
"v_id": "549760292109",
"attributes": {
"recommended_message.@rank": 4828.72168,
"recommended_message.content": "About Ho Chi Minh, nam, as an anti-communist state, fought against the communisAbout Shiny Happy People, sale in the U."
},
"v_type": "Post"
}
]}]
}
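To make the two steps of recommend_message concrete, here is a minimal plain-Python sketch of the same scoring on toy data. It is an illustration under stated assumptions, not the GSQL engine's implementation: the data is invented, the function name recommend is hypothetical, and we assume that @tag_in_common receives one increment per matched path from the input person to another person.

```python
import math
from collections import defaultdict

# Toy data, invented for illustration only (not from the LDBC SNB graph).
likes = {
    "Viktor": {"m1", "m2"},
    "Ana": {"m1", "m3"},
    "Bo": {"m4", "m5"},
}
tags = {
    "m1": {"jazz"},
    "m2": {"chess"},
    "m3": {"jazz"},
    "m4": {"jazz"},
    "m5": {"cooking"},
}


def recommend(person, limit=2):
    liked = likes[person]  # plays the role of msg.@Liked = true

    # Step 1: count tag-level paths person -> msg -> tag -> msg2 -> other.
    # Assumption: one increment per matched path. As in the query, the input
    # person is not filtered out here.
    tag_in_common = defaultdict(int)
    for msg in liked:
        for tag in tags[msg]:
            for other, other_msgs in likes.items():
                for msg2 in other_msgs:
                    if tag in tags[msg2]:
                        tag_in_common[other] += 1
    similarity = {p: math.log(1 + n) for p, n in tag_in_common.items()}

    # Step 2: every message the person has not liked collects the similarity
    # score of each person who likes it.
    rank = defaultdict(float)
    for other, score in similarity.items():
        for msg in likes[other]:
            if msg not in liked:
                rank[msg] += score

    # ORDER BY rank DESC, with the message id as a deterministic tie-breaker.
    return sorted(rank.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


for message, score in recommend("Viktor"):
    print(message, round(score, 4))
```

The input person also shows up among the similar persons, but contributes nothing to the ranking because every message they like is already marked as liked.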
Install the query
When you are satisfied with your query in the GSQL interpreted mode, you can install it as a generic service. This speeds up querying considerably.
Because we created the query with the CREATE QUERY … syntax, it has already been added to the catalog, so we only need to set the syntax version and install it.
// before installing the query, need to set the syntax version
SET syntax_version="v2"
USE GRAPH ldbc_snb
// install query
INSTALL QUERY recommend_message
GSQL > INSTALL QUERY recommend_message
Start installing queries, about 1 minute ...
recommend_message query: curl -X GET 'http://127.0.0.1:14240/restpp/query/ldbc_snb/recommend_message?fn=VALUE&ln=VALUE'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
[========================================================================================================] 100% (1/1)
GSQL > run query recommend_message("Viktor", "Akhiezer")
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [{"recommended_message": [
{
"v_id": "549760294602",
"attributes": {
"recommended_message.@rank": 4855.49219,
"recommended_message.content": "About Indira Gandhi, Gandhi established closer relatAbout Mick Jagger, eer of the band. In 1989, he waAbout Ho Chi Minh, ce Unit and ECA International, About Ottoman Empire, After t"
},
"v_type": "Post"
},
{
"v_id": "549760292109",
"attributes": {
"recommended_message.@rank": 4828.7251,
"recommended_message.content": "About Ho Chi Minh, nam, as an anti-communist state, fought against the communisAbout Shiny Happy People, sale in the U."
},
"v_type": "Post"
}
]}]
}
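The installation output above also lists a REST endpoint for the installed query. As a minimal sketch, the request URL and optional authorization header can be assembled in Python as follows. The host, port, path, and the fn and ln parameter names are copied from that output; the helper names build_query_url and build_headers are hypothetical, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode


def build_query_url(first_name, last_name, host="127.0.0.1", port=14240,
                    graph="ldbc_snb", query="recommend_message"):
    """Build the URL shown in the INSTALL QUERY output, with real parameter values."""
    params = urlencode({"fn": first_name, "ln": last_name})
    return f"http://{host}:{port}/restpp/query/{graph}/{query}?{params}"


def build_headers(token=None):
    """Return the Authorization header only when a token is supplied."""
    return {"Authorization": f"Bearer {token}"} if token else {}


print(build_query_url("Viktor", "Akhiezer"))
```

The resulting URL and headers can be passed to any HTTP client, for example urllib.request from the standard library, to issue the same GET request as the curl command.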
The recommend_message query above scores similarity with a log transform, log(1 + tag_in_common), which we refer to as log-cosine. We can also use cosine similarity, computed from the two persons' liked messages.
CREATE QUERY recommend_message (STRING fn, STRING ln) SYNTAX v2{
SumAccum<int> @msg_in_common = 0;
SumAccum<int> @msg_cnt = 0 ;
SumAccum<int> @@input_person_msg_cnt= 0;
SumAccum<float> @similarity_score;
SumAccum<float> @rank;
SumAccum<float> @tag_cnt = 0;
OrAccum @Liked = false;
float sqrt_of_input_person_msg_cnt;
//1. mark messages liked by input user
//2. find common messages between input user and other persons
Others =
SELECT p
FROM Person:s-(Likes>)-:msg -(<Likes)-:p
WHERE s.first_name == fn AND s.last_name == ln
ACCUM msg.@Liked = true, @@input_person_msg_cnt+= 1,
p.@msg_in_common += 1;
sqrt_of_input_person_msg_cnt = sqrt(@@input_person_msg_cnt);
//calculate cosine similarity score:
//(A . B) / (sqrt(Sum(A_i^2)) * sqrt(Sum(B_i^2)))
Others =
SELECT o
FROM Others:o-(Likes>)-:msg
ACCUM o.@msg_cnt += 1
POST-ACCUM o.@similarity_score = o.@msg_in_common/(sqrt_of_input_person_msg_cnt * sqrt(o.@msg_cnt));
//recommend new messages to input user that have not been liked by him.
recommended_message =
SELECT msg
FROM Others:o-(Likes>) - :msg
WHERE msg.@Liked == false
ACCUM msg.@rank +=o.@similarity_score
ORDER BY msg.@rank DESC
LIMIT 3;
PRINT recommended_message[recommended_message.content, recommended_message.@rank];
}
INTERPRET QUERY recommend_message ("Viktor", "Akhiezer")
// try a second person by changing only the parameters.
INTERPRET QUERY recommend_message ("Adriaan", "Jong")
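For intuition about the cosine score, here is a minimal plain-Python sketch that treats each person's liked messages as a 0/1 vector. Under that assumption the dot product is the number of messages in common, and each norm is the square root of the number of liked messages. The function name cosine_similarity and the sample sets are hypothetical and used only for illustration.

```python
import math


def cosine_similarity(liked_a, liked_b):
    """Cosine similarity of two sets of liked messages, viewed as 0/1 vectors."""
    if not liked_a or not liked_b:
        return 0.0  # avoid dividing by zero when someone has no likes
    in_common = len(liked_a & liked_b)
    return in_common / (math.sqrt(len(liked_a)) * math.sqrt(len(liked_b)))


viktor = {"m1", "m2"}
other = {"m1", "m3"}
# One message in common out of two liked messages each: approximately 0.5.
print(cosine_similarity(viktor, other))
```

Dividing by both norms keeps a person who likes a very large number of messages from dominating the ranking merely because they overlap with everyone.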