Cosine Similarity of Neighborhoods (All Pairs, Batch)
This algorithm computes the same similarity scores as the single-source Cosine Similarity of Neighborhoods algorithm.
Instead of selecting a single source vertex, however, it calculates similarity scores for all vertex pairs in the graph in parallel.
Because this is a memory-intensive operation, the computation is split into batches to reduce peak memory usage. The user specifies the number of batches.
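The batching idea can be illustrated outside the database with a minimal Python sketch. It is not the library's GSQL implementation. It assumes the usual definition of cosine similarity over weighted out-neighborhoods (the dot product of two vertices' edge-weight vectors divided by the product of their norms). The function name `cosine_all_pairs_batched` and the index-modulo batch assignment are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def cosine_all_pairs_batched(edges, top_k, num_of_batches=1):
    """Return {source: [(other, score), ...]} with the top_k scores per source.

    edges is an iterable of (source, target, weight) directed edges.
    Only vertices with at least one outgoing edge appear in the result.
    """
    # Weighted out-neighborhood vector of every source vertex.
    out = defaultdict(dict)
    for src, dst, weight in edges:
        out[src][dst] = weight

    # Reverse index: neighbor -> [(source, weight)], used to find shared neighbors.
    incoming = defaultdict(list)
    for src, nbrs in out.items():
        for dst, weight in nbrs.items():
            incoming[dst].append((src, weight))

    norms = {v: math.sqrt(sum(w * w for w in nbrs.values()))
             for v, nbrs in out.items()}

    sources = sorted(out)
    result = {}
    for batch in range(num_of_batches):
        # Assumption: sources are assigned to batches by index modulo.
        batch_sources = [v for i, v in enumerate(sources)
                         if i % num_of_batches == batch]
        for a in batch_sources:
            # Dot products are accumulated only for the current batch,
            # which is what bounds the peak memory per batch.
            dots = defaultdict(float)
            for nbr, w_a in out[a].items():
                for b, w_b in incoming[nbr]:
                    if b != a:
                        dots[b] += w_a * w_b
            scores = [(dot / (norms[a] * norms[b]), b)
                      for b, dot in dots.items()
                      if norms[a] > 0 and norms[b] > 0]
            # Highest score first; ties broken by vertex name.
            scores.sort(key=lambda item: (-item[0], item[1]))
            result[a] = [(b, round(s, 5)) for s, b in scores[:top_k]]
    return result


# Made-up toy graph, not the social10 data set.
toy_edges = [("A", "X", 1.0), ("A", "Y", 1.0), ("B", "X", 1.0),
             ("C", "Y", 2.0), ("C", "Z", 2.0)]
toy_scores = cosine_all_pairs_batched(toy_edges, top_k=2, num_of_batches=2)
```

In this sketch the number of batches changes only how many sources are processed at a time, not the scores: more batches mean fewer intermediate dot products held in memory at once, at the cost of more passes.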
Specifications
CREATE QUERY tg_cosine_nbor_ap_batch(STRING vertex_type, STRING edge_type,
STRING edge_attribute, INT top_k, BOOL print_results = true,
STRING file_path, STRING similarity_edge, INT num_of_batches = 1)
Parameters
Name | Description |
---|---|
vertex_type | Vertex type to calculate similarity for |
edge_type | Directed edge type to traverse |
edge_attribute | Name of the attribute on the edge type to use as the weight |
top_k | Number of top scores to report for each vertex |
print_results | If true, print the results to the output. |
similarity_edge | If provided, the similarity score will be saved to this edge. |
file_path | If not empty, write output to this file in CSV. |
num_of_batches | Number of batches to divide the query into |
Example
Using the social10 graph, we can calculate the cosine similarity of every person to every other person connected by the Friend edge, and print out the top k most similar vertices for each vertex.
We run tg_cosine_nbor_ap_batch("Person", "Friend", "weight", 5, true, "", "", 1):
[
  {
    "start": [
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 0.49903,
              "ver": "Howard"
            },
            {
              "val": 0.43938,
              "ver": "George"
            },
            {
              "val": 0.05918,
              "ver": "Alex"
            },
            {
              "val": 0.05579,
              "ver": "Ivy"
            }
          ]
        },
        "v_id": "Fiona",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": []
        },
        "v_id": "Justin",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": []
        },
        "v_id": "Bob",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 0.22361,
              "ver": "Bob"
            },
            {
              "val": 0.21213,
              "ver": "Alex"
            }
          ]
        },
        "v_id": "Chase",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 0.57143,
              "ver": "Bob"
            },
            {
              "val": 0.12778,
              "ver": "Chase"
            }
          ]
        },
        "v_id": "Damon",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": []
        },
        "v_id": "Alex",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 0.64253,
              "ver": "Alex"
            },
            {
              "val": 0.63607,
              "ver": "Ivy"
            },
            {
              "val": 0.27091,
              "ver": "Howard"
            },
            {
              "val": 0.14364,
              "ver": "Fiona"
            }
          ]
        },
        "v_id": "George",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": []
        },
        "v_id": "Eddie",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 0.94848,
              "ver": "Fiona"
            },
            {
              "val": 0.6364,
              "ver": "Alex"
            },
            {
              "val": 0.31046,
              "ver": "George"
            },
            {
              "val": 0.1118,
              "ver": "Howard"
            }
          ]
        },
        "v_id": "Ivy",
        "v_type": "Person"
      },
      {
        "attributes": {
          "start.@heap": [
            {
              "val": 1.09162,
              "ver": "Fiona"
            },
            {
              "val": 0.78262,
              "ver": "Ivy"
            },
            {
              "val": 0.11852,
              "ver": "George"
            }
          ]
        },
        "v_id": "Howard",
        "v_type": "Person"
      }
    ]
  }
]
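To work with this output in a client program, the nested structure can be flattened into one ranked list per vertex. The following is a minimal Python sketch that assumes the response has exactly the shape shown above (a list whose first element holds a "start" array, with each vertex carrying a "start.@heap" attribute). The helper name `top_matches` is hypothetical, and the embedded JSON is a two-vertex excerpt of the output above.

```python
import json

# Excerpt of the output above: the entries for Fiona and Justin.
raw = """
[{"start": [
  {"attributes": {"start.@heap": [
      {"val": 0.49903, "ver": "Howard"},
      {"val": 0.43938, "ver": "George"},
      {"val": 0.05918, "ver": "Alex"},
      {"val": 0.05579, "ver": "Ivy"}]},
   "v_id": "Fiona", "v_type": "Person"},
  {"attributes": {"start.@heap": []},
   "v_id": "Justin", "v_type": "Person"}
]}]
"""


def top_matches(result_json):
    """Map each vertex ID to its list of (similar_vertex, score) pairs."""
    rows = json.loads(result_json)
    matches = {}
    for vertex in rows[0]["start"]:
        heap = vertex["attributes"]["start.@heap"]
        # Heap entries are kept in the order the query returned them.
        matches[vertex["v_id"]] = [(entry["ver"], entry["val"]) for entry in heap]
    return matches


matches = top_matches(raw)
for v_id, similar in matches.items():
    print(v_id, similar)
```

A vertex with an empty heap, such as Justin here, simply maps to an empty list.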