Similarity Algorithms

The TigerGraph Graph Data Science Library provides two types of similarity measures: neighborhood similarity algorithms and vector similarity functions.

Neighborhood Similarity Algorithms

Two vertices are neighbors of one another if they are directly connected by an edge. The neighborhood of a vertex is its set of neighbors.

Neighborhood similarity algorithms measure the degree to which the neighborhoods of vertices contain the same or similar members. If two vertices have exactly the same neighbors, then they have perfect neighborhood similarity.
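The idea of comparing neighborhoods can be illustrated with a small Python sketch. This is not the library's GSQL implementation. The toy graph and the helper name jaccard_neighborhood are assumptions made for illustration, and Jaccard similarity is used here as one common way to score overlap between two neighbor sets.

```python
# Toy undirected graph stored as an adjacency map (hypothetical data).
graph = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B", "E"},
    "D": {"A", "B"},
    "E": {"C"},
}


def jaccard_neighborhood(graph, u, v):
    """Return |N(u) & N(v)| / |N(u) | N(v)| for the neighbor sets of u and v."""
    neighbors_u, neighbors_v = graph[u], graph[v]
    union = neighbors_u | neighbors_v
    if not union:
        # Two isolated vertices: define the similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


# A and B have exactly the same neighbors {C, D}, so the score is 1.0.
perfect = jaccard_neighborhood(graph, "A", "B")

# C and D share A and B, but C also has E, so the score is 2/3.
partial = jaccard_neighborhood(graph, "C", "D")
```

In this sketch, vertices A and B have perfect neighborhood similarity even though they are not connected to each other, which shows that the measure depends only on shared neighbors.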

The TigerGraph Graph Data Science Library provides the following neighborhood similarity algorithms:

Cosine Similarity of Neighborhoods
- Cosine Similarity of Neighborhoods (All Pairs, Batch)
- Cosine Similarity of Neighborhoods (Single-Source)
Jaccard Similarity of Neighborhoods
- Jaccard Similarity of Neighborhoods (All Pairs, Batch)
- Jaccard Similarity of Neighborhoods (Single-Source)
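
The difference between the single-source and all-pairs variants can be sketched in Python as follows. This is a minimal illustration, not the library's GSQL code. It assumes unweighted edges, so cosine similarity reduces to the shared-neighbor count divided by the geometric mean of the two neighborhood sizes, whereas the library's algorithms may take edge weights and other parameters into account. The graph data and the function names cosine_neighborhood, single_source, and all_pairs are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Toy undirected graph stored as an adjacency map (hypothetical data).
graph = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B", "E"},
    "D": {"A", "B"},
    "E": {"C"},
}


def cosine_neighborhood(graph, u, v):
    """Cosine similarity of two neighbor sets, treating each edge as weight 1."""
    neighbors_u, neighbors_v = graph[u], graph[v]
    if not neighbors_u or not neighbors_v:
        return 0.0
    return len(neighbors_u & neighbors_v) / sqrt(len(neighbors_u) * len(neighbors_v))


def single_source(graph, source, top_k=3):
    """Score every other vertex against one source vertex and keep the top_k."""
    scores = [
        (other, cosine_neighborhood(graph, source, other))
        for other in graph
        if other != source
    ]
    # Highest score first; ties are broken by vertex name for a stable order.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_k]


def all_pairs(graph):
    """Score every unordered pair of distinct vertices."""
    return {
        (u, v): cosine_neighborhood(graph, u, v)
        for u, v in combinations(sorted(graph), 2)
    }


top_for_a = single_source(graph, "A")
pair_scores = all_pairs(graph)
```

The single-source form does work proportional to the number of vertices, while the all-pairs form grows with the number of vertex pairs, which is one reason a batched variant is useful on large graphs.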

Vector Similarity Functions

The TigerGraph Graph Data Science Library also provides vector similarity functions. The vectors can be either LIST-type attributes or in-query ListAccum accumulators.

The functions are implemented as options of one master function, tg_similarity_accum(VectorA, VectorB, function_specifier_string).
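
The "one master function with a specifier string" design can be sketched in Python as below. This is only an analogue of the pattern and not the GSQL function itself. The function name similarity, the specifier strings "cosine" and "jaccard", and the error handling are all assumptions made for illustration. The specifier values actually accepted by tg_similarity_accum are listed in the GSQL Reference Manual.

```python
from math import sqrt


def cosine(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("cosine similarity needs vectors of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity, treating each list as a set of values."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical specifier strings mapped to their implementations.
_MEASURES = {"cosine": cosine, "jaccard": jaccard}


def similarity(vector_a, vector_b, specifier):
    """Dispatch to the measure named by the specifier string."""
    try:
        measure = _MEASURES[specifier.lower()]
    except KeyError:
        raise ValueError(f"unknown similarity measure: {specifier!r}") from None
    return measure(vector_a, vector_b)


orthogonal = similarity([1, 0], [0, 1], "cosine")    # vectors at right angles
overlap = similarity([1, 2, 3], [2, 3, 4], "jaccard")  # 2 shared of 4 distinct values
```

A single entry point keeps the call site the same for every measure, so switching measures means changing only the string argument.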

Because they are functions, they are documented in the GSQL Reference Manual section on Vector Functions.

Technically, these are functions, not algorithms. A GSQL algorithm is itself a query, whereas a function is called from within a query to compute a value.
These vector functions look only at lists or sets of values. They do not look at vertices or edges.
These vector functions are currently implemented as pre-installed user-defined functions (UDFs) within the Graph Data Science Library. You do not need to install any particular algorithm to use them.