WCC (Small-World Optimized)

In addition to the regular weakly connected component algorithm, we also provide a version that is optimized for small-world graphs. A small world graph means the graph has a hub community, where the vast majority of the vertices in the graph are weakly connected.

This version improves upon the performance of the original algorithm when dealing with small-world graphs by combining several different methods used to find connected components in a multi-step process proposed by Slota et al. in BFS and Coloring-based Parallel Algorithms for Strongly Connected Components and Related Problems.

The algorithm starts by selecting an initial pivot vertex v with a high product of indegree and outdegree. From the initial pivot vertex, the algorithm uses Breadth-First Search to determine the massive weakly connected component. The vertices that are not included in this SCC are passed off to the next step.

After identifying the first WCC, the algorithm uses the coloring method to identify the WCCs in the remaining vertices.

Specifications

CREATE QUERY tg_wcc_small_world(STRING v_type, STRING e_type,
UINT threshold = 100000, BOOL to_show_cc_count=FALSE)

Time complexity

The algorithm has a time complexity of \(O(E*d)\), where \(E\) is the number of edges and \(d\) is max(diameter of components).

Parameters

Name Description Data type

v_type

The vertex type to count as part of a connected component

STRING

e_type

The edge type to traverse

STRING

threshold

The threshold used to choose initial pivot vertices. Only vertices whose product of indegree and outdegree exceed this threshold will be considered candidates for the pivot vertex. This is an attempt to increase the chances that the initial pivot is contained within the largest WCC. The default value for this parameter is 100000. It is suggested that you keep this default value when running the algorithm.

UINT

to_show_cc_count

If true, the algorithm will return the number of vertices in each connected component.

BOOL

Result

If to_show_cc_count is set to true, the algorithm will return the number of vertices in each weakly connected component.

Example

Suppose we have the following graph. We can see that there are three connected components. The first one has 5 vertices, while the two others have 3 vertices.

img

  • GSQL Command

  • Result

RUN QUERY tg_wcc_small_world("Person", "Coworker", _, true)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"@@CC_count": {
    "1048576": 5,
    "1048577": 3,
    "4194306": 3
  }}]
}