Strongly Connected Components
A strongly connected component (SCC) is a subgraph such that there is a path from any vertex to every other vertex. A graph can contain more than one separate SCC. An SCC algorithm finds the maximal SCCs within a graph. Our implementation is based on the Divide-and-Conquer Strong Components (DCSC) algorithm[1]. In each iteration, pick a pivot vertex v
randomly, and find its descendant and predecessor sets, where descendant set D_v
is the vertex reachable from v
, and predecessor set P_v
is the vertices which can reach v
(stated another way, reachable from v
through reverse edges). The intersection of these two sets is a strongly connected component SCC_v
. The graph can be partitioned into 4 sets: SCC_v
, descendants D_v
excluding SCC_v
, predecessors P_v
excluding SCC
, and the remainders R_v
. It is proved that any SCC is a subset of one of the 4 sets [1]. Thus, we can divide the graph into different subsets and detect the SCCs independently and iteratively.
The problem of this algorithm is unbalanced load and slow convergence when there are a lot of small SCCs, which is often the case in real-world use cases [3]. We added two trimming stages to improve the performance: size-1 SCC trimming[2] and weakly connected components[3].
The implementation of this algorithm requires reverse edges for all directed edges considered in the graph.
[1] Fleischer, Lisa K., Bruce Hendrickson, and Ali Pınar. "On identifying strongly connected components in parallel." International Parallel and Distributed Processing Symposium. Springer, Berlin, Heidelberg, 2000.
[2] Mclendon Iii, William, et al. "Finding strongly connected components in distributed graphs." Journal of Parallel and Distributed Computing 65.8 (2005): 901-910.
[3] Hong, Sungpack, Nicole C. Rodia, and Kunle Olukotun. "On fast parallel detection of strongly connected components (SCC) in small-world graphs." Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. ACM, 2013.
Specifications
Characteristic | Value |
Result | Assigns a component id (INT) to each vertex, such that members of the same component have the same id value. |
Input Parameters |
|
Result Size | V = number of vertices |
Time Complexity | O(iter*d), d = max(diameter of components) |
Graph Types | Directed edges with reverse direction edges as well |
Example
We ran scc
on the social26 graph. A portion of the JSON result is shown below.
The first element "i"=1
means the whole graph is processed in just one iteration. The 5 "trim_set.size()"
elements mean there were 5 rounds of size-1 SCC trimming. The final "@@.cluster_dist_heap" object"
reports on the size distribution of SCCs.There is one SCC with 9 vertices, and 17 SCCs with only 1 vertex in the graph.
Last updated