The Strongly Connected Components algorithm
This is documentation for the Graph Algorithms Library, which has been deprecated by the Graph Data Science Library (GDS). 
This section describes the Strongly Connected Components algorithm in the Neo4j Labs Graph Algorithms library.
The Strongly Connected Components (SCC) algorithm finds sets of connected nodes in a directed graph where each node is reachable in both directions from any other node in the same set. It is often used early in a graph analysis process to help us get an idea of how our graph is structured.
The Strongly Connected Components algorithm was developed by the Neo4j Labs team and is not officially supported. 
This section includes:
1. History and explanation
SCC is one of the earliest graph algorithms, and the first lineartime algorithm was described by Tarjan in 1972. Decomposing a directed graph into its strongly connected components is a classic application of the depthfirst search algorithm.
2. Usecases  when to use the Strongly Connected Components algorithm

In the analysis of powerful transnational corporations, SCC can be used to find the set of firms in which every member owns directly and/or indirectly owns shares in every other member. Although it has benefits, such as reducing transaction costs and increasing trust, this type of structure can weaken market competition. Read more in "The Network of Global Corporate Control".

SCC can be used to compute the connectivity of different network configurations when measuring routing performance in multihop wireless networks. Read more in "Routing performance in the presence of unidirectional links in multihop wireless networks"

Strongly Connected Components algorithms can be used as a first step in many graph algorithms that work only on strongly connected graph. In social networks, a group of people are generally strongly connected (For example, students of a class or any other common place). Many people in these groups generally like some common pages, or play common games. The SCC algorithms can be used to find such groups, and suggest the commonly liked pages or games to the people in the group who have not yet liked those pages or games.
3. Strongly Connected Components algorithm sample
A directed graph is strongly connected if there is a path between all pairs of vertices. This algorithm treats the graph as directed, which means that the direction of the relationship is important. A strongly connected component only exists if there are relationships between nodes in both direction.
MERGE (nAlice:User {id:'Alice'})
MERGE (nBridget:User {id:'Bridget'})
MERGE (nCharles:User {id:'Charles'})
MERGE (nDoug:User {id:'Doug'})
MERGE (nMark:User {id:'Mark'})
MERGE (nMichael:User {id:'Michael'})
MERGE (nAlice)[:FOLLOW]>(nBridget)
MERGE (nAlice)[:FOLLOW]>(nCharles)
MERGE (nMark)[:FOLLOW]>(nDoug)
MERGE (nMark)[:FOLLOW]>(nMichael)
MERGE (nBridget)[:FOLLOW]>(nMichael)
MERGE (nDoug)[:FOLLOW]>(nMark)
MERGE (nMichael)[:FOLLOW]>(nAlice)
MERGE (nAlice)[:FOLLOW]>(nMichael)
MERGE (nBridget)[:FOLLOW]>(nAlice)
MERGE (nMichael)[:FOLLOW]>(nBridget);
CALL algo.scc('User','FOLLOW', {write:true,partitionProperty:'partition'})
YIELD loadMillis, computeMillis, writeMillis, setCount, maxSetSize, minSetSize;
Name  Partition 

Alice 
1 
Bridget 
1 
Michael 
1 
Charles 
0 
Doug 
2 
Mark 
2 
We have 3 strongly connected components in our sample graph.
The first, and biggest, component has members Alice, Bridget, and Michael, while the second component has Doug and Mark. Charles ends up in his own component because there isn’t an outgoing relationship from that node to any of the others.
MATCH (u:User)
RETURN u.partition as partition,count(*) as size_of_partition
ORDER by size_of_partition DESC
LIMIT 1
4. Huge graph projection
The default label and relationshiptype projection has a limitation of 2 billion nodes and 2 billion relationships. Therefore, if our projected graph contains more than 2 billion nodes or relationships, we will need to use huge graph projection.
graph:'huge'
in the config:CALL algo.scc('User','FOLLOW', {graph:'huge'})
YIELD loadMillis, computeMillis, writeMillis, setCount;
5. Cypher projection
graph:'cypher'
in the config:CALL algo.scc(
'MATCH (u:User) RETURN id(u) as id',
'MATCH (u1:User)[:FOLLOW]>(u2:User) RETURN id(u1) as source,id(u2) as target',
{write:true,graph:'cypher'})
YIELD loadMillis, computeMillis, writeMillis;
6. Syntax
CALL algo.scc(label:String, relationship:String,
{write:true,writeProperty:'partition',concurrency:4, graph:'huge'})
YIELD loadMillis, computeMillis, writeMillis, setCount, maxSetSize, minSetSize
Name  Type  Default  Optional  Description 

label 
string 
null 
yes 
The label to load from the graph. If null, load all nodes. 
relationship 
string 
null 
yes 
The relationship type to load from the graph. If null, load all relationships. 
write 
boolean 
true 
yes 
Specifies if the result should be written back as a node property. 
writeProperty 
string 
'partition' 
yes 
The property name written back to. 
concurrency 
int 
available CPUs 
yes 
The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'. 
readConcurrency 
int 
value of 'concurrency' 
yes 
The number of concurrent threads used for reading the graph. 
writeConcurrency 
int 
value of 'concurrency' 
yes 
The number of concurrent threads used for writing the result. 
graph 
string 
'huge' 
yes 
Use 'huge' when describing the subset of the graph with label and relationshiptype parameter. Use 'cypher' for describing the subset with cypher node statement and relationship statement. 
Name  Type  Description 

loadMillis 
int 
Milliseconds for loading data. 
computeMillis 
int 
Milliseconds for running the algorithm. 
writeMillis 
int 
Milliseconds for writing result data back. 
postProcessingMillis 
int 
Milliseconds for computing percentiles and community count. 
nodes 
int 
The number of nodes considered. 
communityCount 
int 
The number of communities found. 
p1 
double 
The 1 percentile of community size. 
p5 
double 
The 5 percentile of community size. 
p10 
double 
The 10 percentile of community size. 
p25 
double 
The 25 percentile of community size. 
p50 
double 
The 50 percentile of community size. 
p75 
double 
The 75 percentile of community size. 
p90 
double 
The 90 percentile of community size. 
p95 
double 
The 95 percentile of community size. 
p99 
double 
The 99 percentile of community size. 
p100 
double 
The 100 percentile of community size. 
write 
boolean 
Specifies if the result was written back as a node property. 
writeProperty 
string 
The property name written back to. 
CALL algo.scc.stream(label:String, relationship:String, {concurrency:4})
YIELD nodeId, partition
Name  Type  Default  Optional  Description 

label 
string 
null 
yes 
The label to load from the graph. If null, load all nodes. 
relationship 
string 
null 
yes 
The relationship type to load from the graph. If null, load all relationships. 
concurrency 
int 
available CPUs 
yes 
The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency'. 
readConcurrency 
int 
value of 'concurrency' 
yes 
The number of concurrent threads used for reading the graph. 
graph 
string 
'huge' 
yes 
Use 'huge' when describing the subset of the graph with label and relationshiptype parameter. Use 'cypher' for describing the subset with cypher node statement and relationship statement. 
Name  Type  Description 

nodeId 
int 
Node ID. 
partition 
int 
Partition ID. 
7. Implementations
algo.scc

Iterative adaptation (same as
algo.scc.iterative
).
algo.scc.recursive.tarjan

Original recursive tarjan implementation.
algo.scc.recursive.tunedTarjan

Also a recursive tarjan implementation.
algo.scc.iterative

Iterative adaption of tarjan algorithm.
algo.scc.multistep

Parallel SCC algorithm.
Was this page helpful?