This section describes the Triangle Count or Clustering Coefficient algorithm in the Neo4j Graph Algorithms library.

Triangle counting is a community detection graph algorithm that is used to determine the number of triangles passing through each node in the graph. A triangle is a set of three nodes, where each node has a relationship to all other nodes.

Triangle counting gained popularity in social network analysis, where it is used to detect communities and measure the cohesiveness of those communities. It can also be used to determine the stability of a graph, and is often used as part of the computation of network indices, such as the clustering coefficient.

There are two types of clustering coefficient:

- Local clustering coefficient
- The local clustering coefficient of a node is the likelihood that its neighbours are also connected. The computation of this score involves triangle counting.
- Global clustering coefficient
- The global clustering coefficient is the normalized sum of those local clustering coefficients.

The transitivity coefficient of a graph is sometimes used, which is three times the number of triangles divided by the number of triples in the graph. For more information, see "Finding, Counting and Listing all Triangles in Large Graphs, An Experimental Study".

- Triangle count and clustering coefficient have been shown to be useful as features for classifying a given website as spam, or non-spam, content. This is described in "Efficient Semi-streaming Algorithms for Local Triangle Counting in Massive Graphs".
- Clustering coefficient has been used to investigate the community structure of Facebook’s social graph, where they found dense neighbourhoods of users in an otherwise sparse global graph. Find this study in "The Anatomy of the Facebook Social Graph".
- Clustering coefficient has been proposed to help explore thematic structure of the web, and detect communities of pages with a common topic based on the reciprocal links between them. For more information, see Curvature of co-links uncovers hidden thematic layers in the World Wide Web.

The following will create a sample graph:

```
MERGE (alice:Person{id:"Alice"})
MERGE (michael:Person{id:"Michael"})
MERGE (karin:Person{id:"Karin"})
MERGE (chris:Person{id:"Chris"})
MERGE (will:Person{id:"Will"})
MERGE (mark:Person{id:"Mark"})
MERGE (michael)-[:KNOWS]->(karin)
MERGE (michael)-[:KNOWS]->(chris)
MERGE (will)-[:KNOWS]->(michael)
MERGE (mark)-[:KNOWS]->(michael)
MERGE (mark)-[:KNOWS]->(will)
MERGE (alice)-[:KNOWS]->(michael)
MERGE (will)-[:KNOWS]->(chris)
MERGE (chris)-[:KNOWS]->(karin);
```

The following will return a stream of triples, with `nodeId`

for each triangle:

```
CALL algo.triangle.stream('Person','KNOWS')
YIELD nodeA,nodeB,nodeC
RETURN algo.getNodeById(nodeA).id AS nodeA, algo.getNodeById(nodeB).id AS nodeB, algo.getNodeById(nodeC).id AS nodeC
```

nodeA | nodeB | nodeC |
---|---|---|

Will |
Michael |
Chris |

Will |
Mark |
Michael |

Michael |
Karin |
Chris |

We can see that there are `KNOWS`

triangles containing "Will, Michael, and Chris", "Will, Mark, and Michael", and "Michael, Karin, and Chris".
This means that everybody in the triangle knows each other.

The following will count the number of triangles that a node is member of, and write it back. It will return the total triangle count and average clustering coefficient of the given graph:

```
CALL algo.triangleCount('Person', 'KNOWS',
{concurrency:4, write:true, writeProperty:'triangles',clusteringCoefficientProperty:'coefficient'})
YIELD loadMillis, computeMillis, writeMillis, nodeCount, triangleCount, averageClusteringCoefficient;
```

The following will count the number of triangles that a node is member of, and return a stream with `nodeId`

and `triangleCount`

:

```
CALL algo.triangleCount.stream('Person', 'KNOWS', {concurrency:4})
YIELD nodeId, triangles, coefficient
RETURN algo.getNodeById(nodeId).id AS name, triangles, coefficient
ORDER BY coefficient DESC
```

Name | Triangles | Coefficient |
---|---|---|

Karin |
1 |
1 |

Mark |
1 |
1 |

Chris |
2 |
0.6666666666666666 |

Will |
2 |
0.6666666666666666 |

Michael |
3 |
0.3 |

Alice |
0 |
0 |

We learn that Michael is part of the most triangles, but it’s Karin and Mark who are the best at introducing their friends - all of the people who know them, know each other!

In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes.

We check if this holds true for Yelp’s social network of friends:

```
CALL algo.triangleCount('User', 'FRIEND',
{concurrency:4, write:true, writeProperty:'triangles',clusteringCoefficientProperty:'coefficient'})
YIELD loadMillis, computeMillis, writeMillis, nodeCount, triangleCount, averageClusteringCoefficient;
```

Average clustering coefficient is 0.0523, which is really low for a social network. This indicates that groups of friends are not tightly knit together, but rather sparse. We can assume that users are not on Yelp for finding and creating friends, like Facebook for example, but rather something else, like finding good restaurant recommendations.

Local triangle count and clustering coefficient of nodes can be used as features in finding influencers in social networks.

The following will return a stream of triples with `nodeId`

for each triangle:

```
CALL algo.triangle.stream(label:String, relationship:String, {concurrency:4})
YIELD nodeA, nodeB, nodeC
```

Name | Type | Default | Optional | Description |
---|---|---|---|---|

label |
string |
null |
yes |
The label to load from the graph. If null, load all nodes |

relationship |
string |
null |
yes |
The relationship-type to load from the graph. If null, load all nodes |

concurrency |
int |
available CPUs |
yes |
The number of concurrent threads |

Name | Type | Description |
---|---|---|

nodeA |
int |
The ID of node in the given triangle |

nodeB |
int |
The ID of node in the given triangle |

nodeC |
int |
The ID of node in the given triangle |

The following will count the number of triangles that a node is a member of, and return a stream with `nodeId`

and `triangleCount`

:

```
CALL algo.triangleCount.stream(label:String, relationship:String, {concurrency:4})
YIELD nodeId, triangles
```

Name | Type | Default | Optional | Description |
---|---|---|---|---|

label |
string |
null |
yes |
The label to load from the graph. If null, load all nodes |

relationship |
string |
null |
yes |
The relationship-type to load from the graph. If null, load all relationships |

concurrency |
int |
available CPUs |
yes |
The number of concurrent threads |

Name | Type | Description |
---|---|---|

nodeId |
int |
The ID of node |

triangles |
int |
The number of triangles a node is member of |

The following will count the number of triangles that a node is a member of, and write it back. It will return the total triangle count and average clustering coefficient of the given graph:

```
CALL algo.triangleCount(label:String, relationship:String,
{concurrency:4, write:true, writeProperty:'triangles', clusteringCoefficientProperty:'coefficient'})
YIELD loadMillis, computeMillis, writeMillis, nodeCount, triangleCount, averageClusteringCoefficient
```

Name | Type | Default | Optional | Description |
---|---|---|---|---|

label |
string |
null |
yes |
The label to load from the graph. If null, load all nodes |

relationship |
string |
null |
yes |
The relationship-type to load from the graph. If null, load all relationships |

concurrency |
int |
available CPUs |
yes |
The number of concurrent threads |

write |
boolean |
true |
yes |
Specifies if the result should be written back as a node property |

writeProperty |
string |
'triangles' |
yes |
The property name the number of triangles a node is member of is written to |

clusteringCoefficientProperty |
string |
'coefficient' |
yes |
The property name clustering coefficient of the node is written to |

Name | Type | Description |
---|---|---|

loadMillis |
int |
Milliseconds for loading data |

computeMillis |
int |
Milliseconds for running the algorithm |

writeMillis |
int |
Milliseconds for writing result data back |

postProcessingMillis |
int |
Milliseconds for computing percentiles and community count |

nodeCount |
int |
The number of nodes considered |

triangleCount |
int |
The number of triangles found |

averageClusteringCoefficient |
float |
The average clustering coefficient of the given graph |

p1 |
double |
The 1 percentile of number of triangles. |

p5 |
double |
The 5 percentile of number of triangles. |

p10 |
double |
The 10 percentile of number of triangles. |

p25 |
double |
The 25 percentile of number of triangles. |

p50 |
double |
The 50 percentile of number of triangles. |

p75 |
double |
The 75 percentile of number of triangles. |

p90 |
double |
The 90 percentile of number of triangles. |

p95 |
double |
The 95 percentile of number of triangles. |

p99 |
double |
The 99 percentile of number of triangles. |

p100 |
double |
The 100 percentile of number of triangles. |

write |
boolean |
Specifies if the result was written back as a node property |

writeProperty |
string |
The property name the number of triangles a node is member of is written to |

clusteringCoefficientProperty |
string |
The property name clustering coefficient of the node is written to |

If label and relationship-type are not selective enough to describe your subgraph to run the algorithm on, you can use Cypher statements to load or project subsets of your graph. This can also be used to run algorithms on a virtual graph.

Set `graph:'cypher'`

in the config:

```
CALL algo.triangleCount(
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[:KNOWS]->(p2:Person) RETURN id(p1) as source,id(p2) as target',
{concurrency:4, write:true, writeProperty:'triangle',graph:'cypher', clusteringCoefficientProperty:'coefficient'})
YIELD loadMillis, computeMillis, writeMillis, nodeCount, triangleCount, averageClusteringCoefficient
```

The Triangle Count algorithms support the following graph type:

- ✓ undirected, unweighted