7.11. Calculating the clustering coefficient of a network
In this example, adapted from Niko Gamulin's blog post on Neo4j for Social Network Analysis, the graph shows the 2-hop relationships of a sample person as nodes connected by KNOWS relationships.
The clustering coefficient of a selected node is defined as the probability that two randomly selected neighbors are connected to each other.
With n as the number of neighbors and r as the number of mutual connections between those neighbors, the calculation is as follows.
The number of possible connections between two neighbors is n!/(2!(n-2)!) = 4!/(2!(4-2)!) = 24/4 = 6, where n = 4 is the number of neighbors, and the actual number of connections is r = 1.
Therefore the clustering coefficient of node 1 is 1/6.
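The binomial expression above simplifies, which gives a closed form that is easier to evaluate by hand. The symbol C for the clustering coefficient is introduced here only for illustration and does not appear in the original text:

```latex
\[
C \;=\; \frac{r}{\binom{n}{2}}
  \;=\; \frac{r}{\dfrac{n!}{2!\,(n-2)!}}
  \;=\; \frac{2r}{n(n-1)},
\qquad
C\Big|_{n=4,\; r=1} \;=\; \frac{2 \cdot 1}{4 \cdot 3} \;=\; \frac{1}{6}.
\]
```

The form 2r/(n(n-1)) is only defined for n of at least 2, since a node with fewer than two neighbors has no neighbor pairs to connect.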
n and r are quite simple to retrieve via the following query:
Query
MATCH (a { name: "startnode" })--(b)
WITH a, count(DISTINCT b) AS n
MATCH (a)--()-[r]-()--(a)
RETURN n, count(DISTINCT r) AS r
This returns n and r for the above calculations.
Result
n | r
4 | 1
1 row
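As a possible extension, the coefficient itself could be computed in the same query rather than by hand. The following is a minimal sketch, assuming the same sample graph. The column name coefficient, the use of OPTIONAL MATCH so that a start node without connected neighbors still yields a row, and the choice to report 0.0 for fewer than two neighbors are all assumptions made for illustration, not part of the original example:

```cypher
// Count the distinct neighbors of the start node.
MATCH (a { name: "startnode" })--(b)
WITH a, count(DISTINCT b) AS n
// Count the distinct relationships that connect two neighbors of a.
// OPTIONAL MATCH keeps the row when no such relationship exists (r = 0).
OPTIONAL MATCH (a)--()-[r]-()--(a)
WITH n, count(DISTINCT r) AS r
// r divided by the number of neighbor pairs n(n-1)/2.
// The 2.0 literal forces floating-point division.
RETURN n, r,
       CASE WHEN n < 2 THEN 0.0
            ELSE 2.0 * r / (n * (n - 1))
       END AS coefficient
```

For the sample graph, with n = 4 and r = 1, the expression evaluates 2.0 * 1 / (4 * 3), which matches the value 1/6 derived above.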


The sample graph for this query can be created with:
CREATE (_0 { name: "startnode" })
CREATE (_1)
CREATE (_2)
CREATE (_3)
CREATE (_4)
CREATE (_5)
CREATE (_6)
CREATE (_0)-[:KNOWS]->(_4)
CREATE (_0)-[:KNOWS]->(_3)
CREATE (_0)-[:KNOWS]->(_2)
CREATE (_0)-[:KNOWS]->(_1)
CREATE (_1)-[:KNOWS]->(_6)
CREATE (_1)-[:KNOWS]->(_5)
CREATE (_2)-[:KNOWS]->(_3)