5.7. The Degree Centrality algorithm

This section describes the Degree Centrality algorithm in the Neo4j Graph Algorithms library.

Degree centrality measures the number of incoming and outgoing relationships from a node.

The Degree Centrality algorithm can help us find popular nodes in a graph.

This section includes:

5.7.1. History and explanation

Degree Centrality was proposed by Linton C. Freeman in his 1979 paper Centrality in Social Networks Conceptual Clarification. While the Degree Centrality algorithm can be used to find the popularity of individual nodes, it is often used as part of a global analysis where we calculate the minimum degree, maximum degree, mean degree, and standard deviation across the whole graph.

5.7.2. Use-cases - when to use the Degree Centrality algorithm

5.7.3. Degree Centrality algorithm sample

This sample will explain the Degree Centrality algorithm, using a simple graph:

Create sample graph. 

MERGE (nAlice:User {id:'Alice'})
MERGE (nBridget:User {id:'Bridget'})
MERGE (nCharles:User {id:'Charles'})
MERGE (nDoug:User {id:'Doug'})
MERGE (nMark:User {id:'Mark'})
MERGE (nMichael:User {id:'Michael'})

MERGE (nAlice)-[:FOLLOWS]->(nDoug)
MERGE (nAlice)-[:FOLLOWS]->(nBridget)
MERGE (nAlice)-[:FOLLOWS]->(nCharles)
MERGE (nMark)-[:FOLLOWS]->(nDoug)
MERGE (nMark)-[:FOLLOWS]->(nMichael)
MERGE (nBridget)-[:FOLLOWS]->(nDoug)
MERGE (nCharles)-[:FOLLOWS]->(nDoug)
MERGE (nMichael)-[:FOLLOWS]->(nDoug)

The following will run the algorithm and stream results, showing which users have the most followers: 

CALL algo.degree.stream("User", "FOLLOWS", {direction: "incoming"})
YIELD nodeId, score
RETURN algo.asNode(nodeId).id AS name, score AS followers
ORDER BY followers DESC

The following will run the algorithm and store results, showing which users have the most followers: 

CALL algo.degree("User", "FOLLOWS", {direction: "incoming", writeProperty: "followers"})

Table 5.38. Results
Name Followers

Doug

5.0

Bridget

1.0

Charles

1.0

Michael

1.0

Mark

0.0

Alice

0.0

The following will run the algorithm and stream results, showing which users follow the most other users: 

CALL algo.degree.stream("User", "FOLLOWS", {direction: "outgoing"})
YIELD nodeId, score
RETURN algo.asNode(nodeId).id AS name, score AS following
ORDER BY following DESC

The following will run the algorithm and store results, showing which users follow the most other users: 

CALL algo.degree("User", "FOLLOWS", {direction: "outgoing", writeProperty: "following"})

Table 5.39. Results
Name Following

Alice

3.0

Mark

2.0

Bridget

1.0

Charles

1.0

Michael

1.0

Doug

0.0

We can see that Doug is the most popular user in our imaginary Twitter graph, with 5 followers - all other users follow him, but he doesn’t follow anybody back. In the real Twitter network celebrities have very high follower counts but tend to follow very few back people. We could therefore consider Doug a celebrity!

5.7.4. Weighted Degree Centrality algorithm sample

This sample will explain the weighted Degree Centrality algorithm, using a simple graph:

The following will create a sample graph: 

MERGE (nAlice:User {id:'Alice'})
MERGE (nBridget:User {id:'Bridget'})
MERGE (nCharles:User {id:'Charles'})
MERGE (nDoug:User {id:'Doug'})
MERGE (nMark:User {id:'Mark'})
MERGE (nMichael:User {id:'Michael'})

MERGE (nAlice)-[:FOLLOWS {score: 1}]->(nDoug)
MERGE (nAlice)-[:FOLLOWS {score: 2}]->(nBridget)
MERGE (nAlice)-[:FOLLOWS {score: 5}]->(nCharles)
MERGE (nMark)-[:FOLLOWS {score: 1.5}]->(nDoug)
MERGE (nMark)-[:FOLLOWS {score: 4.5}]->(nMichael)
MERGE (nBridget)-[:FOLLOWS {score: 1.5}]->(nDoug)
MERGE (nCharles)-[:FOLLOWS {score: 2}]->(nDoug)
MERGE (nMichael)-[:FOLLOWS {score: 1.5}]->(nDoug)

This algorithm is a variant of the Degree Centrality algorithm, that measures the sum of the weights of incoming and outgoing relationships.

The following will run the algorithm and stream results, showing which users have the most weighted followers: 

CALL algo.degree.stream("User", "FOLLOWS", {direction: "incoming", weightProperty: "score"})
YIELD nodeId, score
RETURN algo.asNode(nodeId).id AS name, score AS weightedFollowers
ORDER BY followers DESC

The following will run the algorithm and store results, showing which users have the most weighted followers: 

CALL algo.degree("User", "FOLLOWS",
  {direction: "incoming", writeProperty: "weightedFollowers", weightProperty: "score"})

Table 5.40. Results
Name weightedFollowers

Doug

7.5

Charles

5.0

Michael

4.5

Bridget

2.0

Alice

0.0

Mark

0.0

Doug still remains our most popular user, but there isn’t such a big gap to the next person. Charles and Michael both only have one follower, but those relationships have a high relationship weight.

5.7.5. Huge graph projection

The default label and relationship-type projection has a limitation of 2 billion nodes and 2 billion relationships. Therefore, if our projected graph contains more than 2 billion nodes or relationships, we will need to use huge graph projection.

Set graph:'huge' in the config: 

CALL algo.degree('User','FOLLOWS', {graph:'huge'});

5.7.6. Cypher projection

If label and relationship-type are not selective enough to describe your subgraph to run the algorithm on, you can use Cypher statements to load or project subsets of your graph. This can also be used to run algorithms on a virtual graph. You can learn more in the Section 2.2, “Cypher projection” section of the manual.

Set graph:'cypher' in the config: 

CALL algo.degree(
  'MATCH (u:User) RETURN id(u) as id',
  'MATCH (u1:User)<-[:FOLLOWS]-(u2:User) RETURN id(u1) as source, id(u2) as target',
  {graph:'cypher', write: true, writeProperty: "followers"}
)

Note that the direction config parameter is ignored when using a Cypher projection for Degree Centrality. If we want to find the number of users that a user is following rather than their number of followers, we need to handle that in our Cypher query.

The following will run the algorithm and store the results, calculating the number of users that a user follows: 

CALL algo.degree(
  'MATCH (u:User) RETURN id(u) as id',
  'MATCH (u1:User)-[:FOLLOWS]->(u2:User) RETURN id(u1) as source, id(u2) as target',
  {graph:'cypher', write: true, writeProperty: "following"}
)

5.7.7. Syntax

The following will run the algorithm and write back results: 

CALL algo.degree(label:String, relationship:String,
    {write: true, writeProperty:'degree', concurrency:4})
YIELD nodes, loadMillis, computeMillis, writeMillis, write, writeProperty

Table 5.41. Parameters
Name Type Default Optional Description

label

string

null

yes

The label to load from the graph. If null, load all nodes.

relationship

string

null

yes

The relationship-type to load from the graph. If null, load all relationships.

direction

string

incoming

yes

The relationship direction to load from the graph. If 'both', treats the relationships as undirected.

iterations

int

20

yes

How many iterations of PageRank to run.

concurrency

int

available CPUs

yes

The number of concurrent threads.

dampingFactor

float

0.85

yes

The damping factor of the PageRank calculation.

weightProperty

string

null

yes

The property name that contains weight. If null, treats the graph as unweighted. Must be numeric.

defaultValue

float

0.0

yes

The default value of the weight in case it is missing or invalid.

write

boolean

true

yes

Specify if the result should be written back as a node property.

graph

string

'heavy'

yes

Use 'heavy' when describing the subset of the graph with label and relationship-type parameter. Use 'cypher' for describing the subset with cypher node-statement and relationship-statement.

Table 5.42. Results
Name Type Description

nodes

int

The number of nodes considered.

writeProperty

string

The property name written back to.

write

boolean

Specifies if the result was written back as node property.

loadMillis

int

Milliseconds for loading data.

computeMillis

int

Milliseconds for running the algorithm.

writeMillis

int

Milliseconds for writing result data back.

The following will run the algorithm and stream results: 

CALL algo.degree.stream(label:String, relationship:String,
    {concurrency:4})
YIELD node, score

Table 5.43. Parameters
Name Type Default Optional Description

label

string

null

yes

The label to load from the graph. If null, load all nodes.

relationship

string

null

yes

The relationship-type to load from the graph. If null, load all nodes.

direction

string

incoming

yes

The relationship direction to load from the graph. If 'both', treats the relationships as undirected.

concurrency

int

available CPUs

yes

The number of concurrent threads.

weightProperty

string

null

yes

The property name that contains weight. If null, treats the graph as unweighted. Must be numeric.

defaultValue

float

0.0

yes

The default value of the weight in case it is missing or invalid.

graph

string

'heavy'

yes

Use 'heavy' when describing the subset of the graph with label and relationship-type parameter. Use 'cypher' for describing the subset with cypher node-statement and relationship-statement.

Table 5.44. Results
Name Type Description

nodeId

long

Node ID

score

float

Degree Centrality score