6.2.3. ArticleRank

This section describes the ArticleRank algorithm in the Neo4j Graph Data Science library.

ArticleRank is a variant of the PageRank algorithm, which measures the transitive influence or connectivity of nodes.

This algorithm is in the alpha tier. For more information on algorithm tiers, see Chapter 6, Algorithms.

This section includes:

  • History and explanation
  • ArticleRank algorithm sample
  • Syntax
  • Graph type support

6.2.3.1. History and explanation

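ArticleRank differs from PageRank in one respect: PageRank assumes that relationships from nodes with a low out-degree are more important than relationships from nodes with a higher out-degree. ArticleRank weakens this assumption by dividing each node's score by its out-degree plus the average out-degree of the graph, rather than by its out-degree alone.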

ArticleRank is defined in "ArticleRank: a PageRank‐based alternative to numbers of citations for analysing citation networks" as follows:

AR(A) = (1 - d) + d * (AR(T1)/(C(T1) + C(AVG)) + ... + AR(Tn)/(C(Tn) + C(AVG)))

where:

  • A is a page that pages T1 to Tn point to (that is, T1 to Tn cite A).
  • AR(A) is the ArticleRank score of page A.
  • d is a damping factor that can be set between 0 and 1. It is usually set to 0.85.
  • C(A) is the number of links going out of page A.
  • C(AVG) is the average number of links going out of all pages.
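
To see how adding C(AVG) weakens the low out-degree assumption, the following Python sketch compares the share of score a citing page passes along one link under the classic PageRank term AR(T)/C(T) and under the ArticleRank term AR(T)/(C(T) + C(AVG)). The function names and the example numbers (out-degrees of 1 and 10, an average out-degree of 5) are illustrative assumptions and are not part of the library.

```python
def pagerank_share(score, out_degree):
    """Score a citing page passes along one link under PageRank: score / C(T)."""
    return score / out_degree


def articlerank_share(score, out_degree, avg_out_degree):
    """Score passed along one link under ArticleRank: score / (C(T) + C(AVG))."""
    return score / (out_degree + avg_out_degree)


# Two hypothetical citing pages with equal score but different out-degrees.
score = 1.0
avg_out_degree = 5.0  # assumed C(AVG) for this illustration

# How much more a link from the out-degree-1 page counts than one
# from the out-degree-10 page, under each formula.
pr_ratio = pagerank_share(score, 1) / pagerank_share(score, 10)
ar_ratio = (articlerank_share(score, 1, avg_out_degree)
            / articlerank_share(score, 10, avg_out_degree))

print(f"PageRank:    low out-degree link counts {pr_ratio:.1f}x as much")
print(f"ArticleRank: low out-degree link counts {ar_ratio:.1f}x as much")
```

With these numbers, the advantage of the link from the low out-degree page drops from a factor of 10 under PageRank to a factor of 2.5 under ArticleRank.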

6.2.3.2. ArticleRank algorithm sample

This sample explains the ArticleRank algorithm using a simple graph:

[Figure: the ArticleRank sample graph]

The following will create a sample graph: 

MERGE (paper0:Paper {name:'Paper 0'})
MERGE (paper1:Paper {name:'Paper 1'})
MERGE (paper2:Paper {name:'Paper 2'})
MERGE (paper3:Paper {name:'Paper 3'})
MERGE (paper4:Paper {name:'Paper 4'})
MERGE (paper5:Paper {name:'Paper 5'})
MERGE (paper6:Paper {name:'Paper 6'})

MERGE (paper1)-[:CITES]->(paper0)

MERGE (paper2)-[:CITES]->(paper0)
MERGE (paper2)-[:CITES]->(paper1)

MERGE (paper3)-[:CITES]->(paper0)
MERGE (paper3)-[:CITES]->(paper1)
MERGE (paper3)-[:CITES]->(paper2)

MERGE (paper4)-[:CITES]->(paper0)
MERGE (paper4)-[:CITES]->(paper1)
MERGE (paper4)-[:CITES]->(paper2)
MERGE (paper4)-[:CITES]->(paper3)

MERGE (paper5)-[:CITES]->(paper1)
MERGE (paper5)-[:CITES]->(paper4)

MERGE (paper6)-[:CITES]->(paper1)
MERGE (paper6)-[:CITES]->(paper4)
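
The ArticleRank formula needs each paper's out-degree C and the average out-degree C(AVG). As a quick check outside the database, the following Python sketch tallies both from an edge list that mirrors the CITES relationships above. The variable names are illustrative assumptions.

```python
# (citing paper, cited paper) pairs, mirroring the CITES relationships above.
cites = [
    (1, 0),
    (2, 0), (2, 1),
    (3, 0), (3, 1), (3, 2),
    (4, 0), (4, 1), (4, 2), (4, 3),
    (5, 1), (5, 4),
    (6, 1), (6, 4),
]
papers = range(7)

# C(paper): number of outgoing CITES relationships.
out_degree = {p: 0 for p in papers}
# Citation count: number of incoming CITES relationships.
in_degree = {p: 0 for p in papers}
for source, target in cites:
    out_degree[source] += 1
    in_degree[target] += 1

# C(AVG): average out-degree over all papers.
avg_out_degree = sum(out_degree.values()) / len(out_degree)

for p in papers:
    print(f"Paper {p}: cites {out_degree[p]}, cited by {in_degree[p]}")
print(f"C(AVG) = {avg_out_degree}")
```

For this graph, 14 relationships over 7 papers give C(AVG) = 2.0, and Paper 1 is the most cited paper with 5 incoming relationships.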

The following will run the algorithm and stream results: 

CALL gds.alpha.articleRank.stream({
  nodeProjection: 'Paper',
  relationshipProjection: 'CITES',
  iterations: 20,
  dampingFactor: 0.85
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS page, score
ORDER BY score DESC

The following will run the algorithm and write back results: 

CALL gds.alpha.articleRank.write({
  nodeProjection: 'Paper',
  relationshipProjection: 'CITES',
  iterations: 20,
  dampingFactor: 0.85,
  writeProperty: "pagerank"
})
YIELD nodes, iterations, createMillis, computeMillis, writeMillis, dampingFactor, writeProperty

Table 6.54. Results
Name     ArticleRank
Paper 0  0.34616300000000005
Paper 1  0.319422
Paper 4  0.213733
Paper 2  0.21089400000000003
Paper 3  0.18026850000000003
Paper 5  0.15000000000000002
Paper 6  0.15000000000000002

Paper 0 is the most important paper, but it is only the second most cited paper: Paper 1 has more citations. However, Paper 1 cites Paper 0, which shows that it is not only the number of incoming links that matters, but also the importance of the papers behind those links. Papers 5 and 6 are not cited by any other papers, so their score does not increase above the initial score of 1 - dampingFactor.
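
The ranking can be reproduced approximately by iterating the formula from the History and explanation section directly. The Python sketch below is a plain reading of that formula, not the library's implementation. The `article_rank` function, the edge list, and the choice to start every score at 1 - d are assumptions made for illustration, so small differences from Table 6.54 are expected.

```python
def article_rank(edges, nodes, damping=0.85, iterations=20):
    """Iterate AR(A) = (1 - d) + d * sum(AR(T) / (C(T) + C(AVG)))."""
    out_degree = {n: 0 for n in nodes}
    citers = {n: [] for n in nodes}  # nodes that point at each node
    for source, target in edges:
        out_degree[source] += 1
        citers[target].append(source)
    avg_out = sum(out_degree.values()) / len(out_degree)

    # Assumed starting point: every node begins at 1 - d.
    scores = {n: 1.0 - damping for n in nodes}
    for _ in range(iterations):
        # Each pass recomputes all scores from the previous pass.
        scores = {
            n: (1.0 - damping) + damping * sum(
                scores[t] / (out_degree[t] + avg_out) for t in citers[n]
            )
            for n in nodes
        }
    return scores


# (citing paper, cited paper) pairs, mirroring the sample graph.
cites = [
    (1, 0),
    (2, 0), (2, 1),
    (3, 0), (3, 1), (3, 2),
    (4, 0), (4, 1), (4, 2), (4, 3),
    (5, 1), (5, 4),
    (6, 1), (6, 4),
]

scores = article_rank(cites, nodes=range(7))
ranking = sorted(scores, key=scores.get, reverse=True)
for paper in ranking:
    print(f"Paper {paper}: {scores[paper]:.4f}")
```

Under these assumptions the scores land within about 0.001 of Table 6.54 and give the same ordering: Paper 0 ranks above Paper 1 although Paper 1 is cited more often, and Papers 5 and 6 stay at 1 - dampingFactor.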

6.2.3.3. Syntax

The following will run the algorithm and write back results: 

CALL gds.alpha.articleRank.write(graphNameOrConfig: String|Map, configuration: Map)
YIELD nodes, iterations, createMillis, computeMillis, writeMillis, dampingFactor, writeProperty

Table 6.55. Parameters
Name              Type    Default                 Optional  Description
label             string  null                    yes       The label to load from the graph. If null, load all nodes.
relationship      string  null                    yes       The relationship type to load from the graph. If null, load all relationships.
iterations        int     20                      yes       The number of ArticleRank iterations to run.
concurrency       int     4                       yes       The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'.
readConcurrency   int     value of 'concurrency'  yes       The number of concurrent threads used for reading the graph.
writeConcurrency  int     value of 'concurrency'  yes       The number of concurrent threads used for writing the result.
dampingFactor     float   0.85                    yes       The damping factor of the ArticleRank calculation.
graph             string  'huge'                  yes       Use 'huge' when describing the subset of the graph with the label and relationship-type parameters. Use 'cypher' when describing the subset with a Cypher node statement and relationship statement.

Table 6.56. Results
Name           Type    Description
nodes          int     The number of nodes considered.
iterations     int     The number of iterations run.
dampingFactor  float   The damping factor used.
writeProperty  string  The property name written back to.
createMillis   int     Milliseconds for loading data.
computeMillis  int     Milliseconds for running the algorithm.
writeMillis    int     Milliseconds for writing result data back.

The following will run the algorithm and stream results: 

CALL gds.alpha.articleRank.stream(graphNameOrConfig: String|Map, configuration: Map)
YIELD nodeId, score

Table 6.57. Parameters
Name              Type    Default                 Optional  Description
label             string  null                    yes       The label to load from the graph. If null, load all nodes.
relationship      string  null                    yes       The relationship type to load from the graph. If null, load all relationships.
iterations        int     20                      yes       The number of ArticleRank iterations to run.
concurrency       int     4                       yes       The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency'.
readConcurrency   int     value of 'concurrency'  yes       The number of concurrent threads used for reading the graph.
writeConcurrency  int     value of 'concurrency'  yes       The number of concurrent threads used for writing the result.
dampingFactor     float   0.85                    yes       The damping factor of the ArticleRank calculation.
graph             string  'huge'                  yes       Use 'huge' when describing the subset of the graph with the label and relationship-type parameters. Use 'cypher' when describing the subset with a Cypher node statement and relationship statement.

Table 6.58. Results
Name    Type   Description
nodeId  long   Node ID
score   float  ArticleRank score

6.2.3.4. Graph type support

The ArticleRank algorithm supports the following graph types:

  • ✓ directed, unweighted
  • ✓ undirected, unweighted