6.1. The Minimum Weight Spanning Tree algorithm

This section describes the Minimum Weight Spanning Tree algorithm in the Neo4j Graph Algorithms library.

The Minimum Weight Spanning Tree (MST) starts from a given node, and finds all its reachable nodes and the set of relationships that connect the nodes together with the minimum possible weight. Prim’s algorithm is one of the simplest and best-known minimum spanning tree algorithms. The K-Means variant of this algorithm can be used to detect clusters in the graph.

6.1.1. History and explanation

The first known algorithm for finding a minimum spanning tree was developed by the Czech scientist Otakar Borůvka in 1926, while trying to find an efficient electricity network for Moravia. Prim’s algorithm was invented by Jarnik in 1930 and rediscovered by Prim in 1957. It is similar to Dijkstra’s shortest path algorithm but, rather than minimizing the total length of a path ending at each relationship, it minimizes the length of each relationship individually. Unlike Dijkstra’s, Prim’s can tolerate negative-weight relationships.

The algorithm operates as follows:

  • Start with a tree containing only one node (and no relationships).
  • Select the minimal-weight relationship coming from that node, and add it to our tree.
  • Repeatedly choose a minimal-weight relationship that joins any node in the tree to one that is not in the tree, adding the new relationship and node to our tree.
  • When there are no more nodes to add, the tree we have built is a minimum spanning tree.

6.1.2. Use-cases - when to use the Minimum Weight Spanning Tree algorithm

6.1.3. Constraints - when not to use the Minimum Weight Spanning Tree algorithm

The MST algorithm only gives meaningful results when run on a graph, where the relationships have different weights. If the graph has no weights, or all relationships have the same weight, then any spanning tree is a minimum spanning tree.

6.1.4. Minimum Weight Spanning Tree algorithm sample

mst

The following will create a sample graph: 

MERGE (a:Place {id:"A"})
MERGE (b:Place {id:"B"})
MERGE (c:Place {id:"C"})
MERGE (d:Place {id:"D"})
MERGE (e:Place {id:"E"})
MERGE (f:Place {id:"F"})
MERGE (g:Place {id:"G"})

MERGE (d)-[:LINK {cost:4}]->(b)
MERGE (d)-[:LINK {cost:6}]->(e)
MERGE (b)-[:LINK {cost:1}]->(a)
MERGE (b)-[:LINK {cost:3}]->(c)
MERGE (a)-[:LINK {cost:2}]->(c)
MERGE (c)-[:LINK {cost:5}]->(e)
MERGE (f)-[:LINK {cost:1}]->(g);

Minimum weight spanning tree visits all nodes that are in the same connected component as the starting node, and returns a spanning tree of all nodes in the component where the total weight of the relationships is minimized.

The following will run the Minimum Weight Spanning Tree algorithm and write back results: 

MATCH (n:Place {id:"D"})
CALL algo.spanningTree.minimum('Place', 'LINK', 'cost', id(n),
  {write:true, writeProperty:"MINST"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis, computeMillis, writeMillis, effectiveNodeCount;

The following will query minimum spanning tree: 

MATCH path = (n:Place {id:"D"})-[:MINST*]-()
WITH relationships(path) AS rels
UNWIND rels AS rel
WITH DISTINCT rel AS rel
RETURN startNode(rel).id AS source, endNode(rel).id AS destination, rel.cost AS cost

Figure 6.1. Results
minst result

To find all pairs of nodes included in our minimum spanning tree, run the following query:

Table 6.1. Results
Source Destination Cost

D

B

4

B

A

1

A

C

2

C

E

5

The minimum spanning tree excludes the relationship with cost 6 from D to E, and the one with cost 3 from B to C. Nodes F and G aren’t included because they’re unreachable from D.

Maximum weighted tree spanning algorithm is similar to the minimum one, except that it returns a spanning tree of all nodes in the component where the total weight of the relationships is maximized.

The following will run the maximum weight spanning tree algorithm and write back results: 

MATCH (n:Place{id:"D"})
CALL algo.spanningTree.maximum('Place', 'LINK', 'cost', id(n),
  {write:true, writeProperty:"MAXST"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis,computeMillis, writeMillis, effectiveNodeCount;

maxst result

6.1.4.1. K-Spanning tree

Sometimes we want to limit the size of our spanning tree result, as we are only interested in finding a smaller tree within our graph that does not span across all nodes. K-Spanning tree algorithm returns a tree with k nodes and k − 1 relationships.

In our sample graph we have 5 nodes. When we ran MST above, we got a 5-minimum spanning tree returned, that covered all five nodes. By setting the k=3, we define that we want to get returned a 3-minimum spanning tree that covers 3 nodes and has 2 relationships.

The following will run the k-minimum spanning tree algorithm and write back results: 

MATCH (n:Place{id:"D"})
CALL algo.spanningTree.kmin('Place', 'LINK', 'cost',id(n), 3,
  {writeProperty:"kminst"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis,computeMillis,writeMillis, effectiveNodeCount;

Table 6.2. Results
Place Partition

A

1

B

1

C

1

D

3

E

4

Nodes A, B, and C are the result 3-minimum spanning tree of our graph.

The following will run the k-maximum spanning tree algorithm and write back results: 

MATCH (n:Place{id:"D"})
CALL algo.spanningTree.kmax('Place', 'LINK', 'cost', id(n), 3,
  {writeProperty:"kmaxst"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis,computeMillis,writeMillis, effectiveNodeCount;

Table 6.3. Results
Place Partition

A

0

B

1

C

3

D

3

E

3

Nodes C, D, and E are the result 3-maximum spanning tree of our graph.

When we run this algorithm on a bigger graph, we can use the following query to find nodes that belong to our k-spanning tree result:

Find nodes that belong to our k-spanning tree result: 

MATCH (n:Place)
WITH n.partition AS partition, count(*) as count
WHERE count = k
RETURN n

6.1.5. Syntax

The following will run the algorithm and write back results: 

CALL algo.spanningTree(label:String, relationshipType:String, weightProperty:String, startNodeId:int, {writeProperty:String})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount

Table 6.4. Parameters
Name Type Default Optional Description

label

String

null

no

The label to load from the graph. If null, load all nodes

relationshipType

String

null

no

The relationship-type to load from the graph. If null, load all nodes

weightProperty

string

null

no

The property name that contains weight. Must be numeric.

startNodeId

long

null

no

The start node ID

write

boolean

true

yes

Specify if the result should be written back as relationships

writeProperty

string

'mst'

yes

The relationship-type written back as result

Table 6.5. Results
Name Type Description

effectiveNodeCount

int

The number of visited nodes

loadMillis

int

Milliseconds for loading data

computeMillis

int

Milliseconds for running the algorithm

writeMillis

int

Milliseconds for writing result data back

The following will run the k-spanning tree algorithm and write back results: 

CALL algo.spanningTree.k*(label:String, relationshipType:String, weightProperty:String, startNodeId:int, k:int, {writeProperty:String})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount

Table 6.6. Parameters
Name Type Default Optional Description

label

String

null

no

The label to load from the graph. If null, load all nodes

relationshipType

String

null

no

The relationship type

weightProperty

string

null

no

The property name that contains weight. Must be numeric.

startNodeId

int

null

no

The start node ID

k

int

null

no

The result is a tree with k nodes and k − 1 relationships

write

boolean

true

yes

Specifies if the result should be written back as a node property

writeProperty

string

'mst'

yes

The relationship-type written back as result

Table 6.7. Results
Name Type Description

effectiveNodeCount

int

The number of visited nodes

loadMillis

int

Milliseconds for loading data

computeMillis

int

Milliseconds for running the algorithm

writeMillis

int

Milliseconds for writing result data back

6.1.6. Graph type support

The Minimum Weight Spanning Tree algorithm supports the following graph type:

  • ✓ undirected, weighted