Graph Algorithms in Neo4j: Minimum Weight Spanning Tree


Graph algorithms provide the means to understand, model and predict complicated dynamics such as the flow of resources or information, the pathways through which contagions or network failures spread, and the influences on and resiliency of groups.

This blog series is designed to help you better utilize graph analytics and graph algorithms so you can effectively innovate and develop intelligent solutions faster using a graph database.

Last week we continued our look at pathfinding and graph search algorithms, with the All Pairs Shortest Path algorithm, which calculates a shortest path forest (group) containing all shortest paths between all nodes in the graph.

Learn more about graph algorithms in Neo4.


This week we finish our look at pathfinding and graph search algorithms, with a focus on the Minimum Weight Spanning Tree algorithm, which calculates the paths along a connected tree structure with the smallest value (weight of the relationship such as cost, time or capacity) associated with visiting all nodes in the tree. We’ll show an example using Neo4j and provide links to other examples and use cases.

About Minimum Weight Spanning Tree


The Minimum Weight Spanning Tree starts from a given node, and finds all its reachable nodes and the set of relationships that connect the nodes together with the minimum possible weight. Prim’s algorithm is one of the simplest and best-known Minimum Weight Spanning Tree algorithms. The K-Means variant of this algorithm can be used to detect clusters in the graph.

The first known algorithm for finding a Minimum Weight Spanning Tree was developed by the Czech scientist Otakar Borůvka in 1926 while trying to design an efficient electricity network for Moravia. Prim’s algorithm was invented by Jarnik in 1930 and rediscovered by Prim in 1957. It is similar to Dijkstra’s Shortest Path algorithm, but rather than minimizing the total length of a path ending at each relationship, it minimizes the length of each relationship individually. Unlike Dijkstra’s, Prim’s tolerates negative-weight relationships.

NOTE: The algorithm operates as follows:
    • Start with a tree containing only one node (and no relationships).
    • Select the minimal-weight relationship coming from that node and add it to our tree.
    • Repeatedly choose a minimal-weight relationship that joins any node in the tree to one that is not in the tree, adding the new relationship and node to our tree.
    • When there are no more nodes to add, the tree we have built is a minimum spanning tree.

When Should I Use Minimum Weight Spanning Tree?


TIP: The Minimum Weight Spanning Tree algorithm only gives meaningful results when run on a graph where the relationships have different weights. If the graph has no weights, or all relationships have the same weight, then any spanning tree is a minimum spanning tree.

Minimum Weight Spanning Tree Example


Let’s see the Minimum Weight Spanning Tree algorithm in action. The following Cypher statement creates a graph containing places and links between them.

MERGE (a:Place {id:"A"})
MERGE (b:Place {id:"B"})
MERGE (c:Place {id:"C"})
MERGE (d:Place {id:"D"})
MERGE (e:Place {id:"E"})
MERGE (f:Place {id:"F"})
MERGE (g:Place {id:"G"})
MERGE (d)-[:LINK {cost:4}]->(b)
MERGE (d)-[:LINK {cost:6}]->(e)
MERGE (b)-[:LINK {cost:1}]->(a)
MERGE (b)-[:LINK {cost:3}]->(c)
MERGE (a)-[:LINK {cost:2}]->(c)
MERGE (c)-[:LINK {cost:5}]->(e)
MERGE (f)-[:LINK {cost:1}]->(g);


Graph depiction of Minimum Weight Spanning Tree.


We run the algorithm to find the Minimum Weight Spanning Tree starting from D by executing the following query.

MATCH (n:Place {id:"D"})
CALL algo.spanningTree.minimum("Place", "LINK", "cost", id(n),
{write:true, writeProperty:"MINST"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis, computeMillis, writeMillis, effectiveNodeCount;
This procedure creates MINST relationships representing the minimum spanning tree. We then run the following query to find all
pairs of nodes and the associated cost of the relationships between them.
MATCH path = (n:Place {id:"D"})-[:MINST*]-()
WITH relationships(path) AS rels
UNWIND rels AS rel
WITH DISTINCT rel AS rel
RETURN startNode(rel).id AS source, endNode(rel).id AS destination, rel.cost AS cost


Minimum Weight Spanning Tree


The Minimum Weight Spanning Tree excludes the relationship with cost 6 from D to E, and the one with cost 3 from B to C. Nodes F and G aren’t included because they’re unreachable from D. There are also variations of the algorithm that find the Maximum Weight Spanning Tree or K-Spanning Tree.

Conclusion


As we’ve shown, Minimum Weight Spanning Tree is widely used for network designs, and also has real-time applications with rolling optimizations such as processes in a chemical refinery or driving route corrections.

Next week we’ll begin our look at Centrality algorithms, with a focus on PageRank, which measures the transitive influence or connectivity of nodes.


Find the patterns in your connected data
Learn about the power of graph algorithms in the O’Reilly book,
Graph Algorithms: Practical Examples in Apache Spark and Neo4j by the authors of this article. Click below to get your free ebook copy.


Get the O’Reilly Ebook