Graph Algorithms in Neo4j: The Neo4j Graph Algorithms Library


The Neo4j Graph Algorithms Library is used on your connected data to gain new insights more easily within Neo4j. These graph algorithms improve results from your graph data, for example by focusing on particular communities or favoring popular entities.

This blog series is designed to help you better leverage graph analytics so you can effectively innovate and develop intelligent solutions faster.

Last week we took a look at various graph algorithm concepts, including two fundamental graph traversal algorithms: breadth-first search (BFS) and depth- first search (DFS).

This week we will take an in-depth look at the Neo4j Graph Algorithms Library. We developed this library as part of our effort to make it easier to use Neo4j for a wider variety of applications.

Check out the Neo4j graph algorithms library and learn how they are used.

The Neo4j Graph Algorithms Library: Why and How


We developed the graph algorithms library as part of our effort to make it easier to use the Neo4j graph database for a wider variety of applications. These algorithms have been tuned to be as efficient as possible in regards to resource utilization as well as streamlined for management and debugging.

Note: If you want to try out the examples in the rest of the blog series, you’ll need to first install the graph algorithms library. For detailed instructions, please download the free eBook at the end of this blog and follow the directions in Appendix B.

Neo4j graph algorithms are available as user-defined procedures called as part of Cypher statements running on top of Neo4j. Here is an architecture diagram.

Learn how Cypher and graph algorithms work together.

Usage


These algorithms are exposed as Neo4j procedures. They are called directly using Cypher in your Neo4j Browser, from Cypher Shell or from your client code. For most algorithms, there are two procedures:

    • algo. – This procedure writes results back to the graph as node-properties, and reports statistics.
    • algo..stream – This procedure returns a stream of data. For example, nodeids and computed values.
For large graphs, the streaming procedure might return millions, or even billions, of results. In this case, it may be more convenient to store the results of the algorithm, and then use them with later queries.

This is one of the use cases for a handy feature called graph projection.

Graph projection places a logical subgraph into a graph algorithm when your original graph has the wrong shape or granularity for that specific algorithm. For example, if you’re looking to understand the relationship between drug results for men versus women but your graph is not partitioned for this, you’ll be able to temporarily project a subgraph to quickly run your algorithm upon and move on to the next step.

We project the graph we want to run algorithms on with either label and relationship-type projection, or Cypher projection (shown below).

Learn about Cypher and graph algorithms in Neo4j.

The projected graph model is separate from Neo4j’s stored graph model to enable fast caching for the topology of the graph, containing only relevant nodes, relationships and weights. During projection of a directed subgraph, only one relationship directed in and one relationship directed out is allowed between a pair of nodes.

During the projection of an undirected subgraph, two relationships between a pair of nodes is allowed (there is no direction).

Label & Relationship-Type Projection


We project the subgraph we want to run the algorithm on by using the label parameter to describe nodes, and relationship-type to describe relationships.

The general call syntax is:

CALL algo.("NodeLabel", "RelationshipType", {config})

Cypher Projection


If label and relationship-type projection is not selective enough to describe our subgraph to run the algorithm on, we use Cypher statements to project subsets of our graph. Use a node-statement instead of the label parameter and a relationship-statement instead of the relationship-type, and use graph:"cypher" in the config.

Relationships described in the relationship-statement will only be projected if both source and target nodes are described in the node-statement. Relationships that don’t have both source and target nodes described in the node-statement will be ignored.

We also return a property value or weight (according to our config) in addition to the ids from these statements.

Cypher projection enables us to be more expressive in describing the subgraph that we want to analyze, but it might take longer to project the graph with more complex Cypher queries.

The general call syntax is:

CALL algo.(
"MATCH (n) RETURN id(n) AS id",
"MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target",
{graph: "cypher"})


Huge Graph Projection


The default label and relationship-type projection has a limitation of two billion nodes and two billion relationships, so if our projected graph is bigger than this, we need to use a huge graph projection. This is enabled by setting graph:"huge" in the config.

The general call syntax is:

CALL algo.("NodeLabel", "RelationshipType", {graph: "huge"})

Algorithm Types


For transactions and operational decisions, you need real-time graph analysis to provide a local view of relationships between specific data items and take action. To make discoveries about the overall nature of networks and model the behavior of complex systems, you need graph algorithms that provide a broader view of patterns and structures across all data and relationships.

The following table is helpful for working out the appropriate algorithm for your use case.

Check out this algorithm tables for use cases.

Conclusion


As we’ve shown, the Neo4j Graph Algorithms library is used on your connected data to gain new insights more easily within Neo4j. These graph algorithms improve results from your graph data, for example by focusing on particular communities or favoring popular entities.

Next week we will take an in-depth look at pathfinding and graph search algorithms, such as Shortest Path, Single Source Shortest Path, and All Pairs Shortest Path.


Find the patterns in your connected data
Learn about the power of graph algorithms in the O’Reilly book,
Graph Algorithms: Practical Examples in Apache Spark and Neo4j by the authors of this article. Click below to get your free ebook copy.


Get the O’Reilly Ebook