This chapter provides a brief introduction of the main concepts in the Neo4j Graph Data Science library.
This library provides efficiently implemented, parallel versions of common graph algorithms for Neo4j, exposed as Cypher procedures.
Graph algorithms are used to compute metrics for graphs, nodes, or relationships.
They can provide insights on relevant entities in the graph (centralities, ranking), or inherent structures like communities (community-detection, graph-partitioning, clustering).
Many graph algorithms are iterative approaches that frequently traverse the graph for the computation using random walks, breadth-first or depth-first searches, or pattern matching.
Due to the exponential growth of possible paths with increasing distance, many of the approaches also have high algorithmic complexity.
Fortunately, optimized algorithms exist that utilize certain structures of the graph, memoize already explored parts, and parallelize operations. Whenever possible, we’ve applied these optimizations.
The Neo4j Graph Data Science library contains a large number of algorithms, which are detailed in the Algorithms chapter.
In order to run the algorithms as efficiently as possible, the Neo4j Graph Data Science library uses a specialized in-memory graph format to represent the graph data. It is therefore necessary to load the graph data from the Neo4j database into an in memory graph catalog. The amount of data loaded can be controlled by so called graph projections, which also allow, for example, filtering on node labels and relationship types, among other options.
For more information see Graph Management.
The Neo4j Graph Data Science library is available in two editions.
The Neo4j Graph Data Science library Enterprise Edition:
For more information see System Requirements - CPU.