Introduction
This chapter provides a brief introduction to the main concepts in the Neo4j Graph Data Science library.
This library provides efficiently implemented, parallel versions of common graph algorithms for Neo4j, exposed as Cypher procedures.
1. Algorithms
Graph algorithms are used to compute metrics for graphs, nodes, or relationships.
They can provide insights into relevant entities in the graph (centralities, ranking), or inherent structures like communities (community detection, graph partitioning, clustering).
Many graph algorithms are iterative approaches that frequently traverse the graph for the computation using random walks, breadth-first or depth-first searches, or pattern matching.
Due to the exponential growth of possible paths with increasing distance, many of the approaches also have high algorithmic complexity.
Fortunately, optimized algorithms exist that utilize certain structures of the graph, memoize already explored parts, and parallelize operations. Whenever possible, we’ve applied these optimizations.
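The growth in possible paths, and the effect of remembering already explored parts, can be illustrated with a small plain-Python sketch. This is not how the library implements its algorithms; the toy graph and the function names `count_walks` and `bfs_order` are assumptions made up for illustration.

```python
from collections import deque


def count_walks(graph, start, length):
    """Count every walk of exactly `length` steps from `start`, with no memoization."""
    if length == 0:
        return 1
    return sum(count_walks(graph, nxt, length - 1) for nxt in graph[start])


def bfs_order(graph, start):
    """Breadth-first traversal that records visited nodes, so each node is expanded once."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return order


# Hypothetical toy graph: four nodes, each connected to the other three.
graph = {n: [m for m in "ABCD" if m != n] for n in "ABCD"}

# The number of walks triples with every extra step ...
walk_counts = [count_walks(graph, "A", k) for k in range(1, 6)]
print(walk_counts)

# ... while a traversal that remembers visited nodes touches each node once.
print(bfs_order(graph, "A"))
```

Even on four nodes, the number of walks multiplies with each additional step, whereas the traversal that tracks visited nodes does an amount of work bounded by the size of the graph.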
The Neo4j Graph Data Science library contains a large number of algorithms, which are detailed in the Algorithms chapter.
1.1. Algorithm traits
Algorithms in GDS have specific ways of making use of various aspects of their input graph(s). We call these algorithm traits. When an algorithm supports an algorithm trait, this indicates that the algorithm has been implemented to produce well-defined results in accordance with the trait. The following algorithm traits exist:
 Directed

The algorithm is well-defined on a directed graph.
 Undirected

The algorithm is well-defined on an undirected graph.
 Homogeneous

The algorithm will treat all nodes and relationships in its input graph(s) similarly, as if they were all of the same type. If multiple types of nodes or relationships exist in the graph, this must be taken into account when analysing the results of the algorithm.
 Heterogeneous

The algorithm has the ability to distinguish between nodes and/or relationships of different types.
 Weighted

The algorithm supports configuration to set node and/or relationship properties to use as weights, specified via the nodeWeightProperty and relationshipWeightProperty configuration parameters. These values can represent cost, time, capacity or some other domain-specific property. By default, the algorithm will consider each node and/or relationship as equally important.
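To make the Directed, Undirected and Weighted traits concrete, here is a minimal plain-Python sketch of a shortest-distance computation. It is an illustration only, not the GDS implementation or its API: the function `shortest_distance`, its `weight_property` and `undirected` arguments, the `cost` property and the sample data are all hypothetical.

```python
import heapq


def shortest_distance(relationships, source, target, weight_property=None, undirected=False):
    """Dijkstra-style shortest distance over (start, end, properties) relationships.

    Without a weight property every relationship counts as 1.0, mirroring the
    idea that all relationships are equally important by default.
    """
    adjacency = {}
    for start, end, props in relationships:
        weight = props[weight_property] if weight_property else 1.0
        adjacency.setdefault(start, []).append((end, weight))
        if undirected:
            # Treat the relationship as traversable in both directions.
            adjacency.setdefault(end, []).append((start, weight))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return None  # target not reachable


# Hypothetical data: a short but expensive route and a longer but cheap one.
rels = [
    ("A", "B", {"cost": 5.0}),
    ("B", "C", {"cost": 5.0}),
    ("A", "D", {"cost": 1.0}),
    ("D", "E", {"cost": 1.0}),
    ("E", "C", {"cost": 1.0}),
]

unweighted = shortest_distance(rels, "A", "C")
weighted = shortest_distance(rels, "A", "C", weight_property="cost")
directed_back = shortest_distance(rels, "C", "A", weight_property="cost")
undirected_back = shortest_distance(rels, "C", "A", weight_property="cost", undirected=True)
```

In this sketch the unweighted run prefers the two-hop route, the weighted run prefers the cheaper three-hop route, and the reverse direction is only reachable when relationships are treated as undirected.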
2. Graph Catalog
In order to run the algorithms as efficiently as possible, the Neo4j Graph Data Science library uses a specialized in-memory graph format to represent the graph data. It is therefore necessary to load the graph data from the Neo4j database into an in-memory graph catalog. The amount of data loaded can be controlled by so-called graph projections, which also allow, for example, filtering on node labels and relationship types, among other options.
For more information see Graph Management.
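The idea of a projection that filters on node labels and relationship types can be sketched in plain Python as follows. This only illustrates the concept and does not use or reproduce the library's procedures; the `project` function, the labels, the relationship types and the data are hypothetical.

```python
def project(nodes, relationships, node_labels, relationship_types):
    """Build an in-memory adjacency structure from a filtered subset of the data.

    nodes: mapping of node id -> label
    relationships: iterable of (start id, relationship type, end id)
    """
    kept = {node_id for node_id, label in nodes.items() if label in node_labels}
    adjacency = {node_id: [] for node_id in sorted(kept)}
    for start, rel_type, end in relationships:
        # Keep a relationship only if its type and both endpoints pass the filter.
        if rel_type in relationship_types and start in kept and end in kept:
            adjacency[start].append(end)
    return adjacency


# Hypothetical stored data.
nodes = {1: "Person", 2: "Person", 3: "City"}
relationships = [(1, "KNOWS", 2), (1, "LIVES_IN", 3), (2, "LIVES_IN", 3)]

# Project only Person nodes and KNOWS relationships.
projected = project(nodes, relationships, {"Person"}, {"KNOWS"})
print(projected)
```

Only the data that passes the filters ends up in the in-memory structure, which is why a projection controls how much is loaded.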
3. Editions
The Neo4j Graph Data Science library is available in two editions.

The open source Community Edition includes all algorithms and features, but is limited to four CPU cores.

The Neo4j Graph Data Science library Enterprise Edition:

Can run on an unlimited number of CPU cores.

Supports the role-based access control system (RBAC) from Neo4j Enterprise Edition.

Supports various additional model catalog features:

Storing an unlimited number of models in the model catalog.


Supports an optimized in-memory graph implementation.

For more information, see System Requirements - CPU.