Introduction
Neo4j Graph Analytics for Snowflake is in Public Preview and is not intended for production use.
The Neo4j Graph Analytics library provides efficiently implemented, parallel versions of common graph algorithms, exposed as SQL functions.
Algorithms
Graph algorithms are used to compute metrics for graphs, nodes, or relationships.
They can provide insights on relevant entities in the graph (centralities, ranking), or inherent structures like communities (community-detection, graph-partitioning, clustering).
Many graph algorithms are iterative and traverse the graph repeatedly during the computation, using random walks, breadth-first or depth-first searches, or pattern matching.
Due to the exponential growth of possible paths with increasing distance, many of the approaches also have high algorithmic complexity.
Fortunately, optimized algorithms exist that utilize certain structures of the graph, memoize already explored parts, and parallelize operations. Whenever possible, we’ve applied these optimizations.
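To make the effect of memoization concrete, here is a minimal Python sketch. It is not code from the library: the layered toy graph and the function names are assumptions chosen purely for illustration. Both functions count the paths from a start node to the last layer; the naive version walks every path, while the memoized version solves each node only once.

```python
from functools import lru_cache

# Hypothetical toy graph: every layer holds two nodes, and each node links
# to both nodes of the next layer. A node is a (layer, index) tuple.
LAYERS = 16

def successors(node):
    layer, _ = node
    if layer == LAYERS:
        return []
    return [(layer + 1, 0), (layer + 1, 1)]

def count_paths_naive(node):
    """Walk every path to the last layer: work doubles with each layer."""
    nxt = successors(node)
    if not nxt:
        return 1
    return sum(count_paths_naive(n) for n in nxt)

@lru_cache(maxsize=None)
def count_paths_memo(node):
    """Same recurrence, but the result for each node is computed only once."""
    nxt = successors(node)
    if not nxt:
        return 1
    return sum(count_paths_memo(n) for n in nxt)

start = (0, 0)
print(count_paths_naive(start))               # 2**16 paths, all of them walked
print(count_paths_memo(start))                # same answer
print(count_paths_memo.cache_info().misses)   # distinct nodes actually solved
```

The number of paths doubles with every layer, but the memoized version only does work proportional to the number of nodes.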
Neo4j Graph Analytics for Snowflake contains a large number of algorithms, which are detailed in the Operations reference.
Loading data and saving results
To run an algorithm, we first need to project data from your tables into an in-memory graph representation. Once the algorithm has finished computing, we write the results back to tables you control, so you can inspect them and process them further. We will explain the details later.
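The project, compute, write-back flow can be sketched in plain Python. This is a conceptual illustration only, not the product's API: the row layout, the function names (`project`, `out_degree`, `write_back`) and the stand-in algorithm are all assumptions.

```python
# Hypothetical input "tables", represented as lists of rows.
node_rows = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
relationship_rows = [
    {"source": "a", "target": "b"},
    {"source": "a", "target": "c"},
    {"source": "b", "target": "c"},
]

def project(node_rows, relationship_rows):
    """Build an in-memory adjacency structure from tabular rows."""
    graph = {row["id"]: [] for row in node_rows}
    for row in relationship_rows:
        graph[row["source"]].append(row["target"])
    return graph

def out_degree(graph):
    """Stand-in 'algorithm': count outgoing relationships per node."""
    return {node: len(targets) for node, targets in graph.items()}

def write_back(scores, column):
    """Turn per-node results back into rows that could be stored in a table."""
    return [{"id": node, column: value} for node, value in sorted(scores.items())]

graph = project(node_rows, relationship_rows)
result_rows = write_back(out_degree(graph), "out_degree")
print(result_rows)
```

The algorithm itself only ever sees the in-memory structure; the tables are touched at the start and at the end.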
Algorithm traits
Algorithms in Neo4j Graph Analytics for Snowflake have specific ways to make use of various aspects of their input graph(s). We call these algorithm traits.
An algorithm trait can be:
- supported: the algorithm leverages the trait and produces well-defined results;
- allowed: the algorithm does not leverage the trait but still produces results;
- unsupported: the algorithm does not leverage the trait and, given a graph with the trait, will return an error.
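One way to picture the three levels is as a check performed before an algorithm runs. The Python sketch below is an assumption-laden illustration, not how the library implements it: the `TraitSupport` enum, the `check_traits` helper, the trait names, and the choice to treat undeclared traits as unsupported are all hypothetical.

```python
from enum import Enum

class TraitSupport(Enum):
    SUPPORTED = "supported"      # trait is used; results are well-defined
    ALLOWED = "allowed"          # trait is ignored; results are still produced
    UNSUPPORTED = "unsupported"  # a graph with the trait is rejected

def check_traits(algorithm_traits, graph_traits):
    """Return the graph traits that will be ignored; raise on unsupported ones.

    Assumption for this sketch: a trait the algorithm does not declare
    is treated as unsupported.
    """
    ignored = []
    for trait in sorted(graph_traits):
        level = algorithm_traits.get(trait, TraitSupport.UNSUPPORTED)
        if level is TraitSupport.UNSUPPORTED:
            raise ValueError(f"graph has unsupported trait: {trait}")
        if level is TraitSupport.ALLOWED:
            ignored.append(trait)
    return ignored

# Hypothetical trait table for an imaginary algorithm.
example_algorithm = {
    "directed": TraitSupport.SUPPORTED,
    "weighted_relationships": TraitSupport.ALLOWED,
    "heterogeneous_nodes": TraitSupport.UNSUPPORTED,
}

print(check_traits(example_algorithm, {"directed", "weighted_relationships"}))
```

In this sketch a weighted, directed graph is accepted with the weights ignored, while a graph with heterogeneous nodes raises an error.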
The following algorithm traits exist:
- Directed: The algorithm is well-defined on a directed graph.
- Undirected: The algorithm is well-defined on an undirected graph.
- Heterogeneous: The algorithm can distinguish between nodes and/or relationships of different types.
- Heterogeneous nodes: The algorithm can distinguish between nodes of different types.
- Heterogeneous relationships: The algorithm can distinguish between relationships of different types.
- Weighted relationships: The algorithm can be configured to use relationship properties as weights. By default, it treats every relationship as equally important.
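As an illustration of the weighted relationships trait, the following Python sketch shows how an optional weight setting might change a simple computation. The function name, the `weight_property` parameter, and the `cost` property are hypothetical and do not come from the library.

```python
def weighted_degree(relationships, weight_property=None):
    """Sum relationship weights per source node.

    With no weight_property configured, every relationship counts as 1.0,
    i.e. all relationships are equally important.
    """
    totals = {}
    for rel in relationships:
        weight = 1.0 if weight_property is None else rel[weight_property]
        totals[rel["source"]] = totals.get(rel["source"], 0.0) + weight
    return totals

# Hypothetical relationships carrying a numeric "cost" property.
relationships = [
    {"source": "a", "target": "b", "cost": 2.5},
    {"source": "a", "target": "c", "cost": 0.5},
    {"source": "b", "target": "c", "cost": 4.0},
]

print(weighted_degree(relationships))          # unweighted default
print(weighted_degree(relationships, "cost"))  # weights taken from "cost"
```

With the default, node `a` scores higher than `b` because it has more relationships; once `cost` is used as the weight, the ordering flips.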