##### Release Date: 24 March 2022

GDS 2.0.0 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.6.

### **Breaking changes**

- Moved BFS to product tier: `gds.alpha.bfs` => `gds.bfs.stream`
  - Added support for `gds.bfs.stream.estimate`.
  - Removed configuration parameter `relationshipWeightProperty`.
  - Renamed configuration parameter `startNodeId` to `sourceNode`.
  - Renamed YIELD field `startNodeId` to `sourceNode`.

- Moved DFS to product tier: `gds.alpha.dfs` => `gds.dfs.stream`
  - Added support for `gds.dfs.stream.estimate`.
  - Removed configuration parameter `relationshipWeightProperty`.
  - Renamed configuration parameter `startNodeId` to `sourceNode`.
  - Renamed YIELD field `startNodeId` to `sourceNode`.

- Moved KNN to product tier:
  - `gds.beta.knn.mutate` => `gds.knn.mutate`
  - `gds.beta.knn.stats` => `gds.knn.stats`
  - `gds.beta.knn.stream` => `gds.knn.stream`
  - `gds.beta.knn.write` => `gds.knn.write`

- Removed ANN (superseded by KNN). `nodeWeightProperty` for KNN replaced by `nodeProperties`, which accepts multiple properties.

- Similarity:
  - Moved alpha similarity functions:
    - `gds.alpha.similarity.cosine` => `gds.similarity.cosine`
    - `gds.alpha.similarity.euclidean` => `gds.similarity.euclidean`
    - `gds.alpha.similarity.euclideanDistance` => `gds.similarity.euclideanDistance`
    - `gds.alpha.similarity.jaccard` => `gds.similarity.jaccard`
    - `gds.alpha.similarity.overlap` => `gds.similarity.overlap`
    - `gds.alpha.similarity.pearson` => `gds.similarity.pearson`
  - Pearson similarity function no longer accepts Lists of Maps, but computes over Lists of Numbers like the other similarity functions.
  - Removed `gds.alpha.similarity.asVector` function.
  - Removed alpha similarity *procedures* (similarity metrics added as modes for KNN and Node Similarity):
    - `gds.alpha.similarity.cosine`
    - `gds.alpha.similarity.euclidean`
    - `gds.alpha.similarity.overlap`
    - `gds.alpha.similarity.pearson`
    - `gds.alpha.ml.ann`
- Moved delta stepping shortest path to product tier: `gds.alpha.shortestPath.deltaStepping` => `gds.allShortestPaths.delta.[write, stream, mutate, estimate]`

- Moved Closeness Centrality to beta tier:
  - `gds.alpha.closeness.stream` => `gds.beta.closeness.stream`
  - `gds.alpha.closeness.stats` => `gds.beta.closeness.stats`
  - `gds.alpha.closeness.write` => `gds.beta.closeness.write`
  - `gds.alpha.closeness.mutate` => `gds.beta.closeness.mutate`
  - Removed return item `nodes` from `write` and `mutate` mode.
  - Renamed configuration parameter `improved` to `useWassermanFaust`.
  - Renamed YIELD field `centrality` to `score` in `stream` mode.

- Moved link prediction pipeline procedures to `beta` tier:
  - `gds.beta.pipeline.linkPrediction.addFeature`
  - `gds.beta.pipeline.linkPrediction.addNodeProperty`
  - `gds.beta.pipeline.linkPrediction.configureParams`
  - `gds.beta.pipeline.linkPrediction.configureSplit`
  - `gds.beta.pipeline.linkPrediction.create`
  - `gds.beta.pipeline.linkPrediction.predict.mutate`
  - `gds.beta.pipeline.linkPrediction.predict.mutate.estimate`
  - `gds.beta.pipeline.linkPrediction.predict.stream`
  - `gds.beta.pipeline.linkPrediction.predict.stream.estimate`
  - `gds.beta.pipeline.linkPrediction.train`
  - `gds.beta.pipeline.linkPrediction.train.estimate`

- Moved node classification pipeline procedures to `beta` tier:
  - `gds.beta.pipeline.nodeClassification.selectFeatures`
  - `gds.beta.pipeline.nodeClassification.addNodeProperty`
  - `gds.beta.pipeline.nodeClassification.configureParams`
  - `gds.beta.pipeline.nodeClassification.configureSplit`
  - `gds.beta.pipeline.nodeClassification.create`
  - `gds.beta.pipeline.nodeClassification.predict.mutate`
  - `gds.beta.pipeline.nodeClassification.predict.mutate.estimate`
  - `gds.beta.pipeline.nodeClassification.predict.stream`
  - `gds.beta.pipeline.nodeClassification.predict.stream.estimate`
  - `gds.beta.pipeline.nodeClassification.predict.write`
  - `gds.beta.pipeline.nodeClassification.predict.write.estimate`
  - `gds.beta.pipeline.nodeClassification.train`
  - `gds.beta.pipeline.nodeClassification.train.estimate`

- Removed non-pipeline versions of Node Classification, including procedures:
  - `gds.alpha.ml.nodeClassification.predict.mutate`
  - `gds.alpha.ml.nodeClassification.predict.mutate.estimate`
  - `gds.alpha.ml.nodeClassification.predict.stream`
  - `gds.alpha.ml.nodeClassification.predict.stream.estimate`
  - `gds.alpha.ml.nodeClassification.predict.write`
  - `gds.alpha.ml.nodeClassification.predict.write.estimate`
  - `gds.alpha.ml.nodeClassification.train`
  - `gds.alpha.ml.nodeClassification.train.estimate`

- Removed non-pipeline versions of Link Prediction, including procedures:
  - `gds.alpha.ml.linkPrediction.predict.mutate`
  - `gds.alpha.ml.linkPrediction.predict.mutate.estimate`
  - `gds.alpha.ml.linkPrediction.predict.stream`
  - `gds.alpha.ml.linkPrediction.predict.stream.estimate`
  - `gds.alpha.ml.linkPrediction.predict.write`
  - `gds.alpha.ml.linkPrediction.predict.write.estimate`
  - `gds.alpha.ml.linkPrediction.train`
  - `gds.alpha.ml.linkPrediction.train.estimate`

- Additional changes to node classification & link prediction:
  - Removed `batchSize` parameter for Node Classification pipeline predict modes, because it is not useful.
  - The procedure resolution for the `taskName` parameter of `gds.alpha.ml.pipeline.linkPrediction.addNodeProperty` and `gds.alpha.ml.pipeline.nodeClassification.addNodeProperty` changed and now requires the inclusion of the tier, e.g. `'scaleProperties'` must now be written as `'alpha.scaleProperties'`.
  - Changed node classification and link prediction training pipelines management from the model catalog to the new pipeline catalog. Trained pipelines (which we refer to as models) are still managed in the model catalog.
  - Replaced `gds.beta.pipeline.[nodeClassification|linkPrediction].configureParams(pipelineName::String, parameterSpace::List of Map)` by `gds.beta.pipeline.[nodeClassification|linkPrediction].addLogisticRegression(pipelineName::String, config::Map)`. This also removes the previous default model candidate.
  - Removed `useBiasFeature` parameter in `gds.beta.pipeline.linkPrediction.addLogisticRegression`.
- Graph Projection:
  - `gds.graph.create` renamed to `gds.graph.project`.
  - In `gds.graph.project`, defining the same node property for different labels with different `neoPropertyKeys` is no longer allowed.
  - Inputs for comparison expressions in `graph.project.subgraph` must resolve to the same type, i.e., `long` or `double`.

- Removed support for anonymous graph syntax from algorithm execution. Only explicit, named graphs are supported.
  - Memory estimation is an exception to this.

- Changed the syntax of memory estimation. The graph name or graph create config always goes into the first parameter, the algorithm config always into the second.
- Dropped Neo4j 4.2 support.
- Removed `USE_PRE_AGGREGATION` feature toggle.
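
As a rough migration sketch for the renames above, the following Cypher shows how a projection, a BFS call and a memory estimation might look under the 2.0 naming. The graph name `myGraph`, the `Person` label, the `KNOWS` relationship type and the `name` property are hypothetical placeholders, and the YIELD fields shown are a subset that should be checked against the GDS 2.0 manual.

```cypher
// Project a named graph (formerly gds.graph.create). Anonymous graphs are
// no longer accepted for algorithm execution, so a named graph is required.
CALL gds.graph.project('myGraph', 'Person', 'KNOWS')
YIELD graphName, nodeCount, relationshipCount;

// BFS on the product tier: `startNodeId` is now `sourceNode`,
// both as configuration parameter and as YIELD field.
MATCH (start:Person {name: 'Alice'})
CALL gds.bfs.stream('myGraph', {sourceNode: id(start)})
YIELD sourceNode, nodeIds
RETURN sourceNode, nodeIds;

// Memory estimation: graph name first, algorithm configuration second.
MATCH (start:Person {name: 'Alice'})
CALL gds.bfs.stream.estimate('myGraph', {sourceNode: id(start)})
YIELD requiredMemory
RETURN requiredMemory;
```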

### **New features**

- KNN graduated to product tier:
  - Added a random walk sampler for initializing KNN based on the topology of the input graph. The configuration key `initialSampler` accepts either `UNIFORM` or `RANDOM_WALK`.
  - Added possibility to exclude pairs of nodes in the K-Nearest Neighbor algorithm that have a similarity below a given threshold defined with an optional configuration parameter `similarityCutoff`.
  - Added perturbation rate to KNN, to reduce the risk of some neighbors not being explored. Configured with `perturbationRate` as a value between 0 and 1.
  - Improved normalization of KNN metrics to make them consistent and usable in combination.
  - KNN supports multiple node properties via the `nodeProperties` key.
  - Added metrics `ranIterations`, `didConverge` and `nodePairsConsidered` to the result of `gds.knn.[stats|mutate|write]`.
  - KNN can compute similarity over multiple node properties, specified with the new `nodeProperties` parameter.
  - Added new similarity metrics to KNN, configured per property via the `nodeProperties` key:
    - Euclidean
    - Overlap
    - Pearson
- Added similarity metric selection to Node Similarity, configured with `similarityMetric` (supports Jaccard or Overlap).
- BFS & DFS graduated to product tier:
  - Added support for `mutate` mode with `gds.dfs.mutate` and `gds.bfs.mutate`.
  - Added support for `estimate` mode to `gds.bfs.[stream|mutate]` and `gds.dfs.[stream|mutate]` procedures.
  - Added progress logging support.
- Added a new parallel single-source shortest path algorithm to product tier:
  - `gds.allShortestPaths.delta.stream`
  - `gds.allShortestPaths.delta.stream.estimate`
  - `gds.allShortestPaths.delta.write`
  - `gds.allShortestPaths.delta.write.estimate`
  - `gds.allShortestPaths.delta.mutate`
  - `gds.allShortestPaths.delta.mutate.estimate`

- Closeness Centrality graduated to beta tier, added:
  - `gds.beta.closeness.mutate`
  - `gds.beta.closeness.stats`

- Node Classification:
  - Models produced with `gds.alpha.ml.pipeline.nodeClassification.train` can now be stored (persisted) using `gds.alpha.model.store`.
  - Added `estimate` mode to `gds.alpha.ml.pipeline.nodeClassification.[train|predict.stream|predict.mutate|predict.write]` procedures.
  - Added `modelSelectionStats` to `gds.alpha.ml.pipeline.nodeClassification.train`.
  - Only save metrics for winning model inside `modelInfo`.
- Link Prediction:
  - Models produced with `gds.alpha.ml.pipeline.linkPrediction.train` can now be stored (persisted) using `gds.alpha.model.store`.
  - Added `estimate` mode to `gds.alpha.ml.pipeline.linkPrediction.train` procedure.
  - Added `estimate` mode to `gds.alpha.ml.pipeline.linkPrediction.[train|predict.stream|predict.mutate]` procedures.
  - Added `modelSelectionStats` to `gds.alpha.ml.pipeline.linkPrediction.train`.
  - Only save metrics for winning model inside `modelInfo`.
- Added support for Random Forest models in both Link Prediction and Node Classification pipelines with `gds.alpha.pipeline.[linkPrediction|nodeClassification].addRandomForest`.

- Added pipeline catalog procedures for managing training pipelines:
  - `gds.beta.pipeline.list`
  - `gds.beta.pipeline.exists`
  - `gds.beta.pipeline.drop`

- Added new way of projecting a graph using Cypher: `gds.alpha.graph.project`, which is an aggregation rather than a procedure.
- Added surface for hints and warnings generated by executed tasks with the new `gds.alpha.userLog` logging procedure.
- Support for write back from Neo4j Causal Cluster Read Replica instance (requires Enterprise GDS).
- Support for graph projections backup and restore with `gds.alpha.backup` and `gds.alpha.restore` (requires Enterprise GDS).
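
To illustrate how several of the options listed above might be combined, here is a minimal Cypher sketch. It assumes a graph named `myGraph` has already been projected with two hypothetical node properties, `embedding` (a list of floats) and `interests` (a list of integers); the parameter values are arbitrary, and the YIELD fields should be verified against the GDS 2.0 manual.

```cypher
// KNN on the product tier with a metric chosen per property via `nodeProperties`,
// plus the new `similarityCutoff` and `perturbationRate` options.
CALL gds.knn.stream('myGraph', {
  nodeProperties: {embedding: 'EUCLIDEAN', interests: 'OVERLAP'},
  topK: 5,
  similarityCutoff: 0.3,
  perturbationRate: 0.1
})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC;

// The new convergence metrics are reported by the stats, mutate and write modes.
CALL gds.knn.stats('myGraph', {nodeProperties: ['embedding']})
YIELD ranIterations, didConverge, nodePairsConsidered;

// Node Similarity with an explicitly selected metric.
CALL gds.nodeSimilarity.stream('myGraph', {similarityMetric: 'OVERLAP'})
YIELD node1, node2, similarity
RETURN node1, node2, similarity;

// Pipeline catalog: list the training pipelines that currently exist.
CALL gds.beta.pipeline.list()
YIELD pipelineName, pipelineType;
```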

### **Bug fixes**

- Fixed a bug where Node2Vec would produce an `ArrayIndexOutOfBoundsException` (AIOOBE) on sufficiently large graphs.
- Fixed a bug where ForkJoin pools were not properly closed, which could lead to out-of-memory errors (OOMs) when using Pregel-based algorithms, e.g. PageRank.
- GraphSAGE:
  - Fixed a bug where `gds.beta.graphSage` would produce incorrect results for smaller graphs.
  - Fixed a bug where `gds.beta.graphSage` would produce incorrect results for the pool aggregator.
- Node Classification & Link Prediction pipelines:
  - Fixed a bug where `gds.alpha.ml.pipeline.nodeClassification.train` would train a model under the wrong username and not be accessible for the actual user.
  - Fixed a bug where `gds.alpha.ml.pipeline.nodeClassification.train` and `gds.alpha.ml.pipeline.linkPrediction.train` would skip applying a penalty to the weight of the last feature.
  - Fixed a bug where the `trainConfig` of persisted models would not be shown to the user.
  - Fixed a bug where `gds.alpha.ml.pipeline.nodeClassification.train` would not scale penalty to train set size correctly.
- Fixed a bug in `gds.beta.graph.create.subgraph` where long values greater than `2^53` were not properly handled during expression evaluation.
- Triangle Count & Local Clustering Coefficient:
  - Fixed a bug where `gds.triangleCount` and `gds.localClusteringCoefficient` might produce wrong results when using a `nodeLabels` filter.
  - Fixed a bug where graph intersection used in Triangle Count and Local Clustering Coefficient would fail on union node filtered graphs.
- Fixed a bug where `gds.alpha.closeness` might produce incorrect results for directed graphs.
- Fixed a bug where function `gds.alpha.similarity.cosine` and procedures `gds.alpha.similarity.cosine.[stats,stream,write]` returned the absolute value of the cosine computation, instead of the cosine value itself.
- Fixed a bug where Cypher on GDS would try to access node properties as relationship properties and vice versa.
- Fixed a bug where `gds.graph.create.cypher` would sometimes not display the root cause in case of an error.
- Fixed a bug where concurrently computing degrees on a node filtered graph would produce an `ArrayIndexOutOfBoundsException` (AIOOBE).
- Fixed a bug where the memory estimation for generated Pregel procedures was calculated incorrectly.
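
The cosine fix listed above is easiest to see with two vectors that point in opposite directions. The following one-line sketch uses the function under its 2.0 name; the input vectors are made up for illustration.

```cypher
// Opposite vectors have a cosine similarity of -1 (up to floating-point rounding).
// Per the note above, affected versions returned the absolute value instead,
// which would hide the sign.
RETURN gds.similarity.cosine([1.0, 2.0], [-1.0, -2.0]) AS cosine;
```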

### **Improvements**

- GraphSAGE:
  - Improved runtime performance for `gds.beta.graphSage` when using the `relationshipWeight` configuration parameter.
  - Improved memory usage of `gds.beta.graphSage` by computing the features per batch lazily.
- Memory estimation for `gds.graph.project` returns the estimated peak memory consumption during loading instead of the estimated final graph size.
- Reduced memory consumption while loading using Native or Cypher projections.
- `gds.alpha.ml.pipeline.[nodeClassification|linkPrediction].train` will raise an error when any of the train, test, or validation sets is empty.
- Added `failIfMissing` flag to `gds.beta.[pipeline|model].drop`.
- Implemented batched prediction for LinkPrediction which improves runtime.
- Breadth First / Depth First Search:
  - Parallel implementation of `gds.bfs.stream`.
  - Result field `path` of `gds.bfs.stream` and `gds.dfs.stream` will only be computed if explicitly specified in the `YIELD` clause or there is no `YIELD` clause.
- Provide more information to users if a node is missing a particular property in KNN.
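
The `path` behaviour and the `failIfMissing` flag described above can be sketched as follows. The graph name `myGraph`, the pipeline name `myPipeline`, and the `Person` label with its `name` property are hypothetical; passing `failIfMissing` as the second positional argument is an assumption to verify against the GDS 2.0 manual.

```cypher
// `path` is not listed in YIELD, so per the note above it is not computed.
MATCH (start:Person {name: 'Alice'})
CALL gds.bfs.stream('myGraph', {sourceNode: id(start)})
YIELD nodeIds
RETURN nodeIds;

// Listing `path` explicitly requests it.
MATCH (start:Person {name: 'Alice'})
CALL gds.bfs.stream('myGraph', {sourceNode: id(start)})
YIELD path
RETURN path;

// Drop a pipeline without raising an error if it does not exist
// (assumed signature: pipeline name, then failIfMissing).
CALL gds.beta.pipeline.drop('myPipeline', false);
```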
