Release Date: 11 May 2020
GDS 1.4.0 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
 License key configuration was renamed from licenseFile to license_file for consistency with Bloom.
 Removed the sparsity parameter from gds.alpha.randomProjection.*
 Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
 Renamed the embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
 Renamed projectedFeatureSize to projectedFeatureDimension for GraphSAGE.
 Renamed nodePropertyNames to featureProperties in gds.beta.fastRPExtended and gds.beta.graphSage.train.
 Default parameters for gds.fastRP have changed on the following configuration parameters:
 iterationWeights now has default [0.0, 1.0, 1.0]
 normalizeL2 has been removed and its effect is always applied
 Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section):
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
 GraphSage no longer directly calculates embeddings. Instead, it has been split into train (to generate a named model) and write, mutate, and stream (to apply the model predictions to your data).
 Due to the creation of a train mode for GraphSage, the following configuration parameters were moved to become configuration parameters of gds.beta.graphSage.train:
embeddingSize
aggregator
activationFunction
sampleSizes
nodePropertyNames
tolerance
learningRate
epochs
maxIterations
searchDepth
negativeSampleWeight
degreeAsProperty
 The gds.beta.graphSage.stream procedure now requires the modelName configuration parameter.
 The gds.beta.graphSage.write procedure now requires the modelName configuration parameter.
 Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
 Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train.
 Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
 Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
 Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
 Removed degreeDistribution from gds.graph.drop return columns.
 gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
 Alpha similarity algorithms no longer accept graph name as a parameter. The algorithms never used the named graph, so the option to specify one has been removed.
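The parameter renames and removals listed above can be applied mechanically when updating existing configuration maps. The Python sketch below is illustrative only: the helper name migrate_config and the idea of applying every change to one map are assumptions, not part of GDS. The key names are taken from the notes above. Some changes apply only to specific procedures (for example, sparsity was removed from gds.alpha.randomProjection.* and normalizeL2 from gds.fastRP), so the result should still be reviewed per procedure.

```python
# Old-to-new configuration key names, taken from the breaking changes above.
RENAMED_KEYS = {
    "embeddingSize": "embeddingDimension",
    "projectedFeatureSize": "projectedFeatureDimension",
    "nodePropertyNames": "featureProperties",
}

# Keys the notes list as removed (assumption: dropping them silently is acceptable).
REMOVED_KEYS = {"sparsity", "normalizeL2"}


def migrate_config(old_config):
    """Return a copy of old_config with renamed keys updated and removed keys dropped.

    Hypothetical helper for illustration; it does not call GDS.
    """
    new_config = {}
    for key, value in old_config.items():
        if key in REMOVED_KEYS:
            continue  # parameter no longer exists in 1.4.0
        new_config[RENAMED_KEYS.get(key, key)] = value
    return new_config


old = {"embeddingSize": 128, "sparsity": 3, "iterationWeights": [0.0, 1.0, 1.0]}
print(migrate_config(old))
```

Keys that are not mentioned in the mapping pass through unchanged, so the helper can be run over configurations that are already up to date.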
New features
 Promoted GraphSage to the beta tier and added support for inductive models with the train mode.
 This adds procedures:
gds.beta.graphSage.mutate
gds.beta.graphSage.mutate.estimate
gds.beta.graphSage.stream
gds.beta.graphSage.stream.estimate
gds.beta.graphSage.train
gds.beta.graphSage.train.estimate
gds.beta.graphSage.write
gds.beta.graphSage.write.estimate
 And removes alpha procedures:
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
 GraphSage supports relationship weights, driven by relationshipWeightProperty
 GraphSage supports node labels via projectedFeatureSize
 Introduced the model catalog to manage trained models, including:
 gds.beta.model.exists – a procedure to check if a model exists in the catalog
 gds.beta.model.list – list all available models
 gds.beta.model.drop – removes a model from the catalog
 The Random Projection algorithm has been promoted to the product tier and we have added:
gds.fastRP.stats
gds.fastRP.mutate
gds.fastRP.estimate
 Added procedures for stats and mutate mode, as well as estimates for all modes.
 FastRP has been extended to support relationship weights and directions
 FastRP supports integer configuration for iteration weights.
 We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
gds.beta.fastRPExtended.mutate
gds.beta.fastRPExtended.stream
gds.beta.fastRPExtended.stats
gds.beta.fastRPExtended.write
gds.beta.fastRPExtended.mutate.estimate
gds.beta.fastRPExtended.stream.estimate
gds.beta.fastRPExtended.stats.estimate
gds.beta.fastRPExtended.write.estimate
 We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
gds.beta.knn.stats and gds.beta.knn.stats.estimate
gds.beta.knn.stream and gds.beta.knn.stream.estimate
gds.beta.knn.write and gds.beta.knn.write.estimate
 The in-memory graph can now support list properties, enabling embedding results to be stored in memory, or loading embeddings from nodes for KNN or similarity calculations.
 Pregel framework
 Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
 Pregel now supports long and double array node values.
 Add support for composite node state to allow complex data types on nodes.
 Reduced memory consumption.
 Improved memory estimation.
 Simplified message iteration in compute methods.
 Split context into Init and ComputeContext and simplified API.
 Removed K1ColoringExample standalone project.
 Added pregel-bootstrap standalone project.
 Added pregel-examples module.
 Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features
 Added a density property to the graph output in graph.list.
 Added a failIfMissing flag to gds.graph.drop.
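The split of GraphSage into a train mode and modes that apply a named model can be sketched as a two-step call sequence. The Python below only builds parameterized Cypher strings and their parameter maps; it does not connect to a database. The function names, graph name, model name and property names are hypothetical, and the YIELD columns shown for the stream call (nodeId, embedding) are an assumption based on the result column rename described in these notes.

```python
def graphsage_train_call(graph_name, model_name, feature_properties, embedding_dimension):
    """Build a call that trains a named GraphSage model (hypothetical helper)."""
    query = "CALL gds.beta.graphSage.train($graphName, $config)"
    params = {
        "graphName": graph_name,
        "config": {
            "modelName": model_name,  # name under which the model is stored
            "featureProperties": list(feature_properties),
            "embeddingDimension": embedding_dimension,
        },
    }
    return query, params


def graphsage_stream_call(graph_name, model_name):
    """Build a call that applies a trained model; modelName is required here."""
    query = (
        "CALL gds.beta.graphSage.stream($graphName, $config) "
        "YIELD nodeId, embedding "  # assumed result columns
        "RETURN nodeId, embedding"
    )
    params = {"graphName": graph_name, "config": {"modelName": model_name}}
    return query, params


def model_exists_call(model_name):
    """Build a model catalog lookup for the given model name."""
    return "CALL gds.beta.model.exists($modelName)", {"modelName": model_name}


# Example values below are placeholders, not recommended settings.
for query, params in (
    graphsage_train_call("myGraph", "myModel", ["age"], 64),
    model_exists_call("myModel"),
    graphsage_stream_call("myGraph", "myModel"),
):
    print(query, params)
```

Passing the graph name and configuration as query parameters, rather than formatting them into the string, keeps the sketch independent of how values are escaped in Cypher.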
Bug fixes
 Pregel:
 Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
 Fix cast exception when returning array node properties in generated Pregel procedures.
 Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
gds.alpha.closeness
gds.alpha.closeness.harmonic
gds.alpha.allShortestPaths
 Fixed a bug in gds.alpha.shortestPath.deltaStepping where large relationship weights led to incorrect results.
 Weakly connected components:
 Fixed a bug in WCC where componentCount would be negative when the graph is empty.
 Fixed a regression where WCC could run more slowly with increased concurrency.
 Fixed bugs in Louvain:
 communityCount is no longer negative when the graph is empty.
 Changes to maxIterations are no longer ignored.
 Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
 Fixed a bug in KNN where it failed when run on graphs with filtered values.
 Fixed bugs in gds.graph.export:
 Previously, at most one relationship property per relationship type would be exported (now all are exported).
 Previously, default array node properties (null) led to an exception.
 Graph loading:
 Fixed a bug where using node label projections including properties on large graphs and high concurrency could lead to loss of some properties.
 Fixed bug in graph creation which could cause an AIOOB exception during node loading.
 The readConcurrency config parameter can no longer be overwritten by the concurrency param when it is explicitly set in an implicit graph creation config.
 Fixed a bug in memory estimation of large anonymous fictitious graphs.
 Fixed a bug in gds.alpha.dfs/bfs, where the algorithm did not terminate for graphs containing loops.
 Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
 Fixed a bug in Node2Vec where many disconnected nodes would cause a StackOverflowError.
 Fixed a bug in RandomProjection where each iteration weight was multiplied by all previous iteration weights.
 Similarity algorithms:
 Fixed a bug where Alpha Similarity algorithms would load a graph even though it was not needed
 Fixed a bug where similarity algorithms would not remove the placeholder graph if config validation fails on invalid user input.
 Fixed a bug where community statistic computation could overflow for large community ids.
 Fixed a bug where DegreeCentrality returned incorrect values when concurrency > 1.
 Fixed a bug where ClosenessCentrality was using a slightly incorrect formula for the Wasserman-Faust algorithm.
 Fixed a bug that affected gds.triangleCount() and gds.alpha.triangles() where not all triangles would be counted under certain conditions.
 Parallel edges in a graph no longer lead to incorrect Local Clustering Coefficient and Triangle Count results.
 The Long.MIN_VALUE fallback property values will now be translated to Double.NaN if a double value is requested.
 Fixed a bug where graphs with multiple labels would sometimes fail when converting property values.
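The fallback translation described above (Long.MIN_VALUE becoming Double.NaN when a double is requested) can be pictured with a small sketch. This is a Python illustration of the described behavior, not GDS code; the function name and the use of a plain integer sentinel are assumptions.

```python
import math

# Numeric value of Java's Long.MIN_VALUE, used in the notes as the fallback
# for a missing long property.
LONG_MIN_VALUE = -(2 ** 63)


def fallback_as_double(raw_value):
    """Convert a long property value to a double, mapping the fallback to NaN.

    Hypothetical helper: mirrors the rule stated in the notes, nothing more.
    """
    if raw_value == LONG_MIN_VALUE:
        return math.nan  # missing value stays recognizable as "not a number"
    return float(raw_value)


print(fallback_as_double(42))
print(fallback_as_double(LONG_MIN_VALUE))
```

Mapping the sentinel to NaN avoids treating a missing property as a very large negative number in later floating-point computations.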
Improvements
 fastRPExtended and graphSage now fail if node properties are Double.NaN
 gds.fastRP now accepts integer iterationWeights
 If graphSage.train is run on a graph without relationships, GDS now fails gracefully with an appropriate error message
 Added validation that properties used by GraphSage exist on graph
 Added validation for embeddingSize >= 1
 Added a failIfExists flag to graph creation to enable a user to specify that if a graph already exists, it should be overwritten without failing.
 Progress logging:
 We now log progress in equally spaced percentages. This is 0-100%, either in steps of 1 or in larger steps if there are fewer than 100 batches. For example, if there are 50 batches, completing one batch means 2% progress, so it would log in steps of 2.
 Decreased the logging frequency when running with a high concurrency.
 Added postProcessingMillis to gds.localClusteringCoefficient and gds.triangleCount for modes: mutate, write, stats
 It is always zero for now, but this is a standard result column for these modes
 Parallelized computation of result statistics for the following community detection procedures:
gds.wcc.write, gds.wcc.mutate and gds.wcc.stats
gds.louvain.write, gds.louvain.mutate and gds.louvain.stats
gds.labelPropagation.write, gds.labelPropagation.mutate and gds.labelPropagation.stats
gds.beta.modularityOptimization.write and gds.beta.modularityOptimization.mutate
gds.alpha.scc.write
 Add graph schema to the result columns of gds.model.list and gds.model.drop
 Validate property existence (e.g. seedProperty) when running algorithms on Cypher projections.
 Elements in a Pregel composite schema may be set public/private in order to include or exclude them from generated procedure results
 Improved memory estimation for * node projections.
 Introduced parallel graph construction to improve performance of Louvain and Node Similarity
 In-memory graphs in multi-database:
 When in-memory graphs are created, they are now associated with the database in use during creation time to prevent errors when running in a multi-database environment.
 gds.graph.info() returns the name of the database the graph was created on.
 Named graphs can only be used on the database they were created on.
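The progress logging rule listed under Improvements (equally spaced percentages, in steps of 1 or larger when there are fewer than 100 batches) can be worked through with a short sketch. This is one possible reading of that rule, written in Python for illustration; the function name and the choice of integer division to compute the percentage are assumptions, not the GDS implementation.

```python
def progress_log_points(batch_count):
    """Return the percentages that would be logged as batches complete.

    Assumption: progress is the integer percentage of completed batches,
    and a line is logged only when that percentage increases.
    """
    if batch_count <= 0:
        raise ValueError("batch_count must be positive")
    logged = []
    last_percent = 0
    for done in range(1, batch_count + 1):
        percent = (100 * done) // batch_count
        if percent > last_percent:
            logged.append(percent)
            last_percent = percent
    return logged


# With 50 batches each batch is worth 2%, so logging advances in steps of 2.
print(progress_log_points(50)[:5])
# With 200 batches only every second batch moves the percentage, giving steps of 1.
print(len(progress_log_points(200)))
```

Under this reading the number of log lines never exceeds 100, regardless of how many batches a run is split into.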