Release Date: 27 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
- Added procedures:
gds.degree.stream.estimate
gds.degree.write.estimate
gds.degree.mutate
gds.degree.mutate.estimate
gds.degree.stats
gds.degree.stats.estimate
- Removed alpha procedures:
gds.alpha.degree.stream
gds.alpha.degree.write
- Article Rank has been promoted to the product tier
- Added procedures:
gds.articleRank.stream
gds.articleRank.stream.estimate
gds.articleRank.write
gds.articleRank.write.estimate
gds.articleRank.mutate
gds.articleRank.mutate.estimate
gds.articleRank.stats
gds.articleRank.stats.estimate
- Removed alpha procedures:
gds.alpha.articleRank.stream
gds.alpha.articleRank.write
- Eigenvector Centrality has been promoted to the product tier
- Added procedures:
gds.eigenvector.stream
gds.eigenvector.stream.estimate
gds.eigenvector.write
gds.eigenvector.write.estimate
gds.eigenvector.mutate
gds.eigenvector.mutate.estimate
gds.eigenvector.stats
gds.eigenvector.stats.estimate
- Removed alpha procedures:
gds.alpha.eigenvector.stream
gds.alpha.eigenvector.write
- AStar has been promoted to the product tier
- Added procedures:
gds.astar.stream
gds.astar.stream.estimate
gds.astar.write
gds.astar.write.estimate
gds.astar.mutate
gds.astar.mutate.estimate
- Removed beta procedures:
gds.beta.astar.stream
gds.beta.astar.stream.estimate
gds.beta.astar.write
gds.beta.astar.write.estimate
gds.beta.astar.mutate
gds.beta.astar.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yen's K Shortest Paths has been promoted to the product tier:
- Added procedures:
gds.yens.stream
gds.yens.stream.estimate
gds.yens.write
gds.yens.write.estimate
gds.yens.mutate
gds.yens.mutate.estimate
- Removed beta procedures:
gds.beta.yens.stream
gds.beta.yens.stream.estimate
gds.beta.yens.write
gds.beta.yens.write.estimate
gds.beta.yens.mutate
gds.beta.yens.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
- Added procedures:
gds.shortestPath.dijkstra.stream
gds.shortestPath.dijkstra.stream.estimate
gds.shortestPath.dijkstra.write
gds.shortestPath.dijkstra.write.estimate
gds.shortestPath.dijkstra.mutate
gds.shortestPath.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.dijkstra.stream
gds.beta.shortestPath.dijkstra.stream.estimate
gds.beta.shortestPath.dijkstra.write
gds.beta.shortestPath.dijkstra.write.estimate
gds.beta.shortestPath.dijkstra.mutate
gds.beta.shortestPath.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
- Added procedures:
gds.allShortestPaths.dijkstra.stream
gds.allShortestPaths.dijkstra.stream.estimate
gds.allShortestPaths.dijkstra.write
gds.allShortestPaths.dijkstra.write.estimate
gds.allShortestPaths.dijkstra.mutate
gds.allShortestPaths.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.allShortestPaths.dijkstra.stream
gds.beta.allShortestPaths.dijkstra.stream.estimate
gds.beta.allShortestPaths.dijkstra.write
gds.beta.allShortestPaths.dijkstra.write.estimate
gds.beta.allShortestPaths.dijkstra.mutate
gds.beta.allShortestPaths.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
- Added procedures:
gds.beta.node2vec.stream
gds.beta.node2vec.stream.estimate
gds.beta.node2vec.write
gds.beta.node2vec.write.estimate
gds.beta.node2vec.mutate
gds.beta.node2vec.mutate.estimate
- Removed alpha procedures:
gds.alpha.node2vec.stream
gds.alpha.node2vec.write
- The parameter centerSamplingFactor is renamed to positiveSamplingFactor.
- The parameter contextSamplingExponent is renamed to negativeSamplingExponent.
- The maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
- The maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
- The windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- The gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is now mandatory.
- The degreeAsProperty configuration parameter was removed from GraphSAGE. The same effect can be achieved by running gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
- Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
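The parameter renames and removals listed above can be applied mechanically to existing configuration maps. The following is a minimal Python sketch of such a migration, not part of GDS itself: the helper migrate_config and the constant names are hypothetical, and only the parameter names are taken from these notes.

```python
# Renames and removals as listed in the breaking changes above.
# The constant and function names are illustrative, not a GDS API.
NODE2VEC_RENAMES = {
    "centerSamplingFactor": "positiveSamplingFactor",
    "contextSamplingExponent": "negativeSamplingExponent",
}
TRAIN_RENAMES = {
    "maxStreakCount": "patience",
    "maxIterations": "maxEpochs",
    "minIterations": "minEpochs",
}
TRAIN_REMOVED = frozenset({"windowSize"})
LINK_PREDICTION_TRAIN_RENAMES = {"classRatio": "negativeClassWeight"}


def migrate_config(config, renames, removed=frozenset()):
    """Return a copy of `config` with pre-1.6 keys rewritten to 1.6 names."""
    migrated = {}
    for key, value in config.items():
        if key in removed:
            continue  # the parameter no longer exists in 1.6
        new_key = renames.get(key, key)
        if new_key in migrated:
            # Both the old and the new name were present; refuse to guess.
            raise ValueError(f"duplicate key after migration: {new_key}")
        migrated[new_key] = value
    return migrated


old_train_config = {
    "maxStreakCount": 5,
    "maxIterations": 100,
    "windowSize": 3,
    "concurrency": 4,
}
new_train_config = migrate_config(old_train_config, TRAIN_RENAMES, TRAIN_REMOVED)
print(new_train_config)
```

Note that negativeClassWeight is mandatory in 1.6, so a Link Prediction train configuration that never set classRatio still needs a value added by hand.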
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
gds.alpha.scaleProperties.stream
gds.alpha.scaleProperties.mutate
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure
gds.beta.graph.create.subgraph
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
gds.alpha.influenceMaximization.celf.stream
gds.alpha.influenceMaximization.greedy.stream
- Link Prediction:
- Added support for storing, loading and publishing Link Prediction models.
- Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
- Added write and stream modes to gds.alpha.ml.linkPrediction.predict:
gds.alpha.ml.linkPrediction.stream
gds.alpha.ml.linkPrediction.write
- Added estimate mode for Link Prediction:
gds.alpha.ml.linkPrediction.train.estimate
gds.alpha.ml.linkPrediction.stream.estimate
gds.alpha.ml.linkPrediction.mutate.estimate
gds.alpha.ml.linkPrediction.write.estimate
- Node Classification:
- Added support for storing, loading and publishing Node Classification models.
- Added support for f1, precision, recall and accuracy per class for Node Classification. Used with metric(class=<number>) or the syntactic sugar metric(class=*).
- Added write and stream modes to gds.alpha.ml.nodeClassification.predict:
gds.alpha.ml.nodeClassification.stream
gds.alpha.ml.nodeClassification.write
- Added estimate mode for Node Classification:
gds.alpha.ml.nodeClassification.train.estimate
gds.alpha.ml.nodeClassification.mutate.estimate
gds.alpha.ml.nodeClassification.stream.estimate
gds.alpha.ml.nodeClassification.write.estimate
- Added configuration parameter negativeSamplingWeight to gds.alpha.ml.splitRelationships.
- Pregel API:
- Added progress logging to Pregel. This affects HITS, SLLPA, Article Rank, Eigenvector and Page Rank.
- Added partitioning configuration parameter to Pregel procedures. Possible options are:
- range (default): each partition (thread) has the same number of nodes
- degree: each partition (thread) has approximately the same number of relationships
- Added randomSeed configuration parameter to gds.fastRP.* and gds.beta.fastRPExtended.*.
- Node2Vec:
- Added an optional parameter randomSeed to the Node2Vec procedures, which allows the generation of deterministic random walks.
- Optimized implementation, decreasing execution time by up to 90%
- Added support for weights with relationshipWeightProperty
- Added mutate mode:
gds.beta.node2vec.mutate
gds.alpha.ml.nodeClassification.mutate.estimate
gds.alpha.ml.nodeClassification.stream.estimate
gds.alpha.ml.nodeClassification.write.estimate
- Added an optional db parameter to gds.graph.drop, which allows dropping graphs from other Neo4j databases in a multi-database environment.
- Added scaler configuration parameter to gds.pageRank, which allows normalizing the computed scores.
- Added support for orientation in gds.degree procedures.
- Users can now set a threshold for community sizes when writing back to the database:
- Added minCommunitySize to gds.louvain.write to only write communities that are larger than or equal to a given value.
- Added minComponentSize to gds.wcc.write to only write components that are larger than or equal to a given value.
- RBAC Support for GDS Enterprise: Users that have been granted the admin role are allowed to operate on other users’ graphs. Those operations are:
- View other users’ graphs in gds.graph.list
- Use other users’ graphs in most algorithm procedures. Some alpha tier procedures and gds.graph.export do not support this feature.
- Drop other users’ graphs via gds.graph.drop
- Naming conflicts (between users) can be resolved by using a username parameter in the configuration of the procedure
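The two Pregel partitioning options described above (range and degree) differ in what they balance across threads. The sketch below illustrates that difference in plain Python. It is not the GDS implementation: the function names, the degree list as input, and the half-open (start, end) ranges are all assumptions made for illustration.

```python
def range_partition(degrees, partitions):
    """Contiguous node ranges holding (almost) the same number of nodes."""
    n = len(degrees)
    if n == 0:
        return []
    size = -(-n // partitions)  # ceiling division
    return [(start, min(start + size, n)) for start in range(0, n, size)]


def degree_partition(degrees, partitions):
    """Contiguous node ranges holding roughly the same number of relationships."""
    target = sum(degrees) / partitions
    result, start, running = [], 0, 0
    for node, degree in enumerate(degrees):
        running += degree
        # Close the current range once it carries its share of relationships,
        # but keep the last range open so that it absorbs the remainder.
        if running >= target and len(result) < partitions - 1:
            result.append((start, node + 1))
            start, running = node + 1, 0
    if start < len(degrees):
        result.append((start, len(degrees)))
    return result


# One high-degree node followed by many low-degree nodes.
degrees = [8, 1, 1, 1, 1, 1, 1, 2]
print(range_partition(degrees, 2))
print(degree_partition(degrees, 2))
```

With this input, the range split gives each thread four nodes but 11 and 5 relationships, while the degree split gives one and seven nodes with 8 relationships each. This is the kind of skewed degree distribution where the degree option is likely to help.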
Bug fixes
- Fixed memory estimation for gds.graph.create.estimate on GDS Enterprise Edition; estimates now take into account the enterprise compression format.
- Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys by failing early.
- Fixed potential ArrayIndexOutOfBoundsException when running gds.triangleCount on a relationship-filtered graph.
- Fixed a bug that can lead to inconsistencies when writing or mutating new relationships created from a label-filtered graph.
- Support concurrency in gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train.
- Fixed a bug where Alpha similarity algorithms in some cases could fail on division by 0 when writing results back.
- Fixed an issue where gds.graph.drop could take a long time when the graph contained node embeddings.
- Fixed a bug where loading array properties via Cypher loading was not possible.
- Fixed an issue where gds.fastRP.* and gds.beta.fastRPExtended.* were not failing on missing relationship properties when executing with weighted relationships.
- Fixed a bug where gds.beta.graphSage.train was failing in the presence of array properties.
- Fixed a bug in Node2Vec where the learning rate was scaled incorrectly.
- Fixed a bug where the number of deleted node properties returned by gds.graph.removeNodeProperties would be wrong.
- Fixed a bug where gds.alpha.scc would sometimes fail with an ArrayIndexOutOfBoundsException.
- Fixed a security bug that allowed restricted users to run procedures that write back to the database.
- Fixed a bug where gds.nodeSimilarity returned incorrect results when graphs contained multiple relationship types.
Improvements
- More algorithm logs are visible from the progress log procedure.
- Mention database name in error message if graph is not found.
- Removed a “disabled” log message from the database startup when GDS was running in its default configuration. It is replaced with a more elaborate “enabled” message when the progress tracking feature is enabled.
- Use trapezoid rule in AUCPR metric for gds.alpha.ml.linkPrediction.train.
- In-memory graphs are now removed from the catalog when the associated database is shut down or dropped.
- Improved runtime and required memory of gds.pageRank, especially on high concurrency.
- Improved the runtime and required memory of Node2Vec.
- Community users can now store up to three models in the model catalog.
- The sourceNode and targetNode configuration parameters in gds.astar, gds.allShortestPaths.dijkstra, gds.shortestPath.dijkstra and gds.yens also accept nodes as an alternative to only node-ids.
- Add metrics to the modelInfo result column of gds.beta.graphSage.train
- Added existence checks for sourceNode and targetNode to all shortest path procedures in the product tier
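To show what the trapezoid rule means for the AUCPR metric mentioned above, here is a small Python sketch that integrates a precision-recall curve with trapezoids. It is not the GDS implementation. The function name, the 0/1 label encoding, the handling of tied scores, and the starting point (recall 0, precision 1) are assumptions chosen for the example.

```python
def aucpr_trapezoid(labels, scores):
    """Area under the precision-recall curve, integrated with the trapezoid rule.

    `labels` holds 1 for positive and 0 for negative examples;
    `scores` holds the predicted score for each example.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Assumed convention: the curve starts at recall 0 with precision 1.
    prev_recall, prev_precision = 0.0, 1.0
    area, true_pos, false_pos = 0.0, 0, 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all examples sharing this score so ties yield one curve point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        recall = true_pos / positives
        precision = true_pos / (true_pos + false_pos)
        # Trapezoid between the previous and the current curve point.
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


print(aucpr_trapezoid([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
```

A rectangle rule would use only one of the two precision values per step. The trapezoid averages them, which changes the result whenever precision moves between consecutive recall levels.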
Other changes
- gds.graph.drop and gds.graph.list no longer return an internal-only field about memory usage.