Release Date: 20 May 2021

GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Breaking changes

  • Degree centrality has been promoted to the product tier
    • Added procedures:
      • gds.degree.stream.estimate
      • gds.degree.write.estimate
      • gds.degree.mutate
      • gds.degree.mutate.estimate
      • gds.degree.stats
      • gds.degree.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.degree.stream
      • gds.alpha.degree.write
  • Article Rank has been promoted to the product tier
    • Added procedures:
      • gds.articleRank.stream
      • gds.articleRank.stream.estimate
      • gds.articleRank.write
      • gds.articleRank.write.estimate
      • gds.articleRank.mutate
      • gds.articleRank.mutate.estimate
      • gds.articleRank.stats
      • gds.articleRank.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.articleRank.stream
      • gds.alpha.articleRank.write
  • Eigenvector Centrality has been promoted to the product tier
    • Added procedures:
      • gds.eigenvector.stream
      • gds.eigenvector.stream.estimate
      • gds.eigenvector.write
      • gds.eigenvector.write.estimate
      • gds.eigenvector.mutate
      • gds.eigenvector.mutate.estimate
      • gds.eigenvector.stats
      • gds.eigenvector.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.eigenvector.stream
      • gds.alpha.eigenvector.write
  • AStar has been promoted to the product tier
    • Added procedures:
      • gds.astar.stream
      • gds.astar.stream.estimate
      • gds.astar.write
      • gds.astar.write.estimate
      • gds.astar.mutate
      • gds.astar.mutate.estimate
    • Removed beta procedures:
      • gds.beta.astar.stream
      • gds.beta.astar.stream.estimate
      • gds.beta.astar.write
      • gds.beta.astar.write.estimate
      • gds.beta.astar.mutate
      • gds.beta.astar.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Yens K Shortest Paths has been promoted to the product tier:
    • Added procedures:
      • gds.yens.stream
      • gds.yens.stream.estimate
      • gds.yens.write
      • gds.yens.write.estimate
      • gds.yens.mutate
      • gds.yens.mutate.estimate
    • Removed beta procedures:
      • gds.beta.yens.stream
      • gds.beta.yens.stream.estimate
      • gds.beta.yens.write
      • gds.beta.yens.write.estimate
      • gds.beta.yens.mutate
      • gds.beta.yens.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Dijkstra Source-Target has been promoted to the product tier:
    • Added procedures:
      • gds.shortestPath.dijkstra.stream
      • gds.shortestPath.dijkstra.stream.estimate
      • gds.shortestPath.dijkstra.write
      • gds.shortestPath.dijkstra.write.estimate
      • gds.shortestPath.dijkstra.mutate
      • gds.shortestPath.dijkstra.mutate.estimate
    • Removed beta procedures:
      • gds.beta.shortestPath.dijkstra.stream
      • gds.beta.shortestPath.dijkstra.stream.estimate
      • gds.beta.shortestPath.dijkstra.write
      • gds.beta.shortestPath.dijkstra.write.estimate
      • gds.beta.shortestPath.dijkstra.mutate
      • gds.beta.shortestPath.dijkstra.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Dijkstra Single-Source has been promoted to the product tier:
    • Added procedures:
      • gds.allShortestPaths.dijkstra.stream
      • gds.allShortestPaths.dijkstra.stream.estimate
      • gds.allShortestPaths.dijkstra.write
      • gds.allShortestPaths.dijkstra.write.estimate
      • gds.allShortestPaths.dijkstra.mutate
      • gds.allShortestPaths.dijkstra.mutate.estimate
    • Removed beta procedures:
      • gds.beta.allShortestPaths.dijkstra.stream
      • gds.beta.allShortestPaths.dijkstra.stream.estimate
      • gds.beta.allShortestPaths.dijkstra.write
      • gds.beta.allShortestPaths.dijkstra.write.estimate
      • gds.beta.allShortestPaths.dijkstra.mutate
      • gds.beta.allShortestPaths.dijkstra.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Node2Vec has been promoted to the beta tier
    • Added procedures:
      • gds.beta.node2vec.stream
      • gds.beta.node2vec.stream.estimate
      • gds.beta.node2vec.write
      • gds.beta.node2vec.write.estimate
      • gds.beta.node2vec.mutate
      • gds.beta.node2vec.mutate.estimate
    • Removed alpha procedures:
      • gds.alpha.node2vec.stream
      • gds.alpha.node2vec.write
    • The parameter centerSamplingFactor is renamed to positiveSamplingFactor
    • The parameter contextSamplingExponent is renamed to negativeSamplingExponent
  • The model catalog list feature no longer throws an error when a non-existent model name is given
  • Node Classification and Link Prediction:
    • maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
    • maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
    • windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
  • gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
  • Removed degreeAsProperty configuration parameter from GraphSAGE
    • The same effect can be achieved by running gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
    • Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
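
The renamed training parameters patience, minEpochs and maxEpochs read naturally as a patience-based early-stopping loop. The following is a minimal Python sketch of that general technique, not the GDS implementation: the function name epochs_run, the tolerance argument and the per-epoch loss list are hypothetical and only stand in for a real training run.

```python
def epochs_run(losses, min_epochs=1, max_epochs=100, patience=1, tolerance=1e-3):
    """Return how many epochs a patience-based training loop would run.

    `losses` is a hypothetical sequence of per-epoch losses (assumption:
    it stands in for the loss a real trainer would compute each epoch).
    """
    best = float("inf")
    unproductive = 0  # consecutive epochs without a meaningful improvement
    epochs = 0
    for loss in losses[:max_epochs]:  # never run more than max_epochs
        epochs += 1
        if best - loss > tolerance:
            best = loss
            unproductive = 0
        else:
            unproductive += 1
        # Stop early only after min_epochs, once patience is exhausted.
        if epochs >= min_epochs and unproductive >= patience:
            break
    return epochs


example_losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.3]
stopped_after = epochs_run(example_losses, patience=2)
```

With the example losses, a larger patience lets training continue through short plateaus, while minEpochs guards against stopping too early and maxEpochs caps the total work.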

    New features

    • New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
      • gds.alpha.scaleProperties.stream
      • gds.alpha.scaleProperties.mutate
    • Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure gds.beta.graph.create.subgraph
    • Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
      • gds.alpha.influenceMaximization.celf.stream
      • gds.alpha.influenceMaximization.greedy.stream
    • Link Prediction:
      • Added support for storing, loading and publishing Link Prediction models.
      • Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
      • Added write and stream modes to gds.alpha.ml.linkPrediction.predict
        • gds.alpha.ml.linkPrediction.stream
        • gds.alpha.ml.linkPrediction.write
      • Added estimate mode for Link Prediction:
        • gds.alpha.ml.linkPrediction.train.estimate
        • gds.alpha.ml.linkPrediction.stream.estimate
        • gds.alpha.ml.linkPrediction.mutate.estimate
        • gds.alpha.ml.linkPrediction.write.estimate
    • Node Classification:
      • Added support for storing, loading and publishing Node Classification models.
      • Added support for f1, precision, recall and accuracy per class for Node Classification.
        • Used with metric(class=<number>) or syntactic sugar metric(class=*).
      • Added write and stream modes to gds.alpha.ml.nodeClassification.predict
        • gds.alpha.ml.nodeClassification.stream
        • gds.alpha.ml.nodeClassification.write
      • Added estimate mode for Node Classification:
        • gds.alpha.ml.nodeClassification.train.estimate
        • gds.alpha.ml.nodeClassification.mutate.estimate
        • gds.alpha.ml.nodeClassification.stream.estimate
        • gds.alpha.ml.nodeClassification.write.estimate
    • Added configuration parameter negativeSamplingWeight to gds.alpha.ml.splitRelationships
    • Pregel API:
      • Added progress logging to Pregel. This affects HITS, SLLPA, Article Rank, Eigenvector and Page Rank.
      • Added partitioning configuration parameter to Pregel procedures. Possible options are:
        • range (default): each partition (thread) has the same number of nodes
        • degree: each partition (thread) has approximately the same number of relationships
    • Added randomSeed configuration parameter to gds.fastRP.* and gds.beta.fastRPExtended.*.
    • Node2Vec:
      • Added an optional parameter randomSeed to the Node2Vec procedures, which allows the generation of deterministic random walks.
      • Optimized implementation, decreasing execution time by up to 90%
      • Added support for weights with relationshipWeightProperty
      • Added mutate mode: gds.beta.node2vec.mutate
    • Added an optional db parameter to gds.graph.drop which allows dropping graphs from other Neo4j databases in a multi-database environment.
    • Added scaler configuration parameter to gds.pageRank which allows normalizing the computed scores.
    • Added support for orientation in gds.degree procedures.
    • Users can now set a threshold for community sizes when writing back to the database:
      • Added minCommunitySize to gds.louvain.write to only write communities that are larger than or equal to a given value.
      • Added minComponentSize to gds.wcc.write to only write components that are larger than or equal to a given value.
    • RBAC Support for GDS Enterprise: Users that have been granted the admin role are allowed to operate on other users’ graphs. Those operations are:
      • View other users’ graphs in gds.graph.list
      • Use other users’ graphs in most algorithm procedures. Some alpha tier procedures and gds.graph.export do not support this feature
      • Drop other users’ graphs via gds.graph.drop
      • Naming conflicts (between users) can be resolved by using a username parameter in the configuration of the procedure
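
The difference between the two Pregel partitioning options listed above (range and degree) can be illustrated with a small Python sketch. This is an assumption-laden illustration of the idea, not the partitioner used by GDS: the function names, the (start, end) range representation and the example degree list are all hypothetical.

```python
def range_partitions(degrees, concurrency):
    """Contiguous node ranges with (nearly) the same number of nodes each."""
    n = len(degrees)
    if n == 0:
        return []
    size = -(-n // concurrency)  # ceiling division
    return [(start, min(start + size, n)) for start in range(0, n, size)]


def degree_partitions(degrees, concurrency):
    """Contiguous node ranges with roughly the same number of relationships each."""
    n = len(degrees)
    target = max(1, -(-sum(degrees) // concurrency))  # relationships per partition
    partitions, start, acc = [], 0, 0
    for node, degree in enumerate(degrees):
        acc += degree
        if acc >= target:  # close the partition once it holds enough relationships
            partitions.append((start, node + 1))
            start, acc = node + 1, 0
    if start < n:  # leftover nodes that did not fill a partition
        if len(partitions) >= concurrency:
            partitions[-1] = (partitions[-1][0], n)
        else:
            partitions.append((start, n))
    return partitions


def relationship_counts(degrees, partitions):
    """Number of relationships handled by each partition."""
    return [sum(degrees[a:b]) for a, b in partitions]


# Hypothetical skewed graph: node 0 is a hub with 10 relationships.
degrees = [10, 1, 1, 1, 1, 1, 1, 4]
by_range = range_partitions(degrees, 2)
by_degree = degree_partitions(degrees, 2)
```

On this skewed example, range gives two partitions of four nodes each but with 13 and 7 relationships, whereas degree isolates the hub so that both partitions hold 10 relationships.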

    Bug fixes

    • Fixed memory estimation for gds.graph.create.estimate on GDS Enterprise Edition, now estimates take into account enterprise compression format.
    • Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error for a graph with duplicate property keys. Such graphs now fail early instead.
    • Fixed potential ArrayIndexOutOfBoundsException when running gds.triangleCount on a relationship-filtered graph.
    • Fixed a bug that can lead to inconsistencies when writing or mutating new relationships created from a label-filtered graph.
    • Support concurrency in gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train.
    • Fixed a bug where Alpha similarity algorithms in some cases could fail on division by 0 when writing results back.
    • Fixed an issue where gds.graph.drop could take a long time when the graph contained node embeddings.
    • Fixed a bug where array properties could not be loaded via Cypher loading.
    • Fixed an issue where gds.fastRP.* and gds.beta.fastRPExtended.* were not failing on missing relationship properties when executing with weighted relationships.
    • Fixed a bug where gds.beta.graphSage.train was failing in the presence of array properties.
    • Fixed a bug in Node2Vec where the learning rate was scaled incorrectly.
    • Fixed a bug where the number of deleted node properties returned by gds.graph.removeNodeProperties would be wrong.
    • Fixed a bug where gds.alpha.scc would sometimes fail with an ArrayIndexOutOfBoundsException.
    • Fixed a security bug that allowed restricted users to run procedures that write back to the database.

    Improvements

    • More algorithm logs are visible from the progress log procedure.
    • Mention database name in error message if graph is not found.
    • Removed a “disabled” log message from the database startup when GDS was running in its default configuration. It is replaced with a more elaborate “enabled” message when the progress tracking feature is enabled.
    • Use trapezoid rule in AUCPR metric for gds.alpha.ml.linkPrediction.train.
    • In-memory graphs are now removed from the catalog when the associated database is shut down or dropped.
    • Improved runtime and required memory of gds.pageRank, especially at high concurrency.
    • Improved the runtime and required memory of Node2Vec.
    • Community users can now store up to three models in the model catalog.
    • The sourceNode and targetNode configuration parameters in gds.astar, gds.allShortestPaths.dijkstra, gds.shortestPath.dijkstra and gds.yens now also accept nodes as an alternative to node ids.
    • Added metrics to the modelInfo result column of gds.beta.graphSage.train.
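
For the AUCPR change listed above, the trapezoid rule can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the code used by gds.alpha.ml.linkPrediction.train: the function name is hypothetical, the curve is assumed to start at (recall 0, precision 1), and tied scores are treated as separate thresholds.

```python
def aucpr_trapezoid(labels, scores):
    """Area under the precision-recall curve via the trapezoid rule.

    labels: 1 for a positive example, 0 for a negative one.
    scores: predicted scores, higher means more likely positive.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive example is required")
    # Walk through the examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_recall, prev_precision = 0.0, 1.0  # assumed starting point of the curve
    area = 0.0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        recall = tp / positives
        precision = tp / (tp + fp)
        # Trapezoid between the previous and the current curve point.
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


perfect = aucpr_trapezoid([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
mixed = aucpr_trapezoid([1, 0, 1], [0.9, 0.8, 0.7])
```

Compared with a step-wise sum, the trapezoid rule averages the precision at both ends of each recall interval, so a drop in precision between two points is not charged entirely to one of them.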

    Other changes

    • gds.graph.drop and gds.graph.list no longer return an internal-only field about memory usage.