Release Date: 9 February, 2021

GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Breaking changes

  • Promoted several shortest path algorithms to the beta tier: Dijkstra, A*, and Yen's k-shortest paths. The APIs have been standardized, and all include the ability to return source/target nodes, nodes traversed, and paths.
    • This adds procedures
      • gds.beta.shortestPath.dijkstra.mutate
      • gds.beta.shortestPath.dijkstra.mutate.estimate
      • gds.beta.shortestPath.dijkstra.stream
      • gds.beta.shortestPath.dijkstra.stream.estimate
      • gds.beta.shortestPath.dijkstra.write
      • gds.beta.shortestPath.dijkstra.write.estimate
      • gds.beta.shortestPath.astar.mutate
      • gds.beta.shortestPath.astar.mutate.estimate
      • gds.beta.shortestPath.astar.stream
      • gds.beta.shortestPath.astar.stream.estimate
      • gds.beta.shortestPath.astar.write
      • gds.beta.shortestPath.astar.write.estimate
      • gds.beta.shortestPath.yens.mutate
      • gds.beta.shortestPath.yens.mutate.estimate
      • gds.beta.shortestPath.yens.stream
      • gds.beta.shortestPath.yens.stream.estimate
      • gds.beta.shortestPath.yens.write
      • gds.beta.shortestPath.yens.write.estimate
      • gds.beta.allShortestPaths.dijkstra.mutate
      • gds.beta.allShortestPaths.dijkstra.mutate.estimate
      • gds.beta.allShortestPaths.dijkstra.stream
      • gds.beta.allShortestPaths.dijkstra.stream.estimate
      • gds.beta.allShortestPaths.dijkstra.write
      • gds.beta.allShortestPaths.dijkstra.write.estimate
    • And removes alpha procedures
      • gds.alpha.shortestPath.stream
      • gds.alpha.shortestPath.write
      • gds.alpha.shortestPath.astar.stream
      • gds.alpha.kShortestPaths.stream
      • gds.alpha.kShortestPaths.write
      • gds.alpha.shortestPaths.stream
      • gds.alpha.shortestPaths.write
  • GDS will now throw an error when a user tries to use a mutate procedure on graphs not stored in the graph catalog (anonymous graphs)
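
To illustrate the kind of result the standardized shortest path procedures describe (source/target nodes, nodes traversed, and costs), here is a minimal, self-contained Dijkstra sketch in Python. It is not the GDS implementation and does not call GDS; the function name `dijkstra_path`, the adjacency-dict input, and the result keys are assumptions chosen for illustration only.

```python
import heapq


def dijkstra_path(graph, source, target):
    """Return the cheapest path from source to target, or None if unreachable.

    graph maps a node to a list of (neighbor, weight) pairs; weights must be
    non-negative. The result keys are illustrative, not GDS output columns.
    """
    dist = {source: 0.0}   # best known cost per node
    prev = {}              # predecessor on the best known path
    heap = [(0.0, source)]
    visited = set()

    while heap:
        cost, node = heapq.heappop(heap)
        if node in visited:
            continue       # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))

    if target not in visited:
        return None

    # Walk predecessors back from the target to rebuild the traversed nodes.
    node_ids = [target]
    while node_ids[-1] != source:
        node_ids.append(prev[node_ids[-1]])
    node_ids.reverse()

    return {
        "source": source,
        "target": target,
        "total_cost": dist[target],
        "node_ids": node_ids,
        "costs": [dist[n] for n in node_ids],  # accumulated cost at each node
    }


# Hypothetical example graph: 0 -> 2 -> 1 -> 3 is cheaper than 0 -> 1 -> 3.
graph = {0: [(1, 4.0), (2, 1.0)], 2: [(1, 2.0)], 1: [(3, 5.0)]}
result = dijkstra_path(graph, 0, 3)
```

The single result record bundles the endpoints, the traversed node sequence, and the accumulated costs, which is the shape of information the notes above say all three algorithms can now return.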

New Features

  • Introduced machine learning based multi-class node classification procedures:
    • Added gds.alpha.ml.nodeClassification.train to train a model to predict a node label
    • Added gds.alpha.ml.nodeClassification.predict.mutate to make predictions using a trained model
  • Introduced machine learning based link prediction procedures:
    • Added gds.alpha.linkPrediction.train procedure for training Link Prediction models.
    • Added gds.alpha.linkPrediction.predict.mutate procedure for predicting relationships based on a trained Link Prediction model.
  • Added support for list properties as features for
    • gds.alpha.nodeClassification
    • gds.beta.fastRPExtended
    • gds.beta.graphSage
  • Added support for storing trained models on disk (Enterprise only)
    • gds.alpha.model.store
    • gds.alpha.model.load
    • gds.alpha.model.delete
  • Added procedure for publishing trained models (Enterprise only)
    • gds.alpha.model.publish
  • Added HITS algorithm to the alpha tier
    • gds.alpha.hits.mutate and gds.alpha.hits.mutate.estimate
    • gds.alpha.hits.stats and gds.alpha.hits.stats.estimate
    • gds.alpha.hits.stream and gds.alpha.hits.stream.estimate
    • gds.alpha.hits.write and gds.alpha.hits.write.estimate
  • Added Speaker-Listener Label Propagation Algorithm (SLLPA) to the alpha tier
    • gds.alpha.sllpa.mutate and gds.alpha.sllpa.mutate.estimate
    • gds.alpha.sllpa.stats and gds.alpha.sllpa.stats.estimate
    • gds.alpha.sllpa.stream and gds.alpha.sllpa.stream.estimate
    • gds.alpha.sllpa.write and gds.alpha.sllpa.write.estimate
  • Added CSV export capabilities with the gds.beta.graph.export.csv procedure to allow users to export their in-memory graph to CSV
  • Added message reducer capability to the Pregel framework to reduce memory consumption and improve computation runtime.
  • Added a progress logging procedure, gds.beta.listProgress, which returns the status of running algorithms. This is turned off by default, but can be enabled with gds.progress_tracking_enabled in the config.
  • Added a new BitIdMap data structure to represent node id mappings (Enterprise only)
    • The data structure can lead to a significant reduction in required heap space for an in-memory graph.
    • The data structure is used for native graph projections and in some algorithms, e.g., Louvain.
    • The data structure is not used in Cypher projections.
    • The feature is enabled by default on GDS Enterprise Edition and can be disabled using the USE_BIT_ID_MAP feature toggle.
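
As a rough illustration of why the message reducer mentioned above can save memory in a Pregel-style computation, the sketch below contrasts a mailbox that queues every message with one that folds each incoming message into a single value per node. This is a conceptual Python sketch, not the GDS Pregel Java API; the class names `QueueMailbox` and `ReducingMailbox` and their methods are hypothetical.

```python
import math


class QueueMailbox:
    """Keeps every message sent to a node until it is read."""

    def __init__(self, node_count):
        self.inbox = [[] for _ in range(node_count)]

    def send(self, target, value):
        self.inbox[target].append(value)  # memory grows with message count

    def receive_min(self, node):
        # The receiver has to iterate over all queued messages.
        return min(self.inbox[node], default=math.inf)


class ReducingMailbox:
    """Keeps one slot per node and combines messages as they arrive."""

    def __init__(self, node_count, reducer, identity):
        self.reducer = reducer
        self.slots = [identity] * node_count  # fixed size: one value per node

    def send(self, target, value):
        self.slots[target] = self.reducer(self.slots[target], value)

    def receive(self, node):
        return self.slots[node]


queued = QueueMailbox(node_count=3)
reduced = ReducingMailbox(node_count=3, reducer=min, identity=math.inf)

# Three neighbors send a candidate distance to node 1.
for message in (5.0, 2.0, 7.0):
    queued.send(1, message)
    reduced.send(1, message)
```

Both mailboxes yield the same minimum for node 1, but the reducing variant stores one value per node regardless of how many messages are sent. This only works when the computation needs an aggregate (such as a minimum or sum) rather than the individual messages.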

Bug fixes

  • Adding projection parameters as additional configuration in gds.graph.create and gds.graph.create.cypher will throw an exception if improperly configured, instead of being silently ignored.
  • Fixed a bug in gds.alpha.articleRank where centrality scores were not normalized correctly
  • Fixed a bug in path stream procedures where the path object (path: true) used incorrect node identifiers.
  • Fixed a bug in path write procedures where the relationship property nodeIds contained incorrect node identifiers.
  • Fixed a race condition that could cause exceptions thrown by scheduled tasks to be suppressed.

Improvements

  • Improved progress logging to write progress per individual node label in gds.graph.writeNodeProperties.
  • When a named graph does not exist, the graph catalog will display similarly named stored graphs.
  • When a saved model does not exist, the model catalog will display similarly named stored models.
  • Added centralityDistribution to the return fields for the write mode of the alpha centrality algorithms.
  • gds.beta.graph.generate using relationshipDistribution: 'POWER_LAW' applies the distribution to the native orientation.
  • Added centralityDistribution as a return field in gds.betweenness.[write/mutate/stats]
  • Added getNeighbours and isMultiGraph to the Pregel-API.
  • Added new message queue implementations for the Pregel framework, which
    • replace the previously used JCTools queue and work with primitive double arrays instead of boxed values.
    • lead to 3x to 5x faster runtimes for Pregel based algorithms.
    • reduce GC pressure due to fewer object allocations, which leads to more predictable runtimes.
    • support synchronous and asynchronous Pregel computations.

Other Changes

  • The PageRank configuration parameter cacheWeights has been deprecated. The parameter had no effect.
  • Deprecated the minimumScore, maximumScore, and scoreSum return fields in gds.betweenness.[write/mutate/stats]