Release Date: 23 September 2021

GDS 1.7.0 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5-compatible release, please see GDS 1.1.6. For a 4.0-compatible release, please see GDS 1.6.5.

Breaking changes

  • This release does not support Neo4j 4.0.x
  • Aligned the names of the modelInfo entries returned by gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train with the model catalog. The entries are now called modelName and modelInfo instead of name and info.
  • Removed the sharedUpdater parameter from gds.alpha.ml.linkPrediction and gds.alpha.ml.nodeClassification.
  • gds.beta.graph.export.csv now exports into a subdirectory called export. Previously, the exported graphs were written directly into the configured directory.
  • Renamed all graphalgo packages to gds
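
Client code that reads the training result may need to tolerate both naming schemes while servers are being upgraded. The following is a minimal sketch, assuming the returned map has already been fetched into a plain Python dict; read_model_entries is a hypothetical helper and not part of GDS or any Neo4j driver.

```python
def read_model_entries(model_info: dict) -> tuple:
    """Return (model name, model info) from a train result map.

    Hypothetical helper: prefers the entry names used since GDS 1.7.0
    (modelName, modelInfo) and falls back to the earlier names (name, info).
    """
    if "modelName" in model_info:
        return model_info["modelName"], model_info.get("modelInfo")
    # Layout returned before GDS 1.7.0.
    return model_info["name"], model_info.get("info")


# The same call works for maps in either layout.
new_style = {"modelName": "lp-model", "modelInfo": {"classes": [0, 1]}}
old_style = {"name": "lp-model", "info": {"classes": [0, 1]}}
assert read_model_entries(new_style) == read_model_entries(old_style)
```

Once every server runs 1.7.0 or later, the fallback branch can be dropped.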

New features

  • New Algorithm: Approximate Maximum K-Cut
    • Includes procedures: gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate].
  • Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
    • Includes procedures: gds.alpha.ml.pipeline.linkPrediction.[create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate].
  • Introduced support for exporting additional node properties, including strings, from the underlying database.
    • Added additionalNodeProperties parameter to gds.graph.export
    • Added additionalNodeProperties parameter to gds.graph.export.csv
  • Introduced experimental support for querying the in-memory graph with Cypher
    • Added gds.alpha.create.cypherdb to allow Neo4j to recognize the in-memory graph as a database for Cypher queries
  • To make it easier to handle multiple concurrent users, we’ve added a system monitoring procedure, gds.alpha.systemMonitor, which provides an overview of the system’s workload and available resources.
  • Progress logging is now turned on by default and no longer requires changing your configuration settings. Progress can be accessed with gds.beta.listProgress.
  • GraphSAGE now supports deterministic results with the randomSeed configuration parameter to gds.beta.graphSage.train.
  • Improved the performance (up to 20x speedup) of Weakly Connected Components, gds.wcc, on undirected graphs by applying a subgraph sampling optimization.
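
For readers new to Maximum K-Cut: the goal is to assign nodes to k communities so that the total weight of relationships running between different communities is as large as possible. The sketch below only evaluates that objective for a given assignment. It is not the approximation algorithm used by gds.alpha.maxkcut, and the function name, node ids, and weights are made up for illustration.

```python
def cut_weight(relationships, community):
    """Sum the weights of relationships whose endpoints are in different communities."""
    return sum(
        weight
        for source, target, weight in relationships
        if community[source] != community[target]
    )


# Illustrative weighted relationships: (source, target, weight).
relationships = [
    ("a", "b", 1.0),
    ("b", "c", 2.0),
    ("c", "a", 3.0),
    ("c", "d", 1.0),
]

# Two candidate assignments of nodes to k = 2 communities.
first = {"a": 0, "b": 1, "c": 1, "d": 0}
second = {"a": 0, "b": 0, "c": 1, "d": 0}

print(cut_weight(relationships, first))   # 5.0
print(cut_weight(relationships, second))  # 6.0
```

The second assignment cuts more weight, so it is the better 2-cut of the two candidates.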

Bug fixes

  • Fixed a bug regarding weighted graphs with multiple relationship types, which affected gds.beta.graphSage and gds.alpha.spanningTree.
  • Supervised Machine Learning (Node Classification & Link Prediction):
    • Fixed a NaN issue in NodeClassification where computations with very small probability values could cause the result to flip to infinity.
    • Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
    • Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.
  • Progress Logging:
    • Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
    • Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
  • Node Similarity & KNN:
    • Fixed a bug where KNN and NodeSimilarity could write out of bounds on a node-filtered graph with multiple relationship types.
    • Fixed a bug which affected gds.nodeSimilarity.write and gds.alpha.knn.write when executed in combination with a nodeLabels filter. The bug led either to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
    • Fixed a bug where gds.nodeSimilarity.[write|mutate] and gds.beta.knn.[write|mutate] wrote duplicate relationships if the input graph was undirected.
  • KNN:
    • Fixed a bug in gds.beta.knn where negative values in float-array node properties caused a failure when returning the similarityDistribution.
  • FastRP:
    • FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
  • GraphSAGE:
    • Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
    • Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
  • Graph Operations:
    • Fixed a bug in gds.graph.removeNodeProperties where removedPropertiesWritten was too large for properties shared across multiple labels.
    • Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
    • Fixed a bug in gds.create.subgraph which could lead to undefined behaviour or an ArrayIndexOutOfBoundsException when executed on GDS Enterprise Edition.
    • Fixed a bug in gds.graph.create, where default values for array properties would throw an exception for convertible types.
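
Results written by earlier versions on undirected graphs may still contain the duplicate similarity relationships mentioned under Node Similarity & KNN. A minimal sketch of removing them in post-processing, assuming the relationships have been exported as (source, target, score) tuples and that a pair counts as a duplicate regardless of direction; unique_undirected is a hypothetical helper and does not reflect how GDS itself fixed the bug.

```python
def unique_undirected(pairs):
    """Keep the first occurrence of each unordered node pair."""
    seen = set()
    result = []
    for source, target, score in pairs:
        # Order the endpoints so that (1, 2) and (2, 1) share one key.
        key = (min(source, target), max(source, target))
        if key not in seen:
            seen.add(key)
            result.append((key[0], key[1], score))
    return result


rows = [(1, 2, 0.5), (2, 1, 0.5), (2, 3, 0.9)]
print(unique_undirected(rows))  # [(1, 2, 0.5), (2, 3, 0.9)]
```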

Improvements

  • Pathfinding: Added existence checks for sourceNode and targetNode to all shortest path procedures in the product tier.
  • Improved runtime of gds.fastRP via better workload balancing between threads.
  • Lower memory footprint for LinkPrediction and NodeClassification.
  • Improved the procedure output of gds.beta.listProgress.
  • Scale down scores computed by gds.articleRank.