Changelog

Neo4j Graph Analytics for Snowflake is in Public Preview and is not intended for production use.

This page contains a raw changelog of Neo4j Graph Analytics for Snowflake.

0.3.14

Added

  • Added admin.show_jobs procedure to list all finished jobs in the system.

  • Added TriangleCounting algorithm and procedure graph.triangle_count.

Changed

  • admin.get_max_nodes replaces internal.get_max_nodes

  • admin.set_max_nodes replaces internal.set_max_nodes

  • admin.get_min_nodes replaces internal.get_min_nodes

  • admin.set_min_nodes replaces internal.set_min_nodes

  • graph.job_log replaces internal.job_service_log
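
For scripts that still call the old names, the renames listed above can be captured in a small lookup table. This is only a sketch: the helper `migrate_call` and its plain substring replacement are illustrative assumptions, not something the application provides.

```python
# Old-to-new procedure names, copied from the 0.3.14 "Changed" list.
RENAMED_PROCEDURES = {
    "internal.get_max_nodes": "admin.get_max_nodes",
    "internal.set_max_nodes": "admin.set_max_nodes",
    "internal.get_min_nodes": "admin.get_min_nodes",
    "internal.set_min_nodes": "admin.set_min_nodes",
    "internal.job_service_log": "graph.job_log",
}


def migrate_call(statement: str) -> str:
    """Replace old procedure names in a statement with their new names.

    Hypothetical helper: it performs plain substring replacement and does
    not parse SQL, so the result should be reviewed before it is run.
    """
    for old, new in RENAMED_PROCEDURES.items():
        statement = statement.replace(old, new)
    return statement
```

Statements that mention none of the old names pass through unchanged.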

Fixed

  • Diagnostic information that was lost with the shift to running transient job services is restored temporarily by changing the log level to DEBUG.

0.3.13

Added

Changed

Fixed

  • Fixed a problem in Dijkstra and PageRank where the result configuration entry could show internal node IDs.

  • Worked around a limitation in SPCS event sharing.

0.3.12

Added

  • Procedures internal.get_min_nodes, internal.get_max_nodes, internal.set_min_nodes, and internal.set_max_nodes to manage the number of nodes in compute pools.

  • Log endpoint internal.job_service_log includes a stack trace when Python-based algorithms fail.

Changed

  • Aligned the API syntax of the GraphSAGE and FastPath algorithms, such as top-level keys and camel-cased parameters, with that of all other algorithms.

Fixed

  • Fixed a bug in graph.gs_nc_train, graph.gs_nc_predict, graph.gs_unsup_train, and graph.gs_unsup_predict where GPUs were not utilized.

0.3.11

Added

Changed

Fixed

0.3.10

Added

  • Added procedures: graph.betweenness, graph.dijkstra, graph.dijkstra_single_source, graph.drop_model, graph.fastpath, graph.fast_rp, graph.graph, graph.gs_nc_predict, graph.gs_nc_train, graph.gs_unsup_predict, graph.gs_unsup_train, graph.knn, graph.louvain, graph.model_exists, graph.node_similarity, graph.page_rank, graph.show_available_compute_pools, graph.show_models, graph.wcc.

Changed

Fixed

0.3.9

Added

Changed

Fixed

  • Restore broken data on available compute pools.

0.3.8

Added

  • Support for GPU compute pool GPU_NV_XS, available in most Azure regions.

  • Procedures gml.show_available_compute_pools and gds.show_available_compute_pools. These replace the gml.list_available_compute_pools and gds.list_available_compute_pools procedures, which will be removed in a future release.

Changed

Fixed

  • Compute pool and warehouse creation no longer fails when a compute pool instance family is unavailable in a particular region.

0.3.7

Added

  • Support for defaultTablePrefix in gds.graph_project, enabling a common prefix for all tables in the projection.

  • Grant OPERATE on application-managed compute pools to the APP_ADMIN role.

Changed

  • Replaced the map with a list of tables or views in nodeTables within gds.graph_project. The corresponding label is now inferred from the table name. This is a breaking change.

  • Removed type parameter from list entries of relationshipTables in gds.graph_project. The relationship type is now inferred from the table name.
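
To make the list-based nodeTables and the defaultTablePrefix setting more concrete, here is a rough sketch of how labels and tables might relate. The keys `defaultTablePrefix` and `nodeTables` come from the entries above. Everything else is an assumption made for illustration: the helper `resolve_node_tables`, the database, schema and table names, the idea that the label equals the table name as written, and the idea that the prefix is prepended verbatim.

```python
def resolve_node_tables(node_tables, default_table_prefix=""):
    """Map each inferred label to the table it would be read from.

    Assumptions for illustration only: the label is the table name exactly
    as written in nodeTables, and the prefix is prepended without changes.
    """
    return {name: f"{default_table_prefix}{name}" for name in node_tables}


# Hypothetical projection configuration fragment.
config = {
    "defaultTablePrefix": "MY_DB.MY_SCHEMA.",
    "nodeTables": ["PERSONS", "ACCOUNTS"],
}

labels_to_tables = resolve_node_tables(
    config["nodeTables"], config["defaultTablePrefix"]
)
```

The product documentation for gds.graph_project remains the reference for the exact inference rules.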

Fixed

  • Fixed a bug where write_relationships could write wrong node IDs if multiple node tables were involved in the projection.

0.3.6

Added

Changed

Fixed

0.3.5

Added

  • Support for projecting heterogeneous graphs from multiple node and relationship tables.

    • This is a breaking change, as the syntax changed for gds.graph_project, gds.write_nodeproperties, gds.write_relationships, and algorithm configurations that include node references (e.g. path algorithms).

  • Support for table-unique, non-integer node identifiers in input tables.

    • We now support VARCHAR and BIGINT node identifiers.

    • Node identifiers only need to be unique within the table they are projected from.
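
The practical effect of table-unique identifiers is that the same value may appear in two different node tables without clashing. One way to picture this is to key each node by the pair of table name and identifier. The sketch below shows that idea only; it is not how the application maps identifiers internally, and the function and table names are hypothetical.

```python
def assign_internal_ids(tables):
    """Assign a dense integer to every (table, identifier) pair.

    Identifiers may be strings (VARCHAR) or integers (BIGINT). They must be
    unique within one table but may repeat across tables.
    """
    internal_ids = {}
    for table_name, identifiers in tables.items():
        for identifier in identifiers:
            key = (table_name, identifier)
            if key in internal_ids:
                raise ValueError(
                    f"duplicate identifier {identifier!r} in table {table_name!r}"
                )
            internal_ids[key] = len(internal_ids)
    return internal_ids


# "a1" appears in both hypothetical tables and still yields two distinct nodes.
tables = {"PERSONS": ["a1", "b2"], "ACCOUNTS": ["a1", 42]}
ids = assign_internal_ids(tables)
```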

Changed

Fixed

0.3.4

Added

  • Procedure gds.list_available_compute_pools to list compute pools available for use with GDS Sessions.

  • Procedure gml.list_available_compute_pools to list compute pools available for use with GML Sessions.

  • New machine learning algorithm FastPath gml.fastpath for computing path embeddings.

  • Added endpoints for managing models:

    • Check existence for a model: gml.model_exists

    • List models: gml.model_list

    • Drop a model: gml.model_drop

Changed

  • If an invalid compute pool selector is used, raise an exception with clear messaging and a list of valid compute pool selectors.

  • Telemetry event sharing changes.

    • Errors and warnings ⇒ Mandatory

    • Traces ⇒ Mandatory

    • Usage logs ⇒ Mandatory

    • Debug logs ⇒ Optional

    • Metrics ⇒ Optional

Fixed

0.3.3

Added

Changed

Fixed

  • Declare GPU compute pool usage up front in the application manifest. A recent change in Snowflake requires this; otherwise compute pool creation fails.

0.3.2

Added

Changed

  • Slim down return values from GraphSAGE endpoints

  • Improved logging for GraphSAGE

  • Fail early in any gml training algorithm (currently GraphSAGE) if model name already exists

  • Add failure reason to log table in case of failure for gml training and prediction algorithms

Fixed

  • Fixed bug leading to progress of more than 100% being logged for GraphSAGE.

0.3.1

Added

Changed

Fixed

  • Fixed an issue where GraphSAGE could run out of shared memory.

  • Removed target_label from config for gml.gs_nc_predict because it was unused.

0.3.0

Added

  • graph_project now supports projecting node identifier columns as BIGINT or VARCHAR.

    • This allows for more flexible node identifier columns, e.g., when using UUIDs.

    • For BIGINT there will be a ~2x regression in projection runtime, which will be addressed in an upcoming release.

  • Graph machine learning runtime.

    • gml.create_session

    • gml.stop_session

    • gml.list

  • Supervised GraphSAGE

    • gml.gs_nc_train

    • gml.gs_nc_predict

  • Unsupervised GraphSAGE

    • gml.gs_unsup_train

    • gml.gs_unsup_predict

  • Support for GPU compute pool GPU_NV_S.

Changed

Fixed

0.2.19

Added

  • graph_list shows heap memory usage of the in-memory-graph.

  • Add support for compute pool type HIGHMEM_X64_L.

Changed

  • Projecting from an empty node table is no longer allowed and will return an error.

Fixed

  • Invalid function parameters now fail with a better error message and are not server errors anymore.

    • This fixes long-running queries that would eventually fail with a server error.

0.2.18

Added

  • Added support for gds.drop_nodeproperties to drop node properties from a graph.

Changed

  • Improved service logging.

    • Separated logging for server layer (snowgraph) and application layer (gds).

    • Added more detailed logging for endpoint execution.

    • Allow setting log level via internal.set_log_level(logger, level) function.

Fixed

0.2.17

Added

Changed

Fixed

  • Fixed a bug where graph drop might stall for a long time trying to drop a graph that doesn’t exist.

  • Disabled fail-early on write back when missing privilege to create table, because privilege check was flaky.

0.2.16

Added

  • Added support for the HITS algorithm via the command gds.hits.

  • Added support for gds.graph_filter to filter subgraphs based on node and relationship properties.

Changed

  • Concurrency now defaults to number of cores. Affects 'concurrency', 'readConcurrency' and 'writeConcurrency'.
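
The defaulting rule can be pictured as filling in any missing concurrency key with the core count. This is a minimal sketch under assumptions: the helper `with_default_concurrency` is hypothetical, and `os.cpu_count()` stands in for however the service determines the number of cores.

```python
import os

# The three configuration keys named in the changelog entry.
CONCURRENCY_KEYS = ("concurrency", "readConcurrency", "writeConcurrency")


def with_default_concurrency(config, cores=None):
    """Return a copy of config where unset concurrency keys use the core count."""
    cores = cores or os.cpu_count() or 1
    resolved = dict(config)
    for key in CONCURRENCY_KEYS:
        # Explicitly configured values are left untouched.
        resolved.setdefault(key, cores)
    return resolved
```

For example, `with_default_concurrency({"concurrency": 2}, cores=8)` keeps `concurrency` at 2 and sets the other two keys to 8.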

Fixed

0.2.15

Added

Changed

Fixed

0.2.14

Added

Changed

Fixed

0.2.13

Added

  • Added support for the Speaker-Listener Label Propagation algorithm via the command gds.sllpa.

Changed

  • The application creates five compute pools of its own, from which the consumer selects one to run on.

  • The application creates its own query warehouse, which the consumer configures according to their requirements.

  • The application requires the CREATE COMPUTE POOL and CREATE WAREHOUSE privileges to be granted.

Fixed

  • Various documentation fixes.

0.2.12

Added

Changed

  • gds.indirect_exposure now computes exposure, hop, parent and root for each node.

    • This can be defined in the configuration using 'mutateProperties': { 'exposure': '<key>', 'hop': '<key>', 'parent': '<key>', 'root': '<key>' }.

    • The algorithm currently supports only max aggregation; the exposureReducer config has been removed.
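
As a sketch of the mutateProperties shape described above, the configuration fragment could be assembled as follows. The four keys come from the changelog entry. The helper name and the property names passed to it are placeholders chosen for illustration, and any other settings gds.indirect_exposure may require are omitted.

```python
import json


def indirect_exposure_config(exposure, hop, parent, root):
    """Build the mutateProperties fragment with one target key per output."""
    return {
        "mutateProperties": {
            "exposure": exposure,
            "hop": hop,
            "parent": parent,
            "root": root,
        }
    }


# Placeholder property names; substitute the keys to be written to the graph.
config = indirect_exposure_config("exposure", "hop", "parent", "root")
print(json.dumps(config, indent=2))
```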

Fixed

0.2.11

Added

Changed

Fixed

0.2.10

Added

  • gds.indirect_exposure allows specifying an exposureReducer function to aggregate the exposure of multiple neighbors.

    • The default exposureReducer function is SUM; possible values are SUM and MAX.

Changed

Fixed

0.2.9

Added

  • Added gds.indirect_exposure algorithm for risk analysis.

  • Post upgrade, calling gds.create_session will explicitly drop and re-create the service.

Changed

Fixed

0.2.8

Added

  • Added support for node id ranges that use the full BIGINT range.

Fixed

  • Fixed sizing of JVM heap memory for GDS service.

0.2.7

Added

  • GDS gets the calling Snowflake user’s username

    • to project, list and drop graphs per user

    • to run algorithms on the user’s own graphs

  • GDS gets the calling Snowflake user’s current role

    • to set admin privileges if the current role has the application role APP_ADMIN

  • Support semi-structured ARRAY type for node property projections. Element types can be BIGINT or DOUBLE.

  • gds.write_nodeproperties_to_table and gds.write_relationships_to_table

    • Both functions upload data to an app-internal stage and then copy the data into the specified consumer table.

  • gds.write_nodeproperties_to_stage and gds.write_relationships_to_stage

    • Both functions upload data to a consumer-defined stage for further processing.

  • gds.write_nodeproperties_to_table supports writing semi-structured ARRAY type

    • Element types can be BIGINT or DOUBLE

  • gds.graph_project supports setting an orientation for relationships

    • possible values are NATURAL (default), UNDIRECTED and REVERSED
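
The three orientation values can be illustrated on a plain list of (source, target) pairs. The sketch assumes the usual graph meaning of the terms: NATURAL keeps each relationship as stored, REVERSED swaps source and target, and UNDIRECTED makes each relationship traversable both ways. The function `apply_orientation` is hypothetical and is not part of gds.graph_project.

```python
def apply_orientation(edges, orientation="NATURAL"):
    """Return the (source, target) pairs as seen under the given orientation."""
    if orientation == "NATURAL":
        # Keep every relationship as it is stored in the table.
        return list(edges)
    if orientation == "REVERSED":
        # Swap source and target of every relationship.
        return [(target, source) for source, target in edges]
    if orientation == "UNDIRECTED":
        # Each relationship can be traversed in both directions.
        return [
            pair
            for source, target in edges
            for pair in ((source, target), (target, source))
        ]
    raise ValueError(f"unknown orientation: {orientation!r}")


edges = [(1, 2), (2, 3)]
reversed_edges = apply_orientation(edges, "REVERSED")
undirected_edges = apply_orientation(edges, "UNDIRECTED")
```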

Changed

  • Renamed to "Neo4j Graph Data Science" (and long form "Neo4j Graph Data Science <version>" in text).

  • write_nodeproperties and write_relationships parameter outputTable changed to table

  • write_nodeproperties and write_relationships are now aliases

    • write_nodeproperties is an alias for write_nodeproperties_to_table

    • write_relationships is an alias for write_relationships_to_table

  • Automatic eviction of GDS operation results (graph project, algorithms):

    • Results can be accessed via the gds.result_list and gds.result functions.

    • When an operation finishes, the result is kept for 2 more hours before it gets evicted.
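
The retention rule amounts to a time-to-live of two hours counted from the moment an operation finishes. The following sketch shows that behaviour with an in-memory store. The class `ResultStore` and its methods are hypothetical and do not reflect how the service is implemented.

```python
import time

# Results stay available for two hours after the operation finishes.
RETENTION_SECONDS = 2 * 60 * 60


class ResultStore:
    """Keeps finished results and evicts them once the retention time has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._results = {}  # job id -> (finish time, result)

    def put(self, job_id, result):
        """Record a result at the moment its operation finishes."""
        self._results[job_id] = (self._clock(), result)

    def evict_expired(self):
        """Drop every result older than the retention time; return their ids."""
        now = self._clock()
        expired = [
            job_id
            for job_id, (finished_at, _) in self._results.items()
            if now - finished_at >= RETENTION_SECONDS
        ]
        for job_id in expired:
            del self._results[job_id]
        return expired

    def get(self, job_id):
        """Return the result, or None if it is unknown or already evicted."""
        self.evict_expired()
        entry = self._results.get(job_id)
        return None if entry is None else entry[1]
```

Passing the clock in as a parameter makes the eviction behaviour easy to check without waiting two hours.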

0.2.6

Changed

  • Use snowpark-sdk for schema operations.

0.2.5

Fixed

  • Made sure that relationship property shows up in in-memory graph.

  • write_relationships now correctly writes relationships to the table.

0.2.4

Changed

  • graph_project, write_nodeproperties, and write_relationships use snowflake-jdbc driver instead of snowpark-sdk.