GraphDataScience

class graphdatascience.GraphDataScience

Primary API class for the Neo4j Graph Data Science Python Client. Always bind this object to a variable called gds.

__init__(endpoint: str | Driver | QueryRunner, auth: tuple[str, str] | None = None, aura_ds: bool = False, database: str | None = None, arrow: str | bool = True, arrow_disable_server_verification: bool = True, arrow_tls_root_certs: bytes | None = None, bookmarks: Any | None = None, show_progress: bool = True)

Construct a new GraphDataScience object.

Parameters:
  • endpoint (Union[str, Driver, QueryRunner]) – The Neo4j endpoint to connect to. Most commonly, this is a Bolt connection URI.

  • auth (Optional[Tuple[str, str]], default None) – A username, password pair for database authentication.

  • aura_ds (bool, default False) – A flag that indicates that the client is used to connect to a Neo4j AuraDS instance.

  • database (Optional[str], default None) – The Neo4j database to query against.

  • arrow (Union[str, bool], default True) –

    Arrow connection information. This is either a string or a bool.

    • If it is a string, it will be interpreted as a connection URL to a GDS Arrow Server.

    • If it is a bool:
      • True will make the client discover the connection URI to the GDS Arrow server via the Neo4j endpoint.

      • False will make the client use Bolt for all operations.

  • arrow_disable_server_verification (bool, default True) – A flag that overrides other TLS settings and disables server verification for TLS connections.

  • arrow_tls_root_certs (Optional[bytes], default None) – PEM-encoded certificates that are used for the connection to the GDS Arrow Flight server.

  • bookmarks (Optional[Any], default None) – The Neo4j bookmarks that define the state required before the next query is executed.

  • show_progress (bool, default True) – A flag to indicate whether to show progress bars for running procedures.
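As a rough illustration of the constructor parameters listed above, the sketch below wraps construction in a small helper. The helper name connect and any URI or credentials passed to it are placeholders, not values taken from this reference. The import sits inside the function only so that the snippet can be loaded where the package is not installed.

```python
def connect(uri, user, password, database=None):
    """Create a GraphDataScience client (hypothetical helper, not part of the library)."""
    # Imported lazily so this module loads even where the client is not installed.
    from graphdatascience import GraphDataScience

    return GraphDataScience(
        uri,                    # endpoint: most commonly a Bolt connection URI
        auth=(user, password),  # username, password pair
        database=database,      # None is the documented default
        arrow=True,             # discover the GDS Arrow server via the Neo4j endpoint
        show_progress=False,    # suppress progress bars, e.g. in scripts
    )
```

A call such as connect("bolt://localhost:7687", "neo4j", "password") would then return the object this page refers to as gds, assuming a server is reachable at that address.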

bookmarks() → Any | None

Get the Neo4j bookmarks defining the currently required state for queries to execute.

Returns:

The (possibly None) Neo4j bookmarks defining the currently required state.

close() → None

Close the GraphDataScience object and release any resources held by it.
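Because close() releases resources, it can be worth guaranteeing that it runs even when a query fails. One possible pattern, sketched here with the standard library's contextlib.closing, is shown below. The helper name fetch_version_and_close is hypothetical, and gds is assumed to be an already constructed GraphDataScience object.

```python
from contextlib import closing


def fetch_version_and_close(gds):
    """Return gds.version() and always close the client afterwards."""
    # closing() calls gds.close() on exit, whether or not the body raised.
    with closing(gds) as client:
        return client.version()
```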

database() → str | None

Get the database which queries are run against.

Returns:

The name of the database.

driver_config() → dict[str, Any]

Get the configuration used to create the underlying driver used to make queries to Neo4j.

Returns:

The configuration as a dictionary.

find_node_id(labels: list[str] = [], properties: dict[str, Any] = {}) → int

Find the node id of a node with the given labels and properties.

Parameters:
  • labels (List[str], default []) – The labels of the node to find.

  • properties (Dict[str, Any], default {}) – The properties of the node to find.

Returns:

The node id of the node with the given labels and properties.
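A minimal sketch of a lookup by label and property follows. The label Person, the property key name, and the helper name person_node_id are hypothetical and should be replaced with values from your own graph; gds is assumed to be an existing GraphDataScience object.

```python
def person_node_id(gds, name):
    """Look up the id of a node labelled Person with the given name property."""
    # "Person" and "name" are hypothetical; substitute your own label and property key.
    return gds.find_node_id(labels=["Person"], properties={"name": name})
```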

last_bookmarks() → Any | None

Get the Neo4j bookmarks defining the state following the most recently called query.

Returns:

The (possibly None) Neo4j bookmarks defining the state following the most recently called query.

list() → DataFrame

List all available GDS procedures.

Returns:

A DataFrame containing all available GDS procedures.
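The returned DataFrame can be searched like any other. The sketch below filters procedure names by a substring; it assumes the DataFrame has a column called name, which this reference does not state, and the helper name procedures_matching is hypothetical.

```python
def procedures_matching(gds, fragment):
    """Return the sorted names of GDS procedures whose name contains `fragment`."""
    # Assumption: the DataFrame returned by gds.list() has a "name" column.
    names = gds.list()["name"].tolist()
    needle = fragment.lower()
    return sorted(n for n in names if needle in n.lower())
```

For example, procedures_matching(gds, "pageRank") would return every listed procedure whose name mentions PageRank, matched case-insensitively.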

lp_pipe(name: str) → LPTrainingPipeline

Create a Link Prediction training pipeline, with all default settings.

Parameters:

name (str) – The name to give the pipeline. Must be unique within the Pipeline Catalog.

Returns:

A new instance of a Link Prediction pipeline object.

nc_pipe(name: str) → NCTrainingPipeline

Create a Node Classification training pipeline, with all default settings.

Parameters:

name (str) – The name to give the pipeline. Must be unique within the Pipeline Catalog.

Returns:

A new instance of a Node Classification pipeline object.

nr_pipe(name: str) → NRTrainingPipeline

Create a Node Regression training pipeline, with all default settings.

Parameters:

name (str) – The name to give the pipeline. Must be unique within the Pipeline Catalog.

Returns:

A new instance of a Node Regression pipeline object.
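The three pipeline factories lp_pipe, nc_pipe, and nr_pipe share the same shape, so a thin dispatcher is one way to pick between them at runtime. The sketch below is illustrative only: the helper name create_training_pipeline and the kind strings are invented for this example, and gds is assumed to be an existing GraphDataScience object.

```python
def create_training_pipeline(gds, kind, name):
    """Create a training pipeline of the requested kind with default settings."""
    # The keys are hypothetical labels chosen for this example.
    factories = {
        "link_prediction": gds.lp_pipe,
        "node_classification": gds.nc_pipe,
        "node_regression": gds.nr_pipe,
    }
    if kind not in factories:
        raise ValueError(f"unknown pipeline kind: {kind!r}")
    # The name must be unique within the Pipeline Catalog.
    return factories[kind](name)
```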

run_cypher(query: str, params: dict[str, Any] | None = None, database: str | None = None) → DataFrame

Run a Cypher query.

Parameters:
  • query (str) – The Cypher query to run.

  • params (Optional[Dict[str, Any]], default None) – Parameters to pass to the query.

  • database (Optional[str], default None) – The database on which to run the query.

Returns:

The query result as a DataFrame.
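Passing values through params rather than formatting them into the query string keeps the Cypher text constant and avoids quoting problems. A minimal sketch follows; the Person label, the name property, and the helper name people_named are hypothetical, and gds is assumed to be an existing GraphDataScience object.

```python
def people_named(gds, name, database=None):
    """Run a parameterized Cypher query and return the result DataFrame."""
    # "Person" and "name" are hypothetical schema elements used for illustration.
    query = "MATCH (p:Person {name: $name}) RETURN p.name AS name"
    # $name in the query text is filled in from the params dictionary.
    return gds.run_cypher(query, params={"name": name}, database=database)
```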

server_version() → ServerVersion

Get the version of the GDS library as a ServerVersion object.

Returns:

The version of the GDS library.

set_bookmarks(bookmarks: Any) → None

Set Neo4j bookmarks to require a certain state before the next query gets executed.

Parameters:

bookmarks (Bookmark(s)) – The Neo4j bookmarks defining the required state
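One plausible use of last_bookmarks() together with set_bookmarks() is to make a second client wait for the state produced by the first. The sketch below assumes two already constructed GraphDataScience objects. The helper name propagate_bookmarks is hypothetical, and reading the two methods as a hand-off pair is an interpretation of the descriptions on this page rather than a documented recipe.

```python
def propagate_bookmarks(writer_gds, reader_gds):
    """Make reader_gds wait for the state reached by writer_gds's last query."""
    marks = writer_gds.last_bookmarks()
    if marks is not None:
        # Require the reader's next query to observe at least this state.
        reader_gds.set_bookmarks(marks)
    return marks
```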

set_database(database: str) → None

Set the database which queries are run against.

Parameters:

database (str) – The name of the database to run queries against.
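When a different database is needed only for a few calls, database() and set_database() can be paired so that the previous setting is restored afterwards. The context manager below is a sketch under that assumption; the name use_database is hypothetical, and gds is assumed to be an existing GraphDataScience object.

```python
from contextlib import contextmanager


@contextmanager
def use_database(gds, name):
    """Temporarily run queries against `name`, then restore the previous database."""
    previous = gds.database()
    gds.set_database(name)
    try:
        yield gds
    finally:
        # set_database() is documented as taking a str, so only restore a real name.
        if previous is not None:
            gds.set_database(previous)
```

Inside a block such as `with use_database(gds, "analytics"):` queries would target the named database, and the earlier setting would be put back when the block exits.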

set_show_progress(show_progress: bool) → None

Set whether to show progress for running procedures.

Parameters:

show_progress (bool) – Whether to show progress for procedures.

version() → str

Get the version of the GDS library as a string.

Returns:

The version of the GDS library.
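Since version() returns a plain string, comparing versions safely usually means parsing it first. The sketch below assumes the string begins with dot-separated integers such as "2.6.1", which this reference does not guarantee; the helper name gds_version_tuple is hypothetical.

```python
import re


def gds_version_tuple(gds):
    """Parse gds.version() into a (major, minor, patch) tuple of ints."""
    # Assumption: the string starts with dot-separated integers, e.g. "2.6.1".
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", gds.version())
    if match is None:
        raise ValueError("unrecognised version string")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)
```

A check such as `gds_version_tuple(gds) >= (2, 5, 0)` can then gate features on the library version using ordinary tuple comparison.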