Connecting with Python

Follow along with a notebook in Colab Google Colab

This tutorial shows how to interact with AuraDS using the Graph Data Science (GDS) client or the Python Driver. In the following sections you can switch between client and driver code clicking on the appropriate tab.

A running AuraDS instance must be available along with access credentials (generated in the Creating an AuraDS instance section) and its connection URI (found in the instance detail page, starting with neo4j+s://).

Installation

Both the GDS client and the Python driver can be installed using pip.

pip install graphdatascience

The latest stable version of the client can be found on PyPI.

pip install neo4j

The latest stable version of the driver can be found on PyPI.

If pip is not available, you can try replacing it with python -m pip or python3 -m pip.

Import and setup

Both the GDS client and the Python driver require the connection URI and the credentials as shown in the introduction.

The client is imported as the GraphDataScience class:

# Client import
from graphdatascience import GraphDataScience

The aura_ds=True constructor argument should be used to have the recommended non-default configuration settings of the Python Driver applied automatically.

# Replace with the actual URI, username, and password
AURA_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = "..."

# Client instantiation
gds = GraphDataScience(
    AURA_CONNECTION_URI,
    auth=(AURA_USERNAME, AURA_PASSWORD),
    aura_ds=True
)

The driver is imported as the GraphDatabase class:

# Driver import
from neo4j import GraphDatabase
# Replace with the actual URI, username and password
AURA_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = "..."

# Driver instantiation
driver = GraphDatabase.driver(
    AURA_CONNECTION_URI,
    auth=(AURA_USERNAME, AURA_PASSWORD)
)

Running a query

Once created, the client (or the driver) can be used to run Cypher queries and call Cypher procedures. In this example the gds.version procedure can be used to retrieve the version of GDS running on the instance.

# Call a GDS method directly
print(gds.version())
# Cypher query
gds_version_query = """
    RETURN gds.version() AS version
"""

# Create a driver session
with driver.session() as session:
    # Use .data() to access the results array
    results = session.run(gds_version_query).data()
    print(results)

The following code retrieves all the procedures available in the library and shows the details of five of them.

# Assign the result of the client call to a variable
results = gds.list()

# Print the result (a Pandas DataFrame)
print(results[:5])

Since the result is a Pandas DataFrame, you can use methods such as to_string and to_json to pretty-print it.

# Print the result (a Pandas DataFrame) as a console-friendly string
print(results[:5].to_string())
# Print the result (a Pandas DataFrame) as a prettified JSON string
print(results[:5].to_json(orient="table", indent=2))
# Import the json module for pretty visualization
import json

# Cypher query
list_all_gds_procedures_query = """
    CALL gds.list()
"""

# Create a driver session
with driver.session() as session:
    # Use .data() to access the results array
    results = session.run(list_all_gds_procedures_query).data()

    # Print the prettified result
    print(json.dumps(results[:5], indent=2))

Serializing Neo4j DateTime in JSON dumps

In some cases the result of a procedure call may contain Neo4j DateTime objects. In order to serialize such objects into JSON, a default handler must be provided.

# Import for the JSON helper function
from neo4j.time import DateTime

# Helper function for serializing Neo4j DateTime in JSON dumps
def default(o):
    if isinstance(o, (DateTime)):
        return o.isoformat()

# Run the graph generation algorithm
g, _ = gds.beta.graph.generate(
    "example-graph", 10, 3, relationshipDistribution="POWER_LAW"
)

# Drop the graph keeping the result of the operation, which contains
# some DateTime fields ("creationTime" and "modificationTime")
result = gds.graph.drop(g)

# Print the result as JSON, converting the DateTime fields with
# the handler defined above
print(result.to_json(indent=2, default_handler=default))
# Import to prettify results
import json

# Import for the JSON helper function
from neo4j.time import DateTime

# Helper function for serializing Neo4j DateTime in JSON dumps
def default(o):
    if isinstance(o, (DateTime)):
        return o.isoformat()

# Example query to run a graph generation algorithm
create_example_graph_query = """
    CALL gds.beta.graph.generate(
      'example-graph', 10, 3, {relationshipDistribution: 'POWER_LAW'}
    )
"""

# Example query to delete a graph
delete_example_graph_query = """
    CALL gds.graph.drop('example-graph')
"""

# Create the driver session
with driver.session() as session:
    # Run the graph generation algorithm
    session.run(create_example_graph_query).data()

    # Drop the generated graph keeping the result of the operation
    results = session.run(delete_example_graph_query).data()

    # Prettify the results using the handler defined above
    print(json.dumps(results, indent=2, sort_keys=True, default=default))

Closing the connection

The connection should always be closed when no longer needed.

Although the GDS client automatically closes the connection when the object is deleted, it is good practice to close it explicitly.

# Close the client connection
gds.close()
# Close the driver connection
driver.close()

References

Cypher