Cypher workflow

This section describes how to create units of work and provide a logical context for that work.

Overview

The Neo4j Drivers expose a Cypher Channel over which database work can be carried out (see the Cypher Manual for more information on the Cypher Query Language).

Work itself is organized into sessions, transactions and queries, as follows:

sessions queries transactions
Figure 1. Sessions, queries and transactions

Sessions are always bound to a single transaction context, which is typically an individual database.

Using the bookmark mechanism, sessions also provide a guarantee of correct transaction sequencing, even when transactions occur over multiple cluster members. This effect is known as causal chaining.

Sessions

Sessions are lightweight containers for causally chained sequences of transactions (see Operations Manual → Causal consistency). They essentially provide context for storing transaction sequencing information in the form of bookmarks.

When a transaction begins, the session in which it is contained acquires a connection from the driver connection pool. On commit (or rollback) of the transaction, the session releases that connection again. This means that it is only when a session is carrying out work that it occupies a connection resource. When idle, no such resource is in use.

Due to the sequencing guaranteed by a session, sessions may only host one transaction at a time. For parallel execution, multiple sessions should be used. In languages where thread safety is an issue, sessions should not be considered thread-safe.

Closing a session forces any open transaction to be rolled back, and its associated connection to consequently be released back into the pool.

Sessions are bound to a single transactional context, specified on construction. Neo4j exposes each database inside its own context, thereby prohibiting cross-database transactions (or sessions) by design. Similarly, sessions bound to different databases may not be causally chained by propagating bookmarks between them.

Individual language drivers provide several session classes, each oriented around a particular programming style. Each session class provides a similar set of functionality but offers client applications a choice based on how the application is structured and what frameworks are in use, if any.

The session classes are described in The session API. For more details, please see the API documentation.

Transactions

Transactions are atomic units of work containing one or more Cypher Queries. Transactions may contain read or write work, and will generally be routed to an appropriate server for execution, where they will be carried out in their entirety. In case of a transaction failure, the transaction needs to be retried from the beginning. This is the responsibility of the transaction manager.

The Neo4j Drivers provide transaction management via the transaction function mechanism. This mechanism is exposed through methods on the Session object which accept a function object that can be played multiple times against different servers until it either succeeds or a timeout is reached. This approach is recommended for most client applications.

A convenient short-form alternative is the auto-commit transaction mechanism. This provides a limited form of transaction management for single-query transactions, as a trade-off for a slightly smaller code overhead. This form of transaction is useful for quick scripts and environments where high availability guarantees are not required. It is also the required form of transaction for running Cypher Manual → CALL {} IN TRANSACTIONS queries, which are the only type of Cypher Query to manage their own transactions.

A lower-level unmanaged transaction API is also available for advanced use cases. This is useful when an alternative transaction management layer is applied by the client, in which error handling and retries need to be managed in a custom way.

To learn more about how to use unmanaged transactions, see API documentation.

Queries and results

Queries consist of a request to the server to execute a Cypher statement, followed by a response back to the client with the result. Results are transmitted as a stream of records, along with header and footer metadata, and can be incrementally consumed by a client application. With reactive capabilities, the semantics of the record stream can be enhanced by allowing a Cypher result to be paused or cancelled part-way through.

To execute a Cypher Query, the query text is required along with an optional set of named parameters. The text can contain parameter placeholders that are substituted with the corresponding values at runtime. While it is possible to run non-parameterized Cypher Queries, good programming practice is to use parameters in Cypher Queries whenever possible. This allows for the caching of queries within the Cypher Engine, which is beneficial for performance. Parameter values should adhere to Cypher values.

A result summary is also generally available. This contains additional information relating to the query execution and the result content. For an EXPLAIN or PROFILE query, this is where the query plan is returned. See Cypher Manual → Profiling a query for more information on these queries.

Causal chaining and bookmarks

When working with a Causal Cluster, transactions can be chained, via a session, to ensure causal consistency. This means that for any two transactions, it is guaranteed that the second transaction will begin only after the first has been successfully committed. This is true even if the transactions are carried out on different physical cluster members. For more information on Causal Clusters, please refer to Operations Manual → Clustering.

Internally, causal chaining is carried out by passing bookmarks between transactions. Each bookmark records one or more points in transactional history for a particular database, and can be used to inform cluster members to carry out units of work in a particular sequence. On receipt of a bookmark, the server will block until it has caught up with the relevant transactional point in time.

An initial bookmark is sent from client to server on beginning a new transaction, and a final bookmark is returned on successful completion. Note that this applies to both read and write transactions.

Bookmark propagation is carried out automatically within sessions and does not require any explicit signal or setting from the application. To opt out of this mechanism, for unrelated units of work, applications can use multiple sessions. This avoids the small latency overhead of the causal chain.

Bookmarks can be passed between sessions by extracting the last bookmark from a session and passing this into the construction of another. Multiple bookmarks can also be combined if a transaction has more than one logical predecessor. Note that it is only when chaining across sessions that an application will need to work with bookmarks directly.

driver passing bookmarks
Figure 2. Passing bookmarks
Example 1. Pass bookmarks
// Create a company node
private IResultSummary AddCompany(IQueryRunner tx, string name)
{
    return tx.Run("CREATE (a:Company {name: $name})", new { name }).Consume();
}

// Create a person node
private IResultSummary AddPerson(IQueryRunner tx, string name)
{
    return tx.Run("CREATE (a:Person {name: $name})", new { name }).Consume();
}

// Create an employment relationship to a pre-existing company node.
// This relies on the person first having been created.
private IResultSummary Employ(IQueryRunner tx, string personName, string companyName)
{
    return tx.Run(
            @"MATCH (person:Person {name: $personName}) 
                 MATCH (company:Company {name: $companyName}) 
                 CREATE (person)-[:WORKS_FOR]->(company)",
            new { personName, companyName })
        .Consume();
}

// Create a friendship between two people.
private IResultSummary MakeFriends(IQueryRunner tx, string name1, string name2)
{
    return tx.Run(
            @"MATCH (a:Person {name: $name1}) 
                 MATCH (b:Person {name: $name2})
                 MERGE (a)-[:KNOWS]->(b)",
            new { name1, name2 })
        .Consume();
}

// Match and display all friendships.
private int PrintFriendships(IQueryRunner tx)
{
    var result = tx.Run("MATCH (a)-[:KNOWS]->(b) RETURN a.name, b.name");

    var count = 0;
    foreach (var record in result)
    {
        count++;
        Console.WriteLine($"{record["a.name"]} knows {record["b.name"]}");
    }

    return count;
}

public void AddEmployAndMakeFriends()
{
    // To collect the session bookmarks
    var savedBookmarks = new List<Bookmarks>();

    // Create the first person and employment relationship.
    using (var session1 = Driver.Session(o => o.WithDefaultAccessMode(AccessMode.Write)))
    {
        session1.ExecuteWrite(tx => AddCompany(tx, "Wayne Enterprises"));
        session1.ExecuteWrite(tx => AddPerson(tx, "Alice"));
        session1.ExecuteWrite(tx => Employ(tx, "Alice", "Wayne Enterprises"));

        savedBookmarks.Add(session1.LastBookmarks);
    }

    // Create the second person and employment relationship.
    using (var session2 = Driver.Session(o => o.WithDefaultAccessMode(AccessMode.Write)))
    {
        session2.ExecuteWrite(tx => AddCompany(tx, "LexCorp"));
        session2.ExecuteWrite(tx => AddPerson(tx, "Bob"));
        session2.ExecuteWrite(tx => Employ(tx, "Bob", "LexCorp"));

        savedBookmarks.Add(session2.LastBookmarks);
    }

    // Create a friendship between the two people created above.
    using (var session3 = Driver.Session(
               o =>
                   o.WithDefaultAccessMode(AccessMode.Write).WithBookmarks(savedBookmarks.ToArray())))
    {
        session3.ExecuteWrite(tx => MakeFriends(tx, "Alice", "Bob"));

        session3.ExecuteRead(PrintFriendships);
    }
}

Routing transactions using access modes

Transactions can be executed in either read or write mode; this is known as the access mode. In a Causal Cluster, each transaction will be routed to an appropriate server based on the mode. When using a single instance, all transactions will be passed to that one server.

Routing Cypher by identifying reads and writes can improve the utilization of available cluster resources. Since read servers are typically more plentiful than write servers, it is beneficial to direct read traffic to read servers instead of the write server. Doing so helps in keeping write servers available for write transactions.

Access mode is generally specified by the method used to call the transaction function. Session classes provide a method for calling reads and another for writes.

As a fallback for auto-commit and unmanaged transactions, a default access mode can also be provided at session level. This is only used in cases when the access mode cannot otherwise be specified. In case a transaction function is used within that session, the default access mode will be overridden.

The driver does not parse Cypher and therefore cannot automatically determine whether a transaction is intended to carry out read or write operations. As a result, a write transaction tagged as a read will still be sent to a read server, but will fail on execution.
Example 2. Read-write transaction
public long AddPerson(string name)
{
    using var session = Driver.Session();
    session.ExecuteWrite(tx => CreatePersonNode(tx, name));
    return session.ExecuteRead(tx => MatchPersonNode(tx, name));
}

private static IResultSummary CreatePersonNode(IQueryRunner tx, string name)
{
    return tx.Run("CREATE (a:Person {name: $name})", new { name }).Consume();
}

private static long MatchPersonNode(IQueryRunner tx, string name)
{
    var result = tx.Run("MATCH (a:Person {name: $name}) RETURN id(a)", new { name });
    return result.Single()[0].As<long>();
}

Databases and execution context

Neo4j offers the ability to work with multiple databases within the same DBMS.

For Community Edition, this is limited to one user database, plus the system database.

From a driver API perspective, sessions have a DBMS scope, and the default database for a session can be selected on session construction. The default database is used as a target for queries that don’t explicitly specify a database with a USE clause. See Cypher Manual → USE for details about the USE clause.

In a multi-database environment, the server tags one database as default. This is selected whenever a session is created without naming a particular database as default. In an environment with a single database, that database is always the default.

For more information about managing multiple databases within the same DBMS, refer to Cypher Manual → Neo4j databases and graphs, which has a full breakdown of the Neo4j data storage hierarchy.

The following example illustrates how to work with databases:

var session = driver.session(SessionConfig.forDatabase( "foo" ))
// lists nodes from database foo
session.run("MATCH (n) RETURN n").list()

// lists nodes from database bar
session.run("USE bar MATCH (n) RETURN n").list()

// creates an index in foo
session.run("CREATE INDEX foo_idx FOR (n:Person) ON n.id").consume()

// creates an index in bar
session.run("USE bar CREATE INDEX bar_idx FOR (n:Person) ON n.id").consume()

// targets System database
session.run("SHOW USERS")

Database selection

You pass the name of the database to the driver during session creation. If you don’t specify a name, the default database is used. The database name must not be null, nor an empty string.

The selection of database is only possible when the driver is connected against Neo4j Enterprise Edition. Changing to any other database than the default database in Neo4j Community Edition will lead to a runtime error.

The following example illustrates the concept of DBMS Cypher Manual → Transactions and shows how queries to multiple databases can be issued in one driver transaction. It is annotated with comments describing how each operation impacts database transactions.

var session = driver.session(SessionConfig.forDatabase( "foo" ))
// a DBMS-level transaction is started
var transaction = session.begin()

// a transaction on database "foo" is started automatically with the first query targeting foo
transaction.run("MATCH (n) RETURN n").list()

// a transaction on database "bar" is started
transaction.run("USE bar MATCH (n) RETURN n").list()

// executes in the transaction on database "foo"
transaction.run("RETURN 1").consume()

// executes in the transaction on database "bar"
transaction.run("USE bar RETURN 1").consume()

// commits the DBMS-level transaction which commits the transactions on databases "foo" and
// "bar"
transaction.commit()

Please note that the database that is requested must exist.

Example 3. Database selection on session creation
using (var session = _driver.Session(SessionConfigBuilder.ForDatabase("examples")))
{
    session.Run("CREATE (a:Greeting {message: 'Hello, Example-Database'}) RETURN a").Consume();
}

void SessionConfig(SessionConfigBuilder configBuilder)
{
    configBuilder.WithDatabase("examples")
        .WithDefaultAccessMode(AccessMode.Read)
        .Build();
}

using (var session = _driver.Session(SessionConfig))
{
    var result = session.Run("MATCH (a:Greeting) RETURN a.message as msg");
    var msg = result.Single()[0].As<string>();
    Console.WriteLine(msg);
}

Type mapping

Drivers translate between application language types and the Cypher Types.

To pass parameters and process results, it is important to know the basics of how Cypher works with types and to understand how the Cypher Types are mapped in the driver.

The table below shows the available data types. All types can be potentially found in the result, although not all types can be used as parameters.

Cypher Type Parameter Result

NULL*

LIST

MAP

BOOLEAN

INTEGER

FLOAT

STRING

ByteArray

DATE

ZONED TIME

LOCAL TIME

ZONED DATETIME

LOCAL DATETIME

DURATION

POINT

NODE**

RELATIONSHIP**

PATH**

* The null marker is not a type but a placeholder for absence of value. For information on how to work with null in Cypher, please refer to Cypher Manual → Working with null.

** Nodes, relationships and paths are passed in results as snapshots of the original graph entities. While the original entity IDs are included in these snapshots, no permanent link is retained back to the underlying server-side entities, which may be deleted or otherwise altered independently of the client copies. Graph structures may not be used as parameters because it depends on application context whether such a parameter would be passed by reference or by value, and Cypher provides no mechanism to denote this. Equivalent functionality is available by simply passing either the ID for pass-by-reference, or an extracted map of properties for pass-by-value.

The Neo4j Drivers map Cypher Types to and from native language types as depicted in the table below. Custom types (those not available in the language or standard library) are highlighted in bold.

Table 1. Map Neo4j types to .NET types
Neo4j Cypher Type .NET Type

NULL

null

LIST

IList<object>

MAP

IDictionary<string, object>

BOOLEAN

bool

INTEGER

long

FLOAT

double

STRING

string

ByteArray

byte[]

DATE

LocalDate

ZONED TIME

OffsetTime

LOCAL TIME

LocalTime

ZONED DATETIME*

ZonedDateTime

LOCAL DATETIME

LocalDateTime

DURATION

Duration

POINT

Point

NODE

INode

RELATIONSHIP

IRelationship

PATH

IPath

* Time zone names adhere to the IANA system, rather than the Windows system. Inbound conversion is carried out using Extended Windows-Olson zid mapping as defined by Unicode CLDR.

Exceptions and error handling

When executing Cypher or carrying out other operations with the driver, certain exceptions and error cases may arise. Server-generated exceptions are each associated with a status code that describes the nature of the problem and a message that provides more detail.

The classifications are listed in the table below.

Table 2. Server status code classifications
Classification Description

ClientError

The client application has caused an error. The application should amend and retry the operation.

DatabaseError

The server has caused an error. Retrying the operation will generally be unsuccessful.

TransientError

A temporary error has occurred. The application should retry the operation.

Service unavailable

A Service Unavailable exception will be signalled when the driver is no longer able to establish communication with the server, even after retries.

Encountering this condition usually indicates a fundamental networking or database problem.

While certain mitigations can be made by the driver to avoid this issue, there will always be cases when this is impossible. As such, it is highly recommended to ensure client applications contain a code path that can be followed when the client is no longer able to communicate with the server.

Transient errors

Transient errors are those which are generated by the server and marked as safe to retry without alteration to the original request. Examples of such errors are deadlocks and memory issues.

When using transaction functions, the driver will usually be able to automatically retry when a transient failure occurs.

The exception property CanBeRetried gives insights into whether a further attempt might be successful.