Clause composition

This section describes the semantics of Cypher® when composing different read and write clauses.

A query is made up of several clauses chained together. These are discussed in more detail in the chapter on Clauses.

The semantics of a whole query is defined by the semantics of its clauses. Each clause has as input the state of the graph and a table of intermediate results consisting of the current variables. The output of a clause is a new state of the graph and a new table of intermediate results, serving as input to the next clause. The first clause takes as input the state of the graph before the query and an empty table of intermediate results. The output of the last clause is the result of the query.
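The clause-by-clause model described above can be sketched as plain function composition. The following Python sketch is an illustration only and does not show how Neo4j is implemented: the names `run_query`, `match_all`, `create_copy`, and `return_names` are hypothetical, the graph is modeled as a set of node names, and the initial empty table is assumed to be a single row without variables.

```python
def run_query(clauses, graph):
    """Thread the graph state and the table through the clauses in order."""
    # Assumption: the initial "empty" table is modeled as one row without
    # variables, so that the first clause has a row to extend.
    table = [{}]
    for clause in clauses:
        graph, table = clause(graph, table)
    return graph, table


def match_all(graph, table):
    """Read clause: bind n to every node, once per incoming row."""
    rows = [{**row, "n": node} for row in table for node in sorted(graph)]
    return graph, rows


def create_copy(graph, table):
    """Write clause: for each row, add a node named after n plus a quote."""
    new_nodes = {row["n"] + "'" for row in table}
    return graph | new_nodes, table


def return_names(graph, table):
    """Projection: keep only a name column."""
    return graph, [{"name": row["n"]} for row in table]


graph_after, result = run_query(
    [match_all, create_copy, return_names], frozenset({"a", "b"})
)
print(result)               # [{'name': 'a'}, {'name': 'b'}]
print(sorted(graph_after))  # ['a', "a'", 'b', "b'"]
```

Each clause receives both the graph and the table and returns both, so a read clause simply passes the graph through unchanged.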

Unless ORDER BY is used, Neo4j does not guarantee the row order of a query result.

Example 1. Table of intermediate results between read clauses

The following example graph is used throughout this section.

Diagram

The table of intermediate results and the state of the graph after each clause are shown below for the following query:

MATCH (john:Person {name: 'John'})
MATCH (john)-[:FRIEND]->(friend)
RETURN friend.name AS friendName

The query only has read clauses, so the state of the graph remains unchanged and is therefore omitted below.

Table 1. The table of intermediate results after each clause

Clause: MATCH (john:Person {name: 'John'})
Table of intermediate results after the clause:

    john
    (:Person {name: 'John'})

Clause: MATCH (john)-[:FRIEND]->(friend)
Table of intermediate results after the clause:

    john                     | friend
    (:Person {name: 'John'}) | (:Person {name: 'Sara'})
    (:Person {name: 'John'}) | (:Person {name: 'Joe'})

Clause: RETURN friend.name AS friendName
Table of intermediate results after the clause:

    friendName
    'Sara'
    'Joe'
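The three steps of Example 1 can be traced with a small Python sketch. It is an illustration under stated assumptions, not Neo4j's evaluation strategy: nodes are represented by their names only, and the graph is assumed to contain just the nodes and FRIEND relationships that appear in the table (the diagram may show more). The variable names `people` and `friend_of` are hypothetical.

```python
# Assumed graph: only the nodes and FRIEND relationships visible in Table 1.
people = ["John", "Sara", "Joe"]
friend_of = [("John", "Sara"), ("John", "Joe")]

# MATCH (john:Person {name: 'John'})
table = [{"john": p} for p in people if p == "John"]
print(table)  # [{'john': 'John'}]

# MATCH (john)-[:FRIEND]->(friend)
# Each incoming row is extended once per matching relationship.
table = [
    {**row, "friend": end}
    for row in table
    for start, end in friend_of
    if start == row["john"]
]
print(table)

# RETURN friend.name AS friendName
# Only the projected column survives into the query result.
table = [{"friendName": row["friend"]} for row in table]
print(table)  # [{'friendName': 'Sara'}, {'friendName': 'Joe'}]
```

The second MATCH turns one incoming row into two because John has two outgoing FRIEND relationships in the assumed graph.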

The above example only looked at clauses that allow linear composition and omitted write clauses. The following sections explore non-linear composition and write clauses.

Read-write queries

In a Cypher query, read and write clauses can alternate. The most important aspect of read-write queries is that the state of the graph also changes between clauses.

A clause can never observe writes made by a later clause.

Example 2. Table of intermediate results and state of the graph between read and write clauses

Using the same example graph as above, this example shows the table of intermediate results and the state of the graph after each clause for the following query:

MATCH (j:Person) WHERE j.name STARTS WITH "J"
CREATE (j)-[:FRIEND]->(jj:Person {name: "Jay-jay"})

The query finds all Person nodes whose name property starts with "J" and, for each such node, creates a new Person node with the name property set to "Jay-jay", connected to it by a FRIEND relationship.

Table 2. The table of intermediate results and the state of the graph after each clause

Clause: MATCH (j:Person) WHERE j.name STARTS WITH "J"
Table of intermediate results after the clause:

    j
    (:Person {name: 'John'})
    (:Person {name: 'Joe'})

State of the graph after the clause, changes in red: Diagram

Clause: CREATE (j)-[:FRIEND]->(jj:Person {name: "Jay-jay"})
Table of intermediate results after the clause:

    j                        | jj
    (:Person {name: 'John'}) | (:Person {name: 'Jay-jay'})
    (:Person {name: 'Joe'})  | (:Person {name: 'Jay-jay'})

State of the graph after the clause, changes in red: Diagram

It is important to note that the MATCH clause does not find the Person nodes created by the CREATE clause, even though the name "Jay-jay" starts with "J". This is because the CREATE clause comes after the MATCH clause, so the MATCH cannot observe any changes to the graph made by the CREATE.
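The behavior in Example 2 can be reproduced with a small Python sketch in which each clause is fully evaluated before the next one starts. This is a simplified model, not Neo4j's execution strategy. It assumes the graph contains only the Person nodes that appear in the tables above, and the names `nodes` and `relationships` are hypothetical.

```python
# Assumed graph state: only the Person nodes visible in the tables above.
nodes = [{"name": "John"}, {"name": "Sara"}, {"name": "Joe"}]
relationships = []  # (start node, type, end node) triples created by the query

# MATCH (j:Person) WHERE j.name STARTS WITH "J"
# The clause is fully evaluated against the graph as it is at this point.
table = [{"j": node} for node in nodes if node["name"].startswith("J")]

# CREATE (j)-[:FRIEND]->(jj:Person {name: "Jay-jay"})
# Runs once per row of the table produced by MATCH.
new_table = []
for row in table:
    jj = {"name": "Jay-jay"}
    nodes.append(jj)
    relationships.append((row["j"], "FRIEND", jj))
    new_table.append({**row, "jj": jj})
table = new_table

print([(row["j"]["name"], row["jj"]["name"]) for row in table])
# [('John', 'Jay-jay'), ('Joe', 'Jay-jay')]
print(len(nodes))  # 5
```

The table has two rows rather than more because the MATCH result was fixed before any "Jay-jay" node existed.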

Queries with UNION

UNION queries are slightly different: the results of two or more queries are combined, but each query starts with an empty table of intermediate results.

In a query with a UNION clause, any clause before the UNION cannot observe writes made by a clause after the UNION. Any clause after UNION can observe all writes made by a clause before the UNION. This means that the rule that a clause can never observe writes made by a later clause still applies in queries using UNION.

Example 3. Table of intermediate results and state of the graph in a query with UNION

Using the same example graph as above, this example shows the table of intermediate results and the state of the graph after each clause for the following query:

CREATE (jj:Person {name: "Jay-jay"})
RETURN count(*) AS count
  UNION
MATCH (j:Person) WHERE j.name STARTS WITH "J"
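RETURN count(*) AS count

Table 3. The table of intermediate results and the state of the graph after each clause

Clause: CREATE (jj:Person {name: "Jay-jay"})
Table of intermediate results after the clause:

    jj
    (:Person {name: 'Jay-jay'})

State of the graph after the clause, changes in red: Diagram

Clause: RETURN count(*) AS count
Table of intermediate results after the clause:

    count
    1

State of the graph after the clause, changes in red: Diagram

Clause: MATCH (j:Person) WHERE j.name STARTS WITH "J"
Table of intermediate results after the clause:

    j
    (:Person {name: 'John'})
    (:Person {name: 'Joe'})
    (:Person {name: 'Jay-jay'})

State of the graph after the clause, changes in red: Diagram

Clause: RETURN count(*) AS count
Table of intermediate results after the clause:

    count
    3

State of the graph after the clause, changes in red: Diagram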

It is important to note that the MATCH clause finds the Person node that is created by the CREATE clause. This is because the CREATE clause comes before the MATCH clause and thus the MATCH can observe any changes to the graph made by the CREATE.
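Example 3 can be modeled in Python as two functions that run one after the other against a shared graph. This is a sketch under assumptions, not Neo4j's implementation: the graph is assumed to hold only the Person nodes that appear in the tables above, the empty starting table of each branch is modeled as a single row without variables, and `first_branch` and `second_branch` are hypothetical names.

```python
# Assumed graph state: only the Person nodes visible in the tables above.
nodes = [{"name": "John"}, {"name": "Sara"}, {"name": "Joe"}]


def first_branch():
    # Each branch starts from an empty table, modeled as one row
    # without variables.
    table = [{}]
    # CREATE (jj:Person {name: "Jay-jay"})
    table = [{**row, "jj": {"name": "Jay-jay"}} for row in table]
    nodes.extend(row["jj"] for row in table)
    # RETURN count(*) AS count
    return [{"count": len(table)}]


def second_branch():
    # MATCH (j:Person) WHERE j.name STARTS WITH "J"
    table = [{"j": n} for n in nodes if n["name"].startswith("J")]
    # RETURN count(*) AS count
    return [{"count": len(table)}]


# The branches run in order, so the second one sees the first one's write.
rows = first_branch() + second_branch()

# UNION (without ALL) removes duplicate rows from the combined result.
result = []
for row in rows:
    if row not in result:
        result.append(row)

print(result)  # [{'count': 1}, {'count': 3}]
```

Swapping the two calls in the sketch would make the MATCH branch count 2, because the "Jay-jay" node would not exist yet when it runs.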

Queries with CALL {} subqueries

Subqueries inside a CALL {} clause are evaluated once for each incoming input row. This means that write clauses inside a subquery can be executed more than once. The invocations of the subquery are executed one after another, in the order of the incoming input rows.

Later invocations of the subquery can observe writes made by earlier invocations of the subquery.

Example 4. Table of intermediate results and state of the graph in a query with CALL {}

Using the same example graph as above, this example shows the table of intermediate results and the state of the graph after each clause for the following query:

MATCH (john:Person {name: 'John'})
SET john.friends = []
WITH john
MATCH (john)-[:FRIEND]->(friend)
WITH john, friend
CALL {
  WITH john, friend
  WITH *, john.friends AS friends
  SET john.friends = friends + friend.name
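}

Table 4. The table of intermediate results and the state of the graph after each clause

Clause: MATCH (john:Person {name: 'John'})
Table of intermediate results after the clause:

    john
    (:Person {name: 'John'})

State of the graph after the clause, changes in red: Diagram

Clause: SET john.friends = []
Table of intermediate results after the clause:

    john
    (:Person {name: 'John', friends: []})

State of the graph after the clause, changes in red: Diagram

Clause: MATCH (john)-[:FRIEND]->(friend)
Table of intermediate results after the clause:

    john                                  | friend
    (:Person {name: 'John', friends: []}) | (:Person {name: 'Sara'})
    (:Person {name: 'John', friends: []}) | (:Person {name: 'Joe'})

State of the graph after the clause, changes in red: Diagram

Clause: First invocation of WITH *, john.friends AS friends
Table of intermediate results after the clause:

    john                                  | friend                   | friends
    (:Person {name: 'John', friends: []}) | (:Person {name: 'Sara'}) | []

State of the graph after the clause, changes in red: Diagram

Clause: First invocation of SET john.friends = friends + friend.name
Table of intermediate results after the clause:

    john                                        | friend                   | friends
    (:Person {name: 'John', friends: ['Sara']}) | (:Person {name: 'Sara'}) | []

State of the graph after the clause, changes in red: Diagram

Clause: Second invocation of WITH *, john.friends AS friends
Table of intermediate results after the clause:

    john                                        | friend                  | friends
    (:Person {name: 'John', friends: ['Sara']}) | (:Person {name: 'Joe'}) | ['Sara']

State of the graph after the clause, changes in red: Diagram

Clause: Second invocation of SET john.friends = friends + friend.name
Table of intermediate results after the clause:

    john                                               | friend                  | friends
    (:Person {name: 'John', friends: ['Sara', 'Joe']}) | (:Person {name: 'Joe'}) | ['Sara']

State of the graph after the clause, changes in red: Diagram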

It is important to note that, in the subquery, the second invocation of the WITH clause could observe the writes made by the first invocation of the SET clause.
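The per-row invocation in Example 4 can be sketched in Python as a function called once for each incoming row. This is a simplified model rather than Neo4j's implementation. Nodes are plain dictionaries, the starting state is taken from the table above (after `SET john.friends = []`), and `subquery` and `seen` are hypothetical names.

```python
john = {"name": "John", "friends": []}  # state after SET john.friends = []
sara = {"name": "Sara"}
joe = {"name": "Joe"}

# Table after MATCH (john)-[:FRIEND]->(friend)
table = [{"john": john, "friend": sara}, {"john": john, "friend": joe}]


def subquery(row):
    # WITH *, john.friends AS friends
    # Reads the property as it is when this invocation runs.
    friends = list(row["john"]["friends"])
    # SET john.friends = friends + friend.name
    row["john"]["friends"] = friends + [row["friend"]["name"]]
    return friends


# One invocation per incoming row, in row order.
seen = [subquery(row) for row in table]

print(seen)             # [[], ['Sara']]
print(john["friends"])  # ['Sara', 'Joe']
```

The second entry of `seen` is `['Sara']` because the second invocation reads the property after the first invocation has already written to it.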

Notes on the implementation

An easy way to implement the semantics outlined above is to fully execute each clause and materialize the table of intermediate results before executing the next clause. This approach would consume a lot of memory for materializing the tables of intermediate results and would generally not perform well.

Instead, Cypher generally tries to interleave the execution of clauses and materializes intermediate results only when needed. This is called lazy evaluation. In many read-write queries it is unproblematic to interleave the execution of clauses, but when it is not, Cypher must ensure that the table of intermediate results is materialized at the right time(s). This is done by inserting an Eager operator into the execution plan.
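The difference between interleaved and materialized execution can be illustrated with Python generators, using the query from Example 2. This is only an analogy for the role the Eager operator plays and does not describe Neo4j's planner or runtime. The function names are hypothetical, and the graph is assumed to contain only the Person nodes shown in the earlier tables.

```python
from itertools import islice


def match_starts_with_j(nodes):
    """Lazy MATCH: yields rows on demand while scanning the live node list."""
    i = 0
    while i < len(nodes):
        if nodes[i]["name"].startswith("J"):
            yield {"j": nodes[i]}
        i += 1


def create_jay_jay(rows, nodes):
    """CREATE: adds one node per incoming row and passes the row on."""
    for row in rows:
        jj = {"name": "Jay-jay"}
        nodes.append(jj)
        yield {**row, "jj": jj}


def fresh_graph():
    # Assumed graph state: only the Person nodes visible in the tables above.
    return [{"name": "John"}, {"name": "Sara"}, {"name": "Joe"}]


# Interleaved without a barrier: MATCH keeps finding the nodes that CREATE
# just added, so the pipeline never ends. islice caps it at 10 rows here.
nodes = fresh_graph()
lazy_rows = list(islice(create_jay_jay(match_starts_with_j(nodes), nodes), 10))

# With a barrier: materialize the MATCH result first, then run CREATE.
nodes = fresh_graph()
eager_rows = list(create_jay_jay(list(match_starts_with_j(nodes)), nodes))

print(len(lazy_rows), len(eager_rows))  # 10 2
```

Only the second variant respects the rule that a clause never observes writes made by a later clause. The inner `list(...)` call stands in for materializing the table of intermediate results.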