CALL subqueries in transactions

CALL subqueries can be made to execute in separate, inner transactions, producing intermediate commits. This can be useful when doing large write operations, like batch updates, imports, and deletes.

To execute a CALL subquery in separate transactions, add the modifier IN TRANSACTIONS after the subquery. An outer transaction is opened to report back the accumulated statistics for the inner transactions (created and deleted nodes, relationships, etc.) and it will succeed or fail depending on the results of those inner transactions. By default, inner transactions group together batches of 1000 rows. Cancelling the outer transaction will cancel the inner ones as well.

CALL { …​ } IN TRANSACTIONS is only allowed in implicit transactions.

Loading CSV data

This example uses a CSV file and the LOAD CSV clause to import data into the database. It creates nodes in separate transactions using CALL { …​ } IN TRANSACTIONS:

friends.csv
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
Query
LOAD CSV FROM 'file:///friends.csv' AS line
CALL {
  WITH line
  CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS
Table 1. Result

(empty result)

Rows: 0
Nodes created: 5
Properties set: 10
Labels added: 5
Transactions committed: 1

As the size of the CSV file in this example is small, only a single separate transaction is started and committed.

Deleting a large volume of data

Using CALL { …​ } IN TRANSACTIONS is the recommended way of deleting a large volume of data.

Example 1. DETACH DELETE on all nodes
Query
MATCH (n)
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS
Table 2. Result

(empty result)

Rows: 0
Nodes deleted: 5
Relationships deleted: 2
Transactions committed: 1

Example 2. DETACH DELETE on only some nodes

The CALL { …​ } IN TRANSACTIONS subquery should not be modified.

Any necessary filtering can be done before the subquery.

Query
MATCH (n:Label) WHERE n.prop > 100
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS
Table 3. Result

(empty result)

Rows: 0

The batching is performed on the input rows fed into CALL { …​ } IN TRANSACTIONS, so the data must be supplied from outside the call in order for the batching to have an effect. That is why the nodes are matched outside the subqueries in the examples above. If the MATCH clause were inside the subquery, the data deletion would run as one single transaction.

Batching

The amount of work to do in each separate transaction can be specified in terms of how many input rows to process before committing the current transaction and starting a new one. The number of input rows is set with the modifier OF n ROWS (or OF n ROW). If omitted, the default batch size is 1000 rows. The number of rows can be expressed using any expression that evaluates to a positive integer and does not refer to nodes or relationships.

This example loads a CSV file with one transaction for every 2 input rows:

friends.csv
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
Query
LOAD CSV FROM 'file:///friends.csv' AS line
CALL {
  WITH line
  CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS OF 2 ROWS
Table 4. Result

(empty result)

Rows: 0
Nodes created: 5
Properties set: 10
Labels added: 5
Transactions committed: 3

The query now starts and commits three separate transactions:

  1. The first two executions of the subquery (for the first two input rows from LOAD CSV) take place in the first transaction.

  2. The first transaction is then committed before proceeding.

  3. The next two executions of the subquery (for the next two input rows) take place in a second transaction.

  4. The second transaction is committed.

  5. The last execution of the subquery (for the last input row) takes place in a third transaction.

  6. The third transaction is committed.

You can also use CALL { …​ } IN TRANSACTIONS OF n ROWS to delete all your data in batches in order to avoid a huge garbage collection or an OutOfMemory exception. For example:

Query
MATCH (n)
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS OF 2 ROWS
Table 5. Result

(empty result)

Rows: 0
Nodes deleted: 9
Relationships deleted: 2
Transactions committed: 5

Up to a point, using a larger batch size will be more performant. The batch size of 2 ROWS is an example given the small data set used here. For larger data sets, you might want to use larger batch sizes, such as 10000 ROWS.

Composite databases

As of Neo4j 5.18, CALL { …​ } IN TRANSACTIONS can be used with composite databases.

Even though composite databases allow accessing multiple graphs in a single query, only one graph can be modified in a single transaction. CALL { …​ } IN TRANSACTIONS offers a way of constructing queries which modify multiple graphs.

While the previous examples are generally valid for composite databases, there’s a few extra factors that come into play when working with composite databases in subqueries. The following examples show how you can use CALL { …​ } IN TRANSACTIONS on a composite database.

Example 3. Import a CSV file on all constituents
friends.csv
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
Create Person nodes on all constituents, drawing data from friends.csv
UNWIND graph.names() AS graphName
LOAD CSV FROM 'file:///friends.csv' AS line
CALL {
  USE graph.byName( graphName )
  WITH line
  CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS
Example 4. Remove all nodes and relationships from all constituents
Query
UNWIND graph.names() AS graphName
CALL {
  USE graph.byName( graphName )
  MATCH (n)
  RETURN elementId(n) AS id
}
CALL {
  USE graph.byName( graphName )
  WITH id
  MATCH (n)
  WHERE elementId(n) = id
  DETACH DELETE n
} IN TRANSACTIONS
Since the batching is performed on the input rows fed into CALL { …​ } IN TRANSACTIONS, the data must be supplied from outside the subquery in order for the batching to have an effect. That is why the nodes are matched in a subquery preceding the one that actually deletes the data. If the MATCH clause were inside the second subquery, the data deletion would run as one single transaction.

There is currently a known issue. When an error occurs during CALL { …​ } IN TRANSACTIONS processing, the error message includes information about how many transactions have been committed. That information is inaccurate on composite databases, as it always reports (Transactions committed: 0).

Batch size in composite databases

Because CALL { …​ } IN TRANSACTIONS subqueries targeting different graphs can’t be interleaved, if a USE clause evaluates to a different target than the current one, the current batch is committed and the next batch is created.

The batch size declared with IN TRANSACTIONS OF …​ ROWS represents an upper limit of the batch size, but the real batch size depends on how many input rows target one database in sequence. Every time the target database changes, the batch is committed.

Example 5. Behavior of IN TRANSACTIONS OF ROWS on composite databases

The next example assumes the existence of two constituents remoteGraph1 and remoteGraph2 for the composite database composite.

While the declared batch size is 3, only the first 2 rows act on composite.remoteGraph1, so the batch size for the first transaction is 2. That is followed by 3 rows on composite.remoteGraph2, 1 on composite.remoteGraph2 and finally 2 on composite.remoteGraph1.

Query
WITH ['composite.remoteGraph1', 'composite.remoteGraph2'] AS graphs
UNWIND [0, 0, 1, 1, 1, 1, 0, 0] AS i
WITH graphs[i] AS g
CALL {
  USE graph.byName( g )
  CREATE ()
} IN TRANSACTIONS OF 3 ROWS

Error behavior

Users can choose one of three different option flags to control the behavior in case of an error occurring in any of the inner transactions of CALL { …​ } IN TRANSACTIONS:

  • ON ERROR CONTINUE to ignore a recoverable error and continue the execution of subsequent inner transactions. The outer transaction succeeds. It will cause the expected variables from the failed inner query to be bound as null for that specific transaction.

  • ON ERROR BREAK to ignore a recoverable error and stop the execution of subsequent inner transactions. The outer transaction succeeds. It will cause expected variables from the failed inner query to be bound as null for all onward transactions (including the failed one).

  • ON ERROR FAIL to acknowledge a recoverable error and stop the execution of subsequent inner transactions. The outer transaction fails. This is the default behavior if no flag is explicitly specified.

On error, any previously committed inner transactions remain committed, and are not rolled back. Any failed inner transactions are rolled back.

In the following example, the last subquery execution in the second inner transaction fails due to division by zero.

Query
UNWIND [4, 2, 1, 0] AS i
CALL {
  WITH i
  CREATE (:Person {num: 100/i})
} IN TRANSACTIONS OF 2 ROWS
RETURN i
Error message
/ by zero (Transactions committed: 1)

When the failure occurred, the first transaction had already been committed, so the database contains two example nodes.

Query
MATCH (e:Person)
RETURN e.num
Table 6. Result
e.num

25

50

Rows: 2

In the following example, ON ERROR CONTINUE is used after a failed inner transaction to execute the remaining inner transactions and not fail the outer transaction:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR CONTINUE
RETURN n.num;
Table 7. Result
n.num

100

null

50

25

Rows: 4

Note the difference in results when batching in transactions of 2 rows:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 2 ROWS
  ON ERROR CONTINUE
RETURN n.num;
Table 8. Result
n.num

null

null

50

25

Rows: 4

This happens because an inner transaction with the two first i elements (1 and 0) was created, and it fails for 0. This causes it to be rolled back and the return variable is filled with nulls for those two elements.

In the following example, ON ERROR BREAK is used after a failed inner transaction to not execute the remaining inner transaction and not fail the outer transaction:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR BREAK
RETURN n.num;
Table 9. Result
n.num

100

null

null

null

Rows: 4

Note the difference in results when batching in transactions of 2 rows:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 2 ROWS
  ON ERROR BREAK
RETURN n.num;
Table 10. Result
n.num

null

null

null

null

Rows: 4

In the following example, ON ERROR FAIL is used after the failed inner transaction, to not execute the remaining inner transactions and to fail the outer transaction:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR FAIL
RETURN n.num;
Error message
/ by zero (Transactions committed: 1)

Status report

Users can also report the execution status of the inner transactions by using REPORT STATUS AS var. This flag is disallowed for ON ERROR FAIL. For more information, see Error behavior.

After each execution of the inner query finishes (successfully or not), a status value is created that records information about the execution and the transaction that executed it:

  • If the inner execution produces one or more rows as output, then a binding to this status value is added to each row, under the selected variable name.

  • If the inner execution fails then a single row is produced containing a binding to this status value under the selected variable, and null bindings for all variables that should have been returned by the inner query (if any).

The status value is a map value with the following fields:

  • started: true when the inner transaction was started, false otherwise.

  • committed, true when the inner transaction changes were successfully committed, false otherwise.

  • transactionId: the inner transaction id, or null if the transaction was not started.

  • errorMessage, the inner transaction error message, or null in case of no error.

Example of reporting status with ON ERROR CONTINUE:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR CONTINUE
  REPORT STATUS AS s
RETURN n.num, s;
Table 11. Result
n.num s

100

{"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-835" }

null

{"committed": false, "errorMessage": "/ by zero", "started": true, "transactionId": "neo4j-transaction-836" }

50

{"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-837" }

25

{"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-838" }

Rows: 4

Example of reporting status with ON ERROR BREAK:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR BREAK
  REPORT STATUS AS s
RETURN n.num, s.started, s.committed, s.errorMessage;
Table 12. Result
n.num s.started s.committed s.errorMessage

100

true

true

null

null

true

false

"/ by zero"

null

false

false

null

null

false

false

null

Rows: 4

Reporting status with ON ERROR FAIL is disallowed:

Query
UNWIND [1, 0, 2, 4] AS i
CALL {
  WITH i
  CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
  RETURN n
} IN TRANSACTIONS
  OF 1 ROW
  ON ERROR FAIL
  REPORT STATUS AS s
RETURN n.num, s.errorMessage;
Error
REPORT STATUS can only be used when specifying ON ERROR CONTINUE or ON ERROR BREAK

Restrictions

These are the restrictions on queries that use CALL { …​ } IN TRANSACTIONS:

  • A nested CALL { …​ } IN TRANSACTIONS inside a CALL { …​ } clause is not supported.

  • A CALL { …​ } IN TRANSACTIONS in a UNION is not supported.

  • A CALL { …​ } IN TRANSACTIONS after a write clause is not supported, unless that write clause is inside a CALL { …​ } IN TRANSACTIONS.

Additionally, there are some restrictions that apply when using an importing WITH clause in a CALL subquery:

  • Only variables imported with the importing WITH clause can be used.

  • No expressions or aliasing are allowed within the importing WITH clause.

  • It is not possible to follow an importing WITH clause with any of the following clauses: DISTINCT, ORDER BY, WHERE, SKIP, and LIMIT.

Attempting any of the above, will throw an error. For example, the following query using a WHERE clause after an importing WITH clause will throw an error:

Query
UNWIND [[1,2],[1,2,3,4],[1,2,3,4,5]] AS l
CALL {
    WITH l
    WHERE size(l) > 2
    RETURN l AS largeLists
}
RETURN largeLists
Error message
Importing WITH should consist only of simple references to outside variables.
WHERE is not allowed.

A solution to this restriction, necessary for any filtering or ordering of an importing WITH clause, is to declare a second WITH clause after the importing WITH clause. This second WITH clause will act as a regular WITH clause. For example, the following query will not throw an error:

Query
UNWIND [[1,2],[1,2,3,4],[1,2,3,4,5]] AS l
CALL {
 WITH l
 WITH size(l) AS size, l AS l
 WHERE size > 2
 RETURN l AS largeLists
}
RETURN largeLists
Table 13. Result
largeLists

[1, 2, 3, 4]

[1, 2, 3, 4, 5]

Rows: 2