Concurrent data access

Isolation levels

Neo4j supports the following isolation levels:

read-committed isolation level: Default A transaction that reads a node/relationship does not block another transaction from writing to that node/relationship before the first transaction finishes. This type of isolation is weaker than serializable isolation level but offers significant performance advantages while being sufficient for the overwhelming majority of cases.
serializable isolation level: Explicit locking of nodes and relationships. Using locks allows for simulating the effects of higher levels of isolation by obtaining and releasing locks explicitly. For example, if a write lock is taken on a common node or relationship, then all transactions are serialized on that lock — giving the effect of a serializable isolation level. For more information on how to manually acquire write locks, see Lost updates.

Anomalies

Depending on the isolation level, different anomalies may occur when multiple transactions concurrently read or write the same data.

All the anomalies listed here can only occur with the read-committed isolation level.

Lost updates

In Cypher, it is possible to acquire write locks to simulate improved isolation in some cases. Consider the case where multiple concurrent Cypher queries increment the value of a property. Due to the limitations of the read-committed isolation level, the increments might not result in a deterministic final value.

Cypher automatically acquires write locks in some cases, but not in others. When a Cypher query uses the SET clause to update a property, it may or may not acquire a write lock on the node or relationship being updated, depending on whether there is a direct dependency on the property being read.

Acquiring a write lock automatically

When a Cypher query has a direct dependency on the property being read, Cypher automatically acquires a write lock before reading the property. This is the case when the query uses the SET clause to update a property on a node or relationship, and the right-hand side of the SET clause has a dependency on the property being read. For example, in the following queries, the right-hand side of SET has a dependent property read in an expression or a value of a key-value pair in a literal map.

Example 1. Incrementing a property using an expression

MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1

This query increments the property n.prop by 1. In this case, Cypher automatically acquires a write lock on the node n before reading the value of n.prop. This ensures that no other concurrent queries can modify the node n while this query is running, thus preventing lost updates.

Example 2. Incrementing a property using a map literal

MATCH (n)
SET n += {prop: n.prop + 1}

This query also increments the property n.prop by 1, but it does so using a map literal. In this case, Cypher also acquires a write lock on the node n before reading the value of n.prop.

No direct dependency to acquire a write lock

When a query does not have a direct dependency on the property being read, Cypher does not automatically acquire a write lock. This means if you run multiple concurrent queries that read and write the same property, it is possible to end up with lost updates by allowing other concurrent queries to modify the property value at the same time.

For example, if you run the following queries by one hundred concurrent clients, it is very likely not to increment the property n.prop to 100, unless a write lock is acquired before reading the property value. This is because all queries read the value of n.prop within their own transaction, and cannot see the incremented value from any other transaction that has not yet been committed. In the worst-case scenario, the final value would be as low as 1 if all threads perform the read before any has committed their transaction.

Example 3. Variable depending on results from reading the property in an earlier statement

MATCH (n)
WITH n.prop AS p
// ... operations depending on p, producing k
SET n.prop = k + 1

Example 4. Circular dependency between properties read and written in the same query

MATCH (n)
SET n += {propA: n.propB + 1, propB: n.propA + 1}

Workaround

To ensure deterministic behavior also in the more complex cases, it is necessary to explicitly acquire a write lock on the node in question. In Cypher there is no explicit support for this, but it is possible to work around this limitation by writing to a temporary property. For example, the following query acquires a write lock for the node by writing to a dummy property (n.dummy) before reading the requested value (n.prop). When acquired, the write lock ensures that no other concurrent queries can modify the node until the transaction is committed or rolled back. The dummy property is used only to acquire the write lock, therefore, it can be removed immediately after the lock is acquired.

Example 5. Dummy property to acquire a write lock

MATCH (n:Example {id: 42})
SET n._dummy_ = true
REMOVE n._dummy_
WITH n.prop AS p
// ... operations depending on p, producing k
SET n.prop = k + 1

Non-repeatable reads

A non-repeatable read is when the same transaction reads the same data but gets inconsistent results. This can easily happen if reading the same data twice in a query and the data gets modified in-between by another concurrent query.

For example, the following query shows that reading the same property twice can give inconsistent results. If there are other queries running concurrently, it is not guaranteed that p1 and p2 have the same value.

Example 6. Non-repeatable read

MATCH (n:Example {id: 42})
WITH n.prop AS p1
// another concurrent query changes the value of n.prop here.
WITH *, n.prop AS p2
RETURN p1, p2

The easiest way to work around this is to only read each property once, and keep it as long as needed in the query.

Missing and double reads

When scanning an index, entities may be observed multiple times or skipped entirely, even if they are present in the index. This is true even for indexes that back property uniqueness constraints.

During the scan, if another concurrent query changes an entity’s property to a position ahead of the scan, the entity might appear again in the index. Similarly, the entity may not appear at all if the property is changed to a previously scanned position.

This anomaly can only occur with operators that scan an index, or parts of an index, for example NodeIndexScan or DirectedRelationshipIndexSeekByRange.

In the following query, each node n that has the property prop is expected to appear exactly once. However, concurrent updates that modify the prop property during index scanning may cause a node to appear multiple times or not at all in the result set.

Example 7. Missing and double read

MATCH (n:Example) WHERE n.prop IS NOT NULL
RETURN n

Locks

When a write transaction occurs, Neo4j takes locks to preserve data consistency while updating.

Locks are used in Neo4j to ensure data consistency and isolation levels. They not only protect logical entities (such as nodes and relationships) but also the integrity of internal data structures.

Locks are taken automatically by the queries that users run. They ensure that a node/relationship is locked to one particular transaction until that transaction is completed. In other words, a lock on a node or a relationship by one transaction pauses other transactions to concurrently modify the same node or relationship. As such, locks prevent concurrent modifications of shared resources between transactions.

Default locking behavior

The locks are added to the transaction and released when the transaction finishes. If the transaction is rolled back, the locks are released immediately.

The following is the default locking behavior for different operations:

When adding, changing, or removing a property on a node or relationship, a write lock is taken on the specific node or relationship.
When creating or deleting a node a write lock is taken for the specific node.
When creating or deleting a relationship a write lock is taken on the specific relationship and both its nodes.

To view all active locks held by the transaction executing a query with the queryId, use the CALL dbms.listActiveLocks(queryId) procedure. You need to be an administrator to be able to run this procedure.

Table 1. Procedure output
Name	Type	Description
`mode`	`String`	Lock mode corresponding to the transaction.
`resourceType`	`String`	Resource type of the locked resource.
`resourceId`	`Integer`	Resource ID of the locked resource.

Example 8. Viewing active locks for a query

The following example shows the active locks held by the transaction executing a given query.

To get the IDs of the currently executing queries, yield the currentQueryId from the SHOW TRANSACTIONS command:
```
SHOW TRANSACTIONS YIELD currentQueryId, currentQuery
```
Run CALL dbms.listActiveLocks passing the currentQueryId of interest (query-614 in this example):
```
CALL dbms.listActiveLocks( "query-614" )
```

╒════════╤══════════════╤════════════╕
│"mode"  │"resourceType"│"resourceId"│
╞════════╪══════════════╪════════════╡
│"SHARED"│"SCHEMA"      │0           │
└────────┴──────────────┴────────────┘
1 row

Lock contention

Lock contention may arise if an application needs to perform concurrent updates on the same nodes/relationships. In such a scenario, to be completed, transactions must wait for locks held by other transactions to be released. If two or more transactions attempt to modify the same data concurrently, it will increase the likelihood of a deadlock. In larger graphs, it is less likely that two transactions modify the same data concurrently, and so the likelihood of a deadlock is reduced. That said, even in large graphs, a deadlock can occur if two or more transactions are attempting to modify the same data concurrently.

Types of acquired locks

The following table shows the type of lock acquired depending on the graph modification:

Table 2. Obtained locks for graph modifications
Modification	Acquired lock
Creating a node	No lock
Updating a node label	`NODE` lock
Updating a node property	`NODE` lock
Deleting a node	`NODE` lock
Creating a relationship*	If the node is sparse: `NODE` lock. If a node is dense: `NODE DELETE` prevention lock.
Updating a relationship property	`RELATIONSHIP` lock
Deleting a relationship*	If the node is sparse: `NODE` lock. If a node is dense: `NODE DELETE` prevention lock. `RELATIONSHIP` lock for both sparse and dense nodes.

*Applies for both source nodes and target nodes.

Additional locks are often taken to maintain indexes and other internal structures depending on how other data in the graph is affected by a transaction. For these additional locks, no assumptions or guarantees can be made concerning which lock will or will not be taken.

Locks for dense nodes

This Locks for dense nodes section describes the behavior of the standard, aligned, and high_limit store formats. The block format has a similar but not identical feature.

A node is considered dense if it at any point has had 50 or more relationships (i.e. it will still be considered dense even if it comes to have less than 50 relationships at any point in the future). A node is considered sparse if it has never had more than 50 relationships. You can configure the relationship count threshold for when a node is considered dense by setting db.relationship_grouping_threshold configuration parameter.

When creating or deleting relationships in Neo4j, dense nodes are not exclusively locked during a transaction. Rather, internally shared locks prevent the deletion of nodes, and shared degree locks are acquired for synchronizing with concurrent label changes for those nodes to ensure correct count updates.

At commit time, relationships are inserted into their relationship chains at places that are currently uncontested (i.e. not currently modified by another transaction), and the surrounding relationships are exclusively locked.

In other words, relationship modifications acquire coarse-grained shared node locks when doing the operation in the transaction, and then acquire precise exclusive relationship locks during commit.

The locking is very similar for sparse and dense nodes. The biggest contention for sparse nodes is the update of the degree (i.e. number of relationships) for the node. Dense nodes store this data in a concurrent data structure, and so can avoid exclusive node locks in almost all cases for relationship modifications.

Configure lock acquisition timeout

An executing transaction may get stuck while waiting for some lock to be released by another transaction. To kill that transaction and remove the lock, set db.lock.acquisition.timeout to some positive time interval value (e.g., 10s) denoting the maximum time interval within which any particular lock should be acquired, before failing the transaction. Setting db.lock.acquisition.timeout to 0 — which is the default value — disables the lock acquisition timeout.

This feature cannot be set dynamically.

Example 9. Set the timeout to ten seconds

db.lock.acquisition.timeout=10s

Deadlocks

Since locks are used, deadlocks can happen. A deadlock occurs when two transactions are blocked by each other because they are attempting to concurrently modify a node or a relationship that is locked by the other transaction. In such a scenario, neither of the transactions will be able to proceed. When Neo4j detects a deadlock, the transaction is terminated with the transient error message code Neo.TransientError.Transaction.DeadlockDetected. From 5.25 onwards, the error message also contains the GQLSTATUS code 50N05 and the status description error: general processing exception - deadlock detected. Deadlock detected while trying to acquire locks. See log for more details.

All locks acquired by the transaction are still held but will be released when the transaction finishes. Once the locks are released, other transactions that were waiting for locks held by the transaction causing the deadlock can proceed. You can then retry the work performed by the transaction causing the deadlock if needed.

Experiencing frequent deadlocks is an indication of concurrent write requests happening in such a way that it is not possible to execute them while at the same time living up to the intended isolation and consistency. The solution is to make sure concurrent updates happen reasonably. For example, given two specific nodes (A and B), adding or deleting relationships to both these nodes in random order for each transaction results in deadlocks when two or more transactions do that concurrently. One option is to make sure that updates always happen in the same order (first A then B). Another option is to make sure that each thread/transaction does not have any conflicting writes to a node or relationship as some other concurrent transaction. This can, for example, be achieved by letting a single thread do all updates of a specific type.

Deadlocks caused by the use of other synchronization than the locks managed by Neo4j can still happen. Other code that requires synchronization should be synchronized in such a way that it never performs any Neo4j operation in the synchronized block.

Deadlock detection

For example, running the following two queries in Cypher-shell at the same time will result in a deadlock because they are attempting to modify the same node properties concurrently:

Example 10. Transaction A

:begin
MATCH (n:Test) SET n.prop = 1
WITH collect(n) as nodes
CALL apoc.util.sleep(5000)
MATCH (m:Test2) SET m.prop = 1;

Example 11. Transaction B

:begin
MATCH (n:Test2) SET n.prop = 1
WITH collect(n) as nodes
CALL apoc.util.sleep(5000)
MATCH (m:Test) SET m.prop = 1;

The following error message is thrown:

The transaction will be rolled back and terminated. Error: ForsetiClient[transactionId=6698, clientId=1] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=6697, clientId=3]} on NODE(27), because holders of that lock are waiting for ForsetiClient[transactionId=6698, clientId=1].
 Wait list:ExclusiveLock[
Client[6697] waits for [ForsetiClient[transactionId=6698, clientId=1]]]

The Cypher clause MERGE takes locks out of order to ensure the uniqueness of the data, and this may prevent Neo4j’s internal sorting operations from ordering transactions in a way that avoids deadlocks. When possible, you are, therefore, encouraged to use the Cypher clause CREATE instead, which does not take locks out of order.

Deadlock handling in code

When dealing with deadlocks in code, there are several issues you may want to address:

Only do a limited amount of retries, and fail if a threshold is reached.
Pause between each attempt to allow the other transaction to finish before trying again.
A retry loop can be useful not only for deadlocks but for other types of transient errors as well.

For an example of how deadlocks can be handled in procedures, server extensions, or when using Neo4j embedded, see Transaction management in the Neo4j Java Reference.

Avoiding deadlocks

Most likely, a deadlock will be resolved by retrying the transaction. This will, however, negatively impact the total transactional throughput of the database, so it is useful to know about strategies to avoid deadlocks.

Neo4j assists transactions by internally sorting operations. See below for more information about internal locks). However, this internal sorting only applies to the locks taken when creating or deleting relationships. Users are, therefore, encouraged to sort their operations in cases where Neo4j does not internally assist, such as when locks are taken for property updates. This is done by ensuring that updates occur in the same order. For example, if the three locks A, B, and C are always taken in the same order (e.g. A→B→C), then a transaction will never hold lock B while waiting for lock A to be released, and so a deadlock will not occur.

Another option is to avoid lock contention by not modifying the same entities concurrently.

To avoid deadlocks, internal locks should be taken in the following order:

The internal lock types may change without any notification between different Neo4j versions. The lock types are only listed here to give an idea of the internal locking mechanism.

Lock type Locked entity Description

LABEL or RELATIONSHIP_TYPE

Token id

Schema locks, which lock indexes and constraints on the particular label or relationship type.

SCHEMA_NAME

Schema name

Lock a schema name to avoid duplicates.

Collisions are possible because the hash is stringed. This only affects concurrency and not correctness.

NODE_RELATIONSHIP_GROUP_DELETE

Node id

Lock taken on a node during the transaction creation phase to prevent deletion of that node and/or relationship group. This is different from the NODE lock in order to allow concurrent label and property changes together with relationship modifications.

NODE

Node id

Lock on a node, used to prevent concurrent updates to the node records (i.e. add/remove label, set property, add/remove relationship). Note that updating relationships will only require a lock on the node if the head of the relationship chain/relationship group chain must be updated since that is the only data part of the node record.

DEGREES

Node id

Used to lock nodes to avoid concurrent label changes when a relationship is added or deleted. Such an update would otherwise lead to an inconsistent count store.

RELATIONSHIP_DELETE

Relationship id

Lock a relationship for exclusive access during deletion.

RELATIONSHIP_GROUP

Node id

Lock the full relationship group chain for a given dense node. This will not lock the node, in contrast to the lock NODE_RELATIONSHIP_GROUP_DELETE.

RELATIONSHIP

Relationship

Lock on a relationship, or more specifically a relationship record, to prevent concurrent updates.

Delete semantics

When deleting a node or a relationship, all properties for that entity will be automatically removed but the relationships of a node will not be removed. Neo4j enforces a constraint (upon commit) that all relationships must have a valid start node and end node. In effect, this means that trying to delete a node that still has relationships attached to it will throw an exception upon commit. It is, however, possible to choose in which order to delete the node and the attached relationships as long as no relationships exist when the transaction is committed.

The delete semantics can be summarized as follows:

All properties of a node or relationship will be removed when it is deleted.
A deleted node cannot have any attached relationships when the transaction commits.
It is possible to acquire a reference to a deleted relationship or node that has not yet been committed.
Any write operation on a node or relationship after it has been deleted (but not yet committed) will throw an exception.
Trying to acquire a new or work with an old reference to a deleted node or relationship after commit, will throw an exception.