Locks and deadlocks

When a write transaction occurs, Neo4j takes locks to preserve data consistency while updating.

Locks

Locks are taken automatically by the queries that users run. They ensure that a node/relationship is locked to one particular transaction until that transaction is completed. In other words, a lock held on a node or a relationship by one transaction makes other transactions wait before they can modify the same node or relationship. As such, locks prevent concurrent modifications of shared resources between transactions.

Isolation levels

Locks are used in Neo4j to ensure data consistency and isolation levels. They not only protect logical entities (such as nodes and relationships) but also the integrity of internal data structures.

Neo4j supports the following isolation levels:

read-committed isolation level

(Default) A transaction that reads a node/relationship does not block another transaction from writing to that node/relationship before the first transaction finishes. This type of isolation is weaker than the serializable isolation level but offers significant performance advantages while being sufficient for the overwhelming majority of cases.

serializable isolation level

Explicit locking of nodes and relationships. Using locks allows for simulating the effects of higher levels of isolation by obtaining and releasing locks explicitly. For example, if a write lock is taken on a common node or relationship, then all transactions are serialized on that lock — giving the effect of a serializable isolation level. For more information on how to manually acquire write locks, see Lost updates in Cypher.

Lost updates in Cypher

In Cypher, it is possible to acquire write locks to simulate improved isolation in some cases. Consider the case where multiple concurrent Cypher queries increment the value of a property. Due to the limitations of the read-committed isolation level, the increments might not result in a deterministic final value. If there is a direct dependency, Cypher automatically acquires a write lock before reading. A direct dependency exists when the right-hand side of SET reads the property being written, either in the expression itself or in the value of a key-value pair in a literal map.

For example, if you run the following query by one hundred concurrent clients, it is very likely not to increment the property n.prop to 100, unless a write lock is acquired before reading the property value. This is because all queries read the value of n.prop within their own transaction, and cannot see the incremented value from any other transaction that has not yet been committed. In the worst-case scenario, the final value would be as low as 1 if all threads perform the read before any has committed their transaction.
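The worst case described above can be sketched outside the database. The following Python model is only an illustration, not Neo4j code: the function names are hypothetical, and the "transactions" are plain loop iterations that either all read before anyone commits or run one at a time as they would under a write lock.

```python
def increment_without_lock(clients: int) -> int:
    """Worst case under read-committed: every client reads before any commit."""
    committed = 0  # committed value of n.prop
    # Phase 1: each transaction reads the committed value (all of them see 0).
    reads = [committed for _ in range(clients)]
    # Phase 2: each transaction commits its own read + 1, overwriting the others.
    for value_read in reads:
        committed = value_read + 1
    return committed


def increment_with_write_lock(clients: int) -> int:
    """With a write lock taken before the read, transactions run one at a time."""
    committed = 0
    for _ in range(clients):
        value_read = committed      # the read happens while holding the lock
        committed = value_read + 1  # commit, then release the lock
    return committed


print(increment_without_lock(100))     # 1
print(increment_with_write_lock(100))  # 100
```

The first function reproduces the lost updates: 99 of the 100 increments are overwritten. The second shows why serializing on a lock restores the expected total.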

Example 1. Cypher can acquire a write lock

The following example requires a write lock, and Cypher automatically acquires one:

MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1

Example 2. Cypher can acquire a write lock

This example also requires a write lock, and Cypher automatically acquires one:

MATCH (n)
SET n += {prop: n.prop + 1}

Due to the complexity of determining such a dependency in the general case, Cypher does not cover any of the following example cases:

Example 3. Complex Cypher

Variable depending on results from reading the property in an earlier statement:

MATCH (n)
WITH n, n.prop AS p
// ... operations depending on p, producing k
SET n.prop = k + 1

Example 4. Complex Cypher

Circular dependency between properties read and written in the same query:

MATCH (n)
SET n += {propA: n.propB + 1, propB: n.propA + 1}

To ensure deterministic behavior also in the more complex cases, it is necessary to explicitly acquire a write lock on the node in question. In Cypher there is no explicit support for this, but it is possible to work around this limitation by writing to a temporary property.

Example 5. Explicitly acquire a write lock

This example acquires a write lock for the node by writing to a dummy property before reading the requested value:

MATCH (n:Example {id: 42})
SET n._LOCK_ = true
WITH n, n.prop AS p
// ... operations depending on p, producing k
SET n.prop = k + 1
REMOVE n._LOCK_

Placing the SET n._LOCK_ statement before the read of n.prop ensures that the lock is acquired before the read, so no updates are lost: all concurrent queries on that specific node are forced to run one after another.

Default locking behavior

The locks are added to the transaction and released when the transaction finishes. If the transaction is rolled back, the locks are released immediately.

The following is the default locking behavior for different operations:

  • When adding, changing, or removing a property on a node or relationship, a write lock is taken on the specific node or relationship.

  • When creating or deleting a node, a write lock is taken on the specific node.

  • When creating or deleting a relationship, a write lock is taken on the specific relationship and both its nodes.

To view all active locks held by the transaction executing a query with the queryId, use the CALL dbms.listActiveLocks(queryId) procedure. You need to be an administrator to be able to run this procedure.

Table 1. Procedure output
Name | Type | Description
mode | String | Lock mode corresponding to the transaction.
resourceType | String | Resource type of the locked resource.
resourceId | Integer | Resource ID of the locked resource.

Example 6. Viewing active locks for a query

The following example shows the active locks held by the transaction executing a given query.

  1. To get the IDs of the currently executing queries, yield the currentQueryId from the SHOW TRANSACTIONS command:

    SHOW TRANSACTIONS YIELD currentQueryId, currentQuery

  2. Run CALL dbms.listActiveLocks passing the currentQueryId of interest (query-614 in this example):

    CALL dbms.listActiveLocks( "query-614" )
╒════════╤══════════════╤════════════╕
│"mode"  │"resourceType"│"resourceId"│
╞════════╪══════════════╪════════════╡
│"SHARED"│"SCHEMA"      │0           │
└────────┴──────────────┴────────────┘
1 row

Lock contention

Lock contention may arise if an application needs to perform concurrent updates on the same nodes/relationships. In such a scenario, transactions must wait for locks held by other transactions to be released before they can complete. The more often two or more transactions attempt to modify the same data concurrently, the more likely a deadlock becomes. In larger graphs, it is less likely that two transactions modify the same data concurrently, so the likelihood of a deadlock is reduced. Even so, a deadlock can still occur in a large graph whenever concurrent transactions happen to modify the same data.

Types of acquired locks

The following table shows the type of lock acquired depending on the graph modification:

Table 2. Obtained locks for graph modifications
Modification | Acquired lock
Creating a node | No lock
Updating a node label | NODE lock
Updating a node property | NODE lock
Deleting a node | NODE lock
Creating a relationship* | If the node is sparse: NODE lock. If a node is dense: NODE DELETE prevention lock.
Updating a relationship property | RELATIONSHIP lock
Deleting a relationship* | If the node is sparse: NODE lock. If a node is dense: NODE DELETE prevention lock. RELATIONSHIP lock for both sparse and dense nodes.

*Applies to both source nodes and target nodes.

Additional locks are often taken to maintain indexes and other internal structures depending on how other data in the graph is affected by a transaction. For these additional locks, no assumptions or guarantees can be made concerning which lock will or will not be taken.

Locks for dense nodes

This section describes the behavior of the standard, aligned, and high-limit store formats. The block format has a similar but not identical feature.

A node is considered dense if it has had 50 or more relationships at any point (i.e. it is still considered dense even if it later comes to have fewer than 50 relationships). A node is considered sparse if it has never had 50 or more relationships. You can configure the relationship count threshold at which a node is considered dense by setting the db.relationship_grouping_threshold configuration parameter.
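The "once dense, always dense" rule can be illustrated with a small model. This is a hedged sketch in Python, not Neo4j's implementation: the Node class and its method names are hypothetical, and the default threshold of 50 is taken from the description above.

```python
class Node:
    """Toy model of a node that becomes dense permanently at a threshold."""

    def __init__(self, threshold: int = 50):
        self.threshold = threshold  # mirrors db.relationship_grouping_threshold
        self.degree = 0
        self.dense = False

    def add_relationship(self) -> None:
        self.degree += 1
        if self.degree >= self.threshold:
            self.dense = True  # the flag is sticky: it is never cleared

    def remove_relationship(self) -> None:
        self.degree -= 1  # dropping below the threshold does not reset the flag


node = Node()
for _ in range(49):
    node.add_relationship()
print(node.dense)   # False: 49 relationships, still sparse

node.add_relationship()
print(node.dense)   # True: reached 50 relationships

for _ in range(10):
    node.remove_relationship()
print(node.degree, node.dense)  # 40 True
```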

When creating or deleting relationships in Neo4j, dense nodes are not exclusively locked during a transaction. Rather, internally shared locks prevent the deletion of nodes, and shared degree locks are acquired for synchronizing with concurrent label changes for those nodes to ensure correct count updates.

At commit time, relationships are inserted into their relationship chains at places that are currently uncontested (i.e. not currently modified by another transaction), and the surrounding relationships are exclusively locked.

In other words, relationship modifications acquire coarse-grained shared node locks when doing the operation in the transaction, and then acquire precise exclusive relationship locks during commit.

The locking is very similar for sparse and dense nodes. The biggest contention for sparse nodes is the update of the degree (i.e. number of relationships) for the node. Dense nodes store this data in a concurrent data structure, and so can avoid exclusive node locks in almost all cases for relationship modifications.

Configure lock acquisition timeout

An executing transaction may get stuck while waiting for a lock to be released by another transaction. To make such a transaction fail instead of waiting indefinitely, set db.lock.acquisition.timeout to a positive time interval (e.g., 10s). This value is the maximum time within which any particular lock must be acquired before the transaction fails. Setting db.lock.acquisition.timeout to 0, which is the default value, disables the lock acquisition timeout.

This feature cannot be set dynamically.

Example 7. Configure lock acquisition timeout

Set the timeout to ten seconds.

db.lock.acquisition.timeout=10s
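As a rough analogy for what an acquisition timeout does, the Python sketch below uses a standard threading.Lock in place of a database lock. The names node_lock and try_update are hypothetical, and the 50 ms value is chosen only to keep the example fast. It shows the general idea of bounded waiting, not Neo4j's internals.

```python
import threading

node_lock = threading.Lock()  # hypothetical stand-in for a lock on one node


def try_update(timeout_seconds: float) -> str:
    """Wait up to timeout_seconds for the lock; fail instead of waiting forever."""
    if not node_lock.acquire(timeout=timeout_seconds):
        return "failed: lock not acquired within timeout"
    try:
        return "updated"
    finally:
        node_lock.release()


node_lock.acquire()         # another "transaction" holds the lock
blocked = try_update(0.05)  # gives up after about 50 ms
node_lock.release()         # the holder finishes
free = try_update(0.05)     # the lock is free, so this succeeds

print(blocked)  # failed: lock not acquired within timeout
print(free)     # updated
```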

Deadlocks

Since locks are used, deadlocks can happen. A deadlock occurs when two transactions are blocked by each other because they are attempting to concurrently modify a node or a relationship that is locked by the other transaction. In such a scenario, neither of the transactions will be able to proceed. When Neo4j detects a deadlock, the transaction is terminated with the transient error message code Neo.TransientError.Transaction.DeadlockDetected.

All locks acquired by the transaction are still held but will be released when the transaction finishes. Once the locks are released, other transactions that were waiting for locks held by the transaction causing the deadlock can proceed. You can then retry the work performed by the transaction causing the deadlock if needed.

Frequent deadlocks indicate that concurrent write requests are arriving in a pattern that cannot be executed while still honoring the intended isolation and consistency. The solution is to make concurrent updates happen in a predictable order. For example, given two specific nodes (A and B), adding or deleting relationships to both these nodes in random order for each transaction results in deadlocks when two or more transactions do that concurrently. One option is to make sure that updates always happen in the same order (first A, then B). Another option is to make sure that no thread/transaction writes to a node or relationship that some other concurrent transaction also writes to. This can, for example, be achieved by letting a single thread do all updates of a specific type.

Deadlocks caused by the use of other synchronization than the locks managed by Neo4j can still happen. Other code that requires synchronization should be synchronized in such a way that it never performs any Neo4j operation in the synchronized block.

Deadlock detection

For example, running the following two queries in Cypher-shell at the same time will result in a deadlock because they are attempting to modify the same node properties concurrently:

Transaction A

:begin
MATCH (n:Test) SET n.prop = 1
WITH collect(n) as nodes
CALL apoc.util.sleep(5000)
MATCH (m:Test2) SET m.prop = 1;

Transaction B

:begin
MATCH (n:Test2) SET n.prop = 1
WITH collect(n) as nodes
CALL apoc.util.sleep(5000)
MATCH (m:Test) SET m.prop = 1;

The following error message is thrown:

The transaction will be rolled back and terminated. Error: ForsetiClient[transactionId=6698, clientId=1] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=6697, clientId=3]} on NODE(27), because holders of that lock are waiting for ForsetiClient[transactionId=6698, clientId=1].
 Wait list:ExclusiveLock[
Client[6697] waits for [ForsetiClient[transactionId=6698, clientId=1]]]
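The error message describes a cycle: one transaction cannot take a lock because the lock's holder is already waiting for it. A generic way to picture such a check is a search over a "waits-for" graph. The Python sketch below is an illustration of that general idea only. It is not Neo4j's actual detection algorithm, and the function and variable names are hypothetical.

```python
def would_deadlock(waits_for: dict, waiter: int, holder: int) -> bool:
    """Return True if making `waiter` wait for `holder` would close a cycle."""
    stack, seen = [holder], set()
    while stack:
        tx = stack.pop()
        if tx == waiter:
            return True  # the holder (transitively) waits for the waiter
        if tx in seen:
            continue
        seen.add(tx)
        stack.extend(waits_for.get(tx, ()))
    return False


# Transaction ids taken from the error message above: 6697 already waits for 6698.
waits_for = {6697: {6698}}

print(would_deadlock(waits_for, 6698, 6697))  # True: 6698 -> 6697 -> 6698
print(would_deadlock(waits_for, 6699, 6697))  # False: no path back to 6699
```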

The Cypher clause MERGE takes locks out of order to ensure the uniqueness of the data, and this may prevent Neo4j’s internal sorting operations from ordering transactions in a way that avoids deadlocks. When possible, you are, therefore, encouraged to use the Cypher clause CREATE instead, which does not take locks out of order.

Deadlock handling in code

When dealing with deadlocks in code, there are several issues you may want to address:

  • Only do a limited number of retries, and fail if a threshold is reached.

  • Pause between each attempt to allow the other transaction to finish before trying again.

  • A retry loop can be useful not only for deadlocks but for other types of transient errors as well.
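The points above can be sketched as a small retry helper. This is a minimal Python illustration under stated assumptions: TransientError is a hypothetical stand-in for a transient failure such as Neo.TransientError.Transaction.DeadlockDetected, and run_with_retries, max_attempts, and base_delay are names invented for this example rather than part of any Neo4j API.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a transient error such as DeadlockDetected."""


def run_with_retries(work, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run work(), retrying transient failures a limited number of times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except TransientError:
            if attempt == max_attempts:
                raise  # threshold reached: give up and surface the error
            # Exponential backoff with jitter, so that competing transactions
            # do not all retry at the same moment.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


# Usage: a unit of work that fails twice before succeeding.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("deadlock detected")
    return "committed"


result = run_with_retries(flaky, sleep=lambda _: None)  # skip real sleeping here
print(result)             # committed
print(attempts["count"])  # 3
```

Passing the sleep function as a parameter is a design choice made for this sketch so that the pause can be replaced in tests. Non-transient exceptions are deliberately not caught and propagate on the first failure.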

For an example of how deadlocks can be handled in procedures, server extensions, or when using Neo4j embedded, see Transaction management in the Neo4j Java Reference.

Avoiding deadlocks

Most likely, a deadlock will be resolved by retrying the transaction. This will, however, negatively impact the total transactional throughput of the database, so it is useful to know about strategies to avoid deadlocks.

Neo4j assists transactions by internally sorting operations (see below for more information about internal locks). However, this internal sorting only applies to the locks taken when creating or deleting relationships. Users are, therefore, encouraged to sort their operations in cases where Neo4j does not internally assist, such as when locks are taken for property updates. This is done by ensuring that updates occur in the same order. For example, if the three locks A, B, and C are always taken in the same order (e.g. A→B→C), then a transaction will never hold lock B while waiting for lock A to be released, and so a deadlock will not occur.
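The consistent-ordering idea can be demonstrated with ordinary in-process locks. The Python sketch below is an analogy, not database code: the locks dictionary, update_in_order, and bump are hypothetical names, and sorting by name stands in for any fixed global order (for example, sorting node ids before updating them).

```python
import threading

# Hypothetical stand-ins for the locks A, B, and C from the text.
locks = {name: threading.Lock() for name in ("A", "B", "C")}


def update_in_order(names, work):
    """Acquire the named locks in sorted order, run work, then release them."""
    ordered = sorted(set(names))
    for name in ordered:
        locks[name].acquire()
    try:
        work()
    finally:
        for name in reversed(ordered):
            locks[name].release()
    return ordered


counter = {"value": 0}


def bump():
    counter["value"] += 1


def worker(names):
    for _ in range(1000):
        update_in_order(names, bump)


# The two workers ask for the same locks in opposite orders. Sorting makes both
# acquire A before B, so neither can hold B while waiting for A.
t1 = threading.Thread(target=worker, args=(["A", "B"],))
t2 = threading.Thread(target=worker, args=(["B", "A"],))
t1.start()
t2.start()
t1.join()
t2.join()

print(counter["value"])  # 2000
```

Without the sorted() call, the two workers could each take their first lock and then wait forever for the other's, which is the deadlock pattern described above.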

Another option is to avoid lock contention by not modifying the same entities concurrently.

To avoid deadlocks, internal locks should be taken in the following order:

The internal lock types may change without any notification between different Neo4j versions. The lock types are only listed here to give an idea of the internal locking mechanism.

Lock type | Locked entity | Description
LABEL or RELATIONSHIP_TYPE | Token id | Schema locks, which lock indexes and constraints on the particular label or relationship type.
SCHEMA_NAME | Schema name | Lock a schema name to avoid duplicates. Collisions are possible because the name string is hashed. This only affects concurrency and not correctness.
NODE_RELATIONSHIP_GROUP_DELETE | Node id | Lock taken on a node during the transaction creation phase to prevent deletion of that node and/or relationship group. This is different from the NODE lock in order to allow concurrent label and property changes together with relationship modifications.
NODE | Node id | Lock on a node, used to prevent concurrent updates to the node records (i.e. add/remove label, set property, add/remove relationship). Note that updating relationships will only require a lock on the node if the head of the relationship chain/relationship group chain must be updated, since that is the only relationship data stored in the node record.
DEGREES | Node id | Used to lock nodes to avoid concurrent label changes when a relationship is added or deleted. Such an update would otherwise lead to an inconsistent count store.
RELATIONSHIP_DELETE | Relationship id | Lock a relationship for exclusive access during deletion.
RELATIONSHIP_GROUP | Node id | Lock the full relationship group chain for a given dense node. This will not lock the node, in contrast to the lock NODE_RELATIONSHIP_GROUP_DELETE.
RELATIONSHIP | Relationship id | Lock on a relationship, or more specifically a relationship record, to prevent concurrent updates.

Delete semantics

When deleting a node or a relationship, all properties for that entity will be automatically removed but the relationships of a node will not be removed. Neo4j enforces a constraint (upon commit) that all relationships must have a valid start node and end node. In effect, this means that trying to delete a node that still has relationships attached to it will throw an exception upon commit. It is, however, possible to choose in which order to delete the node and the attached relationships as long as no relationships exist when the transaction is committed.

The delete semantics can be summarized as follows:

  • All properties of a node or relationship will be removed when it is deleted.

  • A deleted node cannot have any attached relationships when the transaction commits.

  • It is possible to acquire a reference to a deleted relationship or node that has not yet been committed.

  • Any write operation on a node or relationship after it has been deleted (but not yet committed) will throw an exception.

  • Trying to acquire a new reference, or to work with an old reference, to a deleted node or relationship after commit will throw an exception.