Knowledge Base

Neo4j’s commit process explained

This article guides you through Neo4j’s commit and replication processes, both for single instances and for Causal Clusters.

 

Single Instance

When you call tx.commit(), the transaction goes through the Storage Engine, which transforms it into a Transaction Representation. This is similar to what you get when you dump a transaction log and contains all of the commands generated by that transaction:

Command[
  -Node[0,used=false,rel=-1,prop=-1,labels=Inline(0x0:[]),light,secondaryUnitId=-1]
  +Node[0,used=true,rel=-1,prop=-1,labels=Inline(0x0:[]),light,secondaryUnitId=-1]
]


Image 1 - Storage Engine
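
For reference, a transaction like the one below produces exactly that kind of command when committed. This is a minimal sketch assuming the Neo4j 4.x embedded Java API (the builder takes a Path in recent 4.x versions and a File in earlier ones; "neo4j" is the 4.x default database name):

import java.nio.file.Path;

import org.neo4j.dbms.api.DatabaseManagementService;
import org.neo4j.dbms.api.DatabaseManagementServiceBuilder;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;

public class CommitExample {
    public static void main(String[] args) {
        DatabaseManagementService managementService =
                new DatabaseManagementServiceBuilder(Path.of("data")).build();
        GraphDatabaseService db = managementService.database("neo4j");

        // Committing this transaction yields a Transaction Representation
        // with a single +Node command, as in the dump above.
        try (Transaction tx = db.beginTx()) {
            tx.createNode();
            tx.commit();
        }

        managementService.shutdown();
    }
}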

On a single instance, this Transaction Representation is then passed on to the Transaction Commit Process, which writes the transaction to the transaction log. This internally calls appendToLog(). After that, the Transaction Representation goes to the Record Storage Engine, which persists the transaction to disk (applyToStore()).

applyToStore() doesn’t necessarily happen together with appendToLog(); rather, it happens during a checkpoint operation or when a dirty page is flushed from the page cache.


Image 2 - Transaction Commit Process


Image 3 - Record Storage Engine
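
To make that ordering concrete, here is a minimal write-ahead-log sketch of the split between the two calls. The class and method bodies are hypothetical and only mirror the article’s terminology, not Neo4j’s internals: the transaction is durable once appendToLog() returns, while applyToStore() may be deferred until a checkpoint.

import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only; names mirror the article, not Neo4j internals.
class CommitProcessSketch {
    private final Queue<String> unapplied = new ArrayDeque<>();

    // Called on tx.commit(): durability comes from the log append alone.
    void commit(String transactionRepresentation) {
        appendToLog(transactionRepresentation); // always happens at commit time
        unapplied.add(transactionRepresentation);
    }

    // Called later by a checkpoint (or when dirty pages are flushed).
    void checkpoint() {
        while (!unapplied.isEmpty()) {
            applyToStore(unapplied.poll());
        }
    }

    private void appendToLog(String tx) { /* fsync to the transaction log */ }
    private void applyToStore(String tx) { /* update the record store files */ }
}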

This is the process for a single instance, and it is fairly simple. Naturally, it doesn’t involve any RAFT components.

Causal Cluster

For a Causal Cluster, the work is done on the Leader. Everything in the process is the same, but the Transaction Commit Process is intercepted before the transaction is flushed to the log:


Image 4 - Transaction Commit Process

The Transaction Representation is intercepted by the Replicated Transaction Commit Process, which turns it into a Raft Message (commit()). It is then replicated by a component called the Raft Replicator (replicate()). The replication occurs as follows:

  1. The Leader sends an append request to the Followers saying it has a new message

  2. The Followers append that message to their own RAFT logs and send a response back saying it has been appended

  3. The Leader then receives those responses and sends a commit message saying everything is OK on both sides and it’s safe to commit


Image 5 - Replication
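
As a rough sketch, one replication round can be modelled as below. This is a toy simulation of the three steps above, with illustrative names rather than Neo4j’s internal API:

import java.util.List;

// Toy model of one RAFT replication round; names are illustrative.
class RaftRoundSketch {
    interface Member {
        boolean append(String raftMessage); // step 2: append to log, send response
        void commit(String raftMessage);    // step 3: safe to commit
    }

    // Step 1: the Leader sends the append to every Core, itself included.
    static void replicate(String raftMessage, List<Member> cluster) {
        int acks = 0;
        for (Member member : cluster) {
            if (member.append(raftMessage)) {
                acks++;
            }
        }
        // Only a majority of acknowledgements is required before committing.
        if (acks >= cluster.size() / 2 + 1) {
            cluster.forEach(member -> member.commit(raftMessage));
        }
    }
}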

SOME CONSIDERATIONS

  • In this process, you may see the Leader sending an append request to itself. This is intended behaviour, as the Leader sees itself as a Core instance of the cluster and also needs to append to its own RAFT log.

  • The flow is: APPEND > APPEND RESPONSE > COMMIT > APPEND (…)

  • The cluster only needs a majority of instances to acknowledge the message; see the quorum sketch after this list. For this reason, Followers may receive messages for transactions that have already been committed.

  • We use message pipelining, where a Commit message can also include an Append to allow for faster processing.

  • All messages across the network are considered heartbeats.
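
As referenced above, the majority (quorum) is a simple function of cluster size; a minimal illustrative sketch:

// Majority quorum needed to acknowledge a message (illustrative).
class QuorumSketch {
    static int quorum(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(quorum(3)); // 2: one instance may lag or be down
        System.out.println(quorum(5)); // 3: two instances may lag or be down
    }
}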

After this happens, the Transaction Representation goes into a queue of Transaction Representations managed by the Replicated Transaction State Machine (applyCommand()), which keeps track of the transactions and the order in which they need to be applied to the store.


Image 6 - Replicated Transaction State Machine
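
Conceptually, this amounts to an apply loop that releases transactions strictly in log order. The sketch below is hypothetical and only illustrates the idea, not the actual class:

import java.util.Comparator;
import java.util.PriorityQueue;

// Illustrative only: apply replicated transactions in log-index order.
class ReplicatedTxStateMachineSketch {
    record Entry(long logIndex, String transactionRepresentation) { }

    private final PriorityQueue<Entry> queue =
            new PriorityQueue<>(Comparator.comparingLong(Entry::logIndex));
    private long nextIndexToApply;

    // Entries may be handed over faster than they can be applied;
    // the queue preserves log order (applyCommand() in the article).
    void applyCommand(Entry entry) {
        queue.add(entry);
        // Release entries only when the next expected index is at the head,
        // so the store always sees transactions in order.
        while (!queue.isEmpty() && queue.peek().logIndex() == nextIndexToApply) {
            handOffToCommitProcess(queue.poll().transactionRepresentation());
            nextIndexToApply++;
        }
    }

    // Connects back to the Transaction Commit Process (Image 2).
    private void handOffToCommitProcess(String tx) { /* appendToLog + applyToStore */ }
}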

From there, these Transaction Representations go through the Commit Process, which connects back to the Transaction Commit Process (Image 2) in order to flush to the transaction log and finally apply to store (Image 3).