
When to use bookmarks

Bookmarks are part of a broader topic: Causal consistency. We recommend reading the introduction to Neo4j Causal Clustering and the lifecycle of a Neo4j Causal Cluster before reading further. Pay special attention to Causal Consistency explained.

Bookmarks ensure that, when reading data from the cluster, the data read represents the user’s most recent view of the graph. When using bookmarked transactions, you are effectively saying: "Use a particular instance only when it is able to honour this bookmark" (in other words, after it has processed and applied the bookmarked transaction).
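To make the "honour this bookmark" rule concrete, here is a minimal toy model in Python. It is not the Neo4j driver API: the `Instance` class, the `eligible_for_read` function and the integer transaction ids are all hypothetical, chosen only for illustration (real bookmarks are opaque values handed out by the driver). It only shows which members could answer a bookmarked read at a given moment.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for a cluster member; not a Neo4j driver class."""
    name: str
    last_applied_tx: int  # id of the newest transaction this member has applied

    def can_honour(self, bookmark: int) -> bool:
        # A member may serve a bookmarked read only once it has applied
        # the transaction the bookmark points at.
        return self.last_applied_tx >= bookmark


def eligible_for_read(instances, bookmark=None):
    """Return the names of the members that could serve the read right now."""
    if bookmark is None:
        # No bookmark: any member may answer, possibly with stale data.
        return [i.name for i in instances]
    return [i.name for i in instances if i.can_honour(bookmark)]


cluster = [
    Instance("leader", 42),
    Instance("follower-1", 42),
    Instance("follower-2", 40),
    Instance("replica-1", 37),
]

print(eligible_for_read(cluster))               # all four members
print(eligible_for_read(cluster, bookmark=41))  # ['leader', 'follower-1']
```

In this sketch a lagging member is simply filtered out. As far as we understand it, a real instance that receives a bookmarked transaction instead waits until it has caught up before running the query, which is where the extra read latency comes from.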

Unfortunately, there is no one-size-fits-all rule for when it makes sense to use bookmarks. If we look back at the information about Raft and how transactions are applied on the followers (links above), we know this:

  • Data becomes ready on the instances in the following order, from fastest to slowest:

    1. Leader

    2. Majority of followers

    3. Rest of followers and read replicas

  • The leader has all the transactions and is always the most up-to-date instance

  • A majority of followers have the transactions due to Raft’s nature (but may not have applied them yet)

  • The rest of the followers (and the read replicas) will receive the transactions at a later point in time
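The difference between having a transaction and having applied it can be sketched with another small toy model. Everything here is an assumption made for illustration: the `Member` class, the `appended` and `applied` counters and the integer transaction ids are hypothetical and greatly simplify what Raft and the catch-up protocol actually do.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical cluster member used only for this illustration."""
    name: str
    role: str      # "leader", "follower" or "read_replica"
    appended: int  # newest transaction id present in this member's log
    applied: int   # newest transaction id applied to this member's store


def is_committed(members, tx_id):
    """A transaction counts as committed once a majority of core members
    (leader plus followers) have it in their log."""
    core = [m for m in members if m.role != "read_replica"]
    have_it = sum(1 for m in core if m.appended >= tx_id)
    return have_it > len(core) // 2


def visible_on(members, tx_id):
    """Members whose reads would already reflect the transaction."""
    return [m.name for m in members if m.applied >= tx_id]


members = [
    Member("core-1", "leader", appended=10, applied=10),
    Member("core-2", "follower", appended=10, applied=9),
    Member("core-3", "follower", appended=9, applied=9),
    Member("replica-1", "read_replica", appended=8, applied=8),
]

print(is_committed(members, 10))  # True: core-1 and core-2 hold tx 10
print(visible_on(members, 10))    # ['core-1']
print(visible_on(members, 9))     # ['core-1', 'core-2', 'core-3']
```

In this state transaction 10 is committed, yet only the leader can show it to a reader. A read bookmarked at transaction 10 that lands on any other member has to wait for that member to catch up.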

With this information you can make design choices such as:

  • Use bookmarks for queries where you absolutely need to read your own writes, and:

    1. Direct latency-sensitive read queries to the leader [1], using bolt for a direct connection instead of bolt+routing.

    2. Use bolt+routing for other queries that are not as latency sensitive (these queries will be routed to the followers).

  • Do not use bookmarks for other queries that do not require the most up-to-date view of the graph (these queries will be routed to a random follower or read replica).

[1] Be careful when deciding to direct read transactions to the leader. Avoid stressing the leader to the point where it cannot serve more requests. You can read more about this topic here.

This is only an example, but all of it is achievable with a mix of direct and routed connections and bookmarks. Remember that bookmarks exist at the transaction level, which means you can tune their use to achieve the throughput and experience you need. You might have several clients with different consistency and data readiness requirements, and adjust bookmark usage on a per-client basis.
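The design choices described above can be written down as a small decision function. This is a sketch, not driver code: `ReadPlan`, `plan_read`, the `leader_has_headroom` flag and the client names are hypothetical, and how you measure leader load is left open. The function only records which URI scheme to connect with and whether to pass a bookmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPlan:
    scheme: str         # "bolt" = direct connection, "bolt+routing" = routed
    target: str         # which members are expected to serve the read
    use_bookmark: bool  # whether to pass the bookmark of the last write


def plan_read(needs_own_writes: bool, latency_sensitive: bool,
              leader_has_headroom: bool = True) -> ReadPlan:
    if not needs_own_writes:
        # Stale reads are acceptable: no bookmark, let routing pick a member.
        return ReadPlan("bolt+routing", "any follower or read replica", False)
    if latency_sensitive and leader_has_headroom:
        # The leader is always up to date, so the bookmark adds no waiting.
        return ReadPlan("bolt", "leader", True)
    # Read-your-own-writes without tight latency needs (or a busy leader):
    # routed read that waits until the chosen member honours the bookmark.
    return ReadPlan("bolt+routing", "followers", True)


# Hypothetical per-client requirements.
CLIENT_PROFILES = {
    "checkout": dict(needs_own_writes=True, latency_sensitive=True),
    "order-history": dict(needs_own_writes=True, latency_sensitive=False),
    "reporting": dict(needs_own_writes=False, latency_sensitive=False),
}

for client, needs in CLIENT_PROFILES.items():
    print(client, plan_read(**needs))
```

The `leader_has_headroom` fallback reflects the caution about overloading the leader: when it is set to `False`, latency-sensitive reads fall back to routed, bookmarked reads and accept the wait.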