Coordinate parallel transactions
When working with a Neo4j cluster, the driver automatically enforces causal consistency for transactions within the same session, which guarantees that a query is able to read changes made by previous queries. The same does not happen by default for multiple transactions running in parallel. In that case, you can use bookmarks to have one transaction wait for the result of another to be propagated across the cluster before running its own work. This is not a requirement, and you should only use bookmarks if you need causal consistency across different transactions.
A bookmark is a token that represents some state of the database. When you pass one or more bookmarks along with a query, the server makes sure that the query does not get executed before the represented state(s) have been established.
Bookmarks within a single session
Bookmark management happens automatically for queries run within a single session, so that you can trust that queries inside one session are causally chained.
with driver.session() as session:
    session.execute_write(lambda tx: tx.run("<QUERY 1>"))
    session.execute_write(lambda tx: tx.run("<QUERY 2>"))  # can read QUERY 1
    session.execute_write(lambda tx: tx.run("<QUERY 3>"))  # can read QUERY 1,2
...
Bookmarks across multiple sessions
If your application uses multiple sessions instead, you may need to ensure that one session has completed all its transactions before another session is allowed to run its queries.
In those cases, you can collect the bookmarks from some sessions using the method Session.last_bookmarks(), store them into a Bookmarks object, and use them to initialize another session with the bookmarks parameter.
In the example below, session_a and session_b are allowed to run concurrently, while session_c waits until their results have been propagated. This guarantees the Person nodes session_c wants to act on actually exist.
from neo4j import GraphDatabase, Bookmarks


URI = "<URI for Neo4j database>"
AUTH = ("<Username>", "<Password>")


def main():
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        create_some_friends(driver)


def create_some_friends(driver):
    saved_bookmarks = Bookmarks()  # To collect the sessions' bookmarks

    # Create the first person and employment relationship.
    with driver.session() as session_a:
        session_a.execute_write(create_person, "Alice")
        session_a.execute_write(employ, "Alice", "Wayne Enterprises")
        saved_bookmarks += session_a.last_bookmarks()

    # Create the second person and employment relationship.
    with driver.session() as session_b:
        session_b.execute_write(create_person, "Bob")
        session_b.execute_write(employ, "Bob", "LexCorp")
        saved_bookmarks += session_b.last_bookmarks()

    # Create a friendship between the two people created above.
    with driver.session(bookmarks=saved_bookmarks) as session_c:
        session_c.execute_write(create_friendship, "Alice", "Bob")
        session_c.execute_read(print_friendships)


# Create a person node.
def create_person(tx, name):
    tx.run("CREATE (:Person {name: $name})", name=name)


# Create an employment relationship to a pre-existing company node.
# This relies on the person first having been created.
def employ(tx, person_name, company_name):
    tx.run("MATCH (person:Person {name: $person_name}) "
           "MATCH (company:Company {name: $company_name}) "
           "CREATE (person)-[:WORKS_FOR]->(company)",
           person_name=person_name, company_name=company_name)


# Create a friendship between two people.
def create_friendship(tx, name_a, name_b):
    tx.run("MATCH (a:Person {name: $name_a}) "
           "MATCH (b:Person {name: $name_b}) "
           "MERGE (a)-[:KNOWS]->(b)",
           name_a=name_a, name_b=name_b)


# Retrieve and display all friendships.
def print_friendships(tx):
    result = tx.run("MATCH (a)-[:KNOWS]->(b) RETURN a.name, b.name")
    for record in result:
        print("{} knows {}".format(record["a.name"], record["b.name"]))


if __name__ == "__main__":
    main()
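The example above opens session_a and session_b one after the other. If you want them to actually run in parallel, one possible approach is to give each piece of work its own session in its own thread, collect the bookmarks each session returns, and combine them before opening the dependent session. The following is only a sketch under assumptions: the helper names run_in_own_session and run_concurrently_then_chain are hypothetical (they are not part of the driver), the jobs list is assumed to be non-empty, and a thread pool is just one of several ways to run the sessions concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


# Hypothetical helper, not part of the driver API.
def run_in_own_session(driver, work, *args):
    """Run `work` in a dedicated session and return that session's bookmarks."""
    # Sessions are not thread-safe, so each worker thread opens its own.
    with driver.session() as session:
        session.execute_write(work, *args)
        return session.last_bookmarks()


# Hypothetical helper, not part of the driver API.
def run_concurrently_then_chain(driver, jobs, final_work, *final_args):
    """Run `jobs` in parallel, then run `final_work` once their results are visible.

    `jobs` is a non-empty list of (transaction_function, args_tuple) pairs.
    """
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(run_in_own_session, driver, work, *args)
                   for work, args in jobs]
        # result() re-raises any exception from the worker thread.
        collected = [future.result() for future in futures]

    # Bookmarks objects can be combined with `+`, as in the example above.
    saved_bookmarks = reduce(operator.add, collected)

    # The dependent session waits for all collected bookmarks.
    with driver.session(bookmarks=saved_bookmarks) as session:
        return session.execute_write(final_work, *final_args)


# Possible usage with the functions from the example above:
# run_concurrently_then_chain(
#     driver,
#     [(create_person, ("Alice",)), (create_person, ("Bob",))],
#     create_friendship, "Alice", "Bob",
# )
```

Each worker returns its bookmarks instead of appending to a shared object, so no locking is needed around the collection step.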
The use of bookmarks can negatively impact performance, since all queries are forced to wait for the latest changes to be propagated across the cluster. For simple use-cases, try to group queries within a single transaction, or within a single session.
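As an illustration of grouping queries within a single transaction, the two steps that the earlier example runs as separate transactions (creating a person, then linking it to a company) could be combined into one transaction function. This is a minimal sketch, not the only way to structure it: the names create_person_and_employ and create_employee are hypothetical, and the Cypher queries are the ones used in the example above.

```python
# Hypothetical transaction function combining two steps from the example above.
def create_person_and_employ(tx, person_name, company_name):
    """Create a person and link it to a pre-existing company in one transaction."""
    tx.run("CREATE (:Person {name: $name})", name=person_name)
    # Runs in the same transaction, so it sees the node created just above.
    tx.run(
        "MATCH (person:Person {name: $person_name}) "
        "MATCH (company:Company {name: $company_name}) "
        "CREATE (person)-[:WORKS_FOR]->(company)",
        person_name=person_name, company_name=company_name,
    )


# Hypothetical wrapper: one session, one transaction, no explicit bookmarks.
def create_employee(driver, person_name, company_name):
    with driver.session() as session:
        session.execute_write(
            create_person_and_employ, person_name, company_name
        )
```

Because both queries share a transaction, the second one does not depend on any bookmark to see the result of the first.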
Glossary
- LTS: A Long Term Support release is one guaranteed to be supported for a number of years. Neo4j 4.4 is LTS, and Neo4j 5 will also have an LTS version.
- Aura: Aura is Neo4j’s fully managed cloud service. It comes with both free and paid plans.
- Driver: A Driver object holds the details required to establish connections with a Neo4j database. Every Neo4j-backed application requires a Driver object.
- Cypher: Cypher is Neo4j’s graph query language that lets you retrieve data from the graph. It is like SQL, but for graphs.
- APOC: Awesome Procedures On Cypher (APOC) is a library of (many) functions that can not be easily expressed in Cypher itself.
- Bolt: Bolt is the protocol used for interaction between Neo4j instances and drivers. It listens on port 7687 by default.
- ACID: Atomicity, Consistency, Isolation, Durability (ACID) are properties guaranteeing that database transactions are processed reliably. An ACID-compliant DBMS ensures that the data in the database remains accurate and consistent despite failures.
- eventual consistency: A database is eventually consistent if it provides the guarantee that all cluster members will, at some point in time, store the latest version of the data.
- causal consistency: A database is causally consistent if read and write queries are seen by every member of the cluster in the same order. This is stronger than eventual consistency.
- null: The null marker is not a type but a placeholder for absence of value. For more information, see Cypher Manual, Working with null.
- transaction: A transaction is a unit of work that is either committed in its entirety or rolled back on failure. An example is a bank transfer: it involves multiple steps, but they must all succeed or be reverted, to avoid money being subtracted from one account but not added to the other.