Using with Neo4j Causal Cluster
|The Neo4j Streams Plugin running inside the Neo4j database is deprecated and will not be supported after version 4.3 of Neo4j. We recommend users not to adopt this plugin for new implementations, and to consider migrating to the use of the Kafka Connect Neo4j Connector as a replacement|
Neo4j Clustering is a feature available in Enterprise Edition which allows high availability of the database through having multiple database members.
Neo4j Enterprise uses a LEADER/FOLLOWER operational view, where writes are always processed by the leader, while reads can be serviced by either followers, or optionally be read replicas, which maintain a copy of the database and serve to scale out read operations horizontally.
When using Neo4j Streams in this method with a Neo4j cluster, the most important consideration is to use
a routing driver. Generally, these are identified by a URI with
neo4j:// as the scheme instead of
bolt:// as the scheme.
If your Neo4j cluster is located at
graph.mycompany.com:7687, simply configure the Kafka Connect worker with
The use of the
neo4j driver will mean that the Neo4j Driver itself will handle connecting to
the correct cluster member, and managing changes to the cluster membership over time.
For further information on routing drivers, see the Neo4j Driver Manual.
When using Neo4j Streams as a plugin together with a cluster, there are several things to keep in mind:
The plugin must be present in the plugins directory of all cluster members, and not just one.
The configuration settings must be present in:
all neo4j.conf files for versions prior of
4.0.7, and not just one;
all streams.conf files for versions since
4.0.7, and not just one.
Through the course of the cluster lifecycle, the leader may change; for this reason the plugin must be everywhere, and not just on the leader.
The plugin detects the leader, and will not attempt to perform a write (i.e. in the case of the consumer) on a follower where it would fail. The plugin checks cluster toplogy as needed.
Additionally for CDC, a consideration to keep in mind is that as of Neo4j 3.5, committed transactions are only published on the leader as well. In practical terms, this means that as new data is committed to Neo4j, it is the leader that will be publishing that data back out to Kafka, if you have the producer configured.
The neo4j streams utility procedures, in particular
CALL streams.publish, can work on any cluster member, or
CALL streams.consume may also be used on any cluster member, however it is important to keep in
mind that due to the way clustering in Neo4j works, using
streams.consume together with write operations will
not work on a cluster follower or read replica, as only the leader can process writes.
Sometimes there will be remote applications that talk to Neo4j via official drivers, that want to use streams functionality. Best practices in these cases are:
Always use a
neo4j://driver URI when communicating with the cluster in the client application.
Use Explicit Write Transactions in your client application when using procedure calls such as
CALL streams.consumeto ensure that the routing driver routes the query to the leader.
Was this page helpful?