B.1. Set up a local Causal Cluster

This tutorial walks through the basics of setting up a Neo4j Causal Cluster. The result is a local cluster of six instances: three Cores and three Read Replicas.

In this section we will learn how to deploy a Causal Cluster locally, on a single machine. This is useful as a learning exercise and as a quick way to get started developing an application against a Neo4j Causal Cluster. A cluster on a single machine has no fault tolerance, however, and is not suitable for production use.

We will begin by configuring and starting a cluster of three Core instances. This is the minimal deployment of a fault-tolerant Causal Cluster. The Core instances are responsible for keeping the data safe.

Once the Core of the cluster is operational, we will add three Read Replicas. The Read Replicas are responsible for scaling out the read capacity of the cluster.

The core of the Causal Cluster remains stable over time. The roles within the core will change as needed but the core itself is long-lived and stable. At the edge of the cluster, the Read Replicas are cheap and disposable. They can be added as needed to increase the operational capacity of the cluster as a whole.

B.1.1. Download and configure

Some of the configuration for the instances in the cluster will be identical, so a convenient way to go about the configuration is as follows (a shell sketch illustrating these steps follows the list):

  1. Create a local working directory.
  2. Download a copy of Neo4j Enterprise Edition from the Neo4j download site.
  3. Unpack Neo4j in the working directory.
  4. Make a copy of the neo4j-enterprise-3.3.5 directory and name it core-01/ or similar. Keep the original directory to use when setting up the Read Replicas later. The core-01/ directory will contain the first Core instance.
  5. Complete the configuration (see below) for the first Core instance. Then make two copies of the core-01/ directory and name them core-02/ and core-03/.
  6. Proceed with the configuration of the two copied instances. Those changes that are common to all three Core instances are now already in place for Cores number two and three.
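
For example, on a Unix system the steps above might look like the following sketch. The archive name and download location are assumptions; they will depend on the exact version and package you fetched:

causal-cluster$ tar xf ~/Downloads/neo4j-enterprise-3.3.5-unix.tar.gz
causal-cluster$ cp -R neo4j-enterprise-3.3.5 core-01
causal-cluster$ # edit core-01/conf/neo4j.conf as described below, then:
causal-cluster$ cp -R core-01 core-02
causal-cluster$ cp -R core-01 core-03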

In this example we are running all instances in the cluster on a single machine. Many of the default configuration settings work well out of the box in a production deployment across multiple machines, but some of them have to be changed when deploying multiple instances on a single machine, so that the instances do not try to use the same network ports. We call out the settings that are specific to this scenario as we go along.

B.1.2. Configure the Core instances

All configuration that we will do takes place in the Neo4j configuration file, conf/neo4j.conf. If you used a different package than in the download instructions above, see Section 3.1, “File locations” to locate the configuration file. Look in the configuration file for a section labeled "Causal Clustering Configuration".

B.1.2.1. Minimum configuration

The minimum configuration for a Core instance requires setting the following:

dbms.mode
The operating mode of this instance. Uncomment this setting and give it the value CORE.
causal_clustering.expected_core_cluster_size
The number of Core instances in the cluster. Uncomment this setting and give it the value 3.
causal_clustering.initial_discovery_members
The network addresses of Core cluster members to be used to discover the cluster when this instance joins. Uncomment this setting and give it the value localhost:5000,localhost:5001,localhost:5002.
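
With these settings in place, the Causal Clustering section of core-01/conf/neo4j.conf contains lines like the following. This is a sketch showing only the settings named above; the identical lines also appear in core-02 and core-03:

dbms.mode=CORE
causal_clustering.expected_core_cluster_size=3
causal_clustering.initial_discovery_members=localhost:5000,localhost:5001,localhost:5002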

B.1.2.2. Additional configuration

Since we are setting up the Causal Cluster to run on a single machine, we must do some additional configuration that is not necessary when the instances are running on different servers. We configure the following:

causal_clustering.discovery_listen_address
The port used for discovery between machines. Uncomment this setting and give it the value :5000. On the other two instances, give them the values :5001 and :5002, respectively.
causal_clustering.transaction_listen_address
The internal transaction communication address. Uncomment this setting and give it the value :6000. On the other two instances, give them the values :6001 and :6002, respectively.
causal_clustering.raft_listen_address
The internal consensus mechanism address. Uncomment this setting and give it the value :7000. On the other two instances, give them the values :7001 and :7002, respectively.
dbms.connector.bolt.listen_address
The Bolt connector address. Uncomment this setting and give it the value :7687. On the other two instances, give them the values :7688 and :7689, respectively.
dbms.connector.http.listen_address
The HTTP connector address. Uncomment this setting and give it the value :7474. On the other two instances, give them the values :7475 and :7476, respectively.
dbms.connector.https.listen_address
The HTTPS connector address. Uncomment this setting and give it the value :6474. On the other two instances, give them the values :6475 and :6476, respectively.
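
Taken together, the single-machine additions for the first Core instance look like this in core-01/conf/neo4j.conf. As described above, core-02 and core-03 use the next port in each series:

causal_clustering.discovery_listen_address=:5000
causal_clustering.transaction_listen_address=:6000
causal_clustering.raft_listen_address=:7000
dbms.connector.bolt.listen_address=:7687
dbms.connector.http.listen_address=:7474
dbms.connector.https.listen_address=:6474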

B.1.3. Start the Neo4j servers

Start each Neo4j instance as usual. The startup order does not matter.

core-01$ ./bin/neo4j start
core-02$ ./bin/neo4j start
core-03$ ./bin/neo4j start
Startup Time

If you want to follow along with the startup of a server, you can follow the messages in logs/neo4j.log. On a Unix system, issue the command tail -f logs/neo4j.log. On Windows, run Get-Content .\logs\neo4j.log -Tail 10 -Wait in PowerShell. While an instance is joining the cluster, the server may appear unavailable. If an instance is joining a cluster that already holds a lot of data, it may take several minutes for the new instance to download the data from the cluster and become available.

B.1.4. Check the status of the cluster

Now the minimal cluster of three Core Servers is operational and ready to serve requests. Connect to any of the three instances to check the cluster status: point your web browser to http://localhost:7474. Authenticate with the default neo4j/neo4j and set a new password. These credentials are not shared between cluster members; a new password must be set on each instance when connecting to it for the first time. For production deployments we advise integrating the Neo4j cluster with your directory service. See Section 7.1.5, “Integration with LDAP” for more details.

Once you have authenticated you can check the status of the cluster by running the query: CALL dbms.cluster.overview(). The output will look similar to the following.

Table B.1. Cluster overview with dbms.cluster.overview()
id                                   | addresses                                                               | role
-------------------------------------+-------------------------------------------------------------------------+---------
08eb9305-53b9-4394-9237-0f0d63bb05d5 | [bolt://localhost:7687, http://localhost:7474, https://localhost:6474] | LEADER
cb0c729d-233c-452f-8f06-f2553e08f149 | [bolt://localhost:7688, http://localhost:7475, https://localhost:6475] | FOLLOWER
ded9eed2-dd3a-4574-bc08-6a569f91ec5c | [bolt://localhost:7689, http://localhost:7476, https://localhost:6476] | FOLLOWER

The three Core instances in the cluster are operational.

B.1.5. Test the cluster

Now you can run queries to create nodes and relationships, and see that the data gets replicated in the cluster.

When developing an application against a Neo4j Causal Cluster it is not necessary to know about the roles of the cluster members. The Neo4j Bolt driver creates sessions with access mode READ or WRITE on request. It is the driver’s responsibility to identify the best cluster member to service the session according to need.

When connecting directly with Neo4j Browser, however, we need to be more aware of the roles that the cluster members have. It is easy to navigate from member to member by running the :sysinfo command. The sysinfo view contains information about the Neo4j instance. If the instance is a member of a Causal Cluster, this view contains a table, Causal Clustering Cluster Members, with the same information as provided by the dbms.cluster.overview() procedure, but here you can also take action on the other members of the cluster. Run the :sysinfo command and click the Open action on the instance that has the LEADER role. This opens a new Browser session against the Leader of the cluster.

Authenticate and set a new password, as before. Now you can run a query to create nodes and relationships.

// Create Person nodes with ids 0..100 and link each to a randomly chosen friend
UNWIND range(0, 100) AS value
MERGE (person1:Person {id: value})
MERGE (person2:Person {id: toInteger(100.0 * rand())})
MERGE (person1)-[:FRIENDS]->(person2)

When the query has executed choose an instance with the FOLLOWER role from the sysinfo view. Click the Open action to connect. Now you can run a query to see that the data has been replicated.

MATCH path = (person:Person)-[:FRIENDS]-(friend)
RETURN path
LIMIT 10

B.1.6. Configure the Read Replicas

Setting up the Read Replicas is similar to setting up the Cores, but simpler.

  1. In your working directory, rename the neo4j-enterprise-3.3.5 directory to replica-01/ or similar. The replica-01/ directory will contain the first Read Replica.
  2. Complete the configuration (see below) for the first Read Replica instance. Then make two copies of the replica-01/ directory and name them replica-02/ and replica-03/.
  3. Proceed with the configuration of the two copied instances. Those changes that are common to all three Read Replicas are now already in place for replicas number two and three. The shell sketch below illustrates these steps.
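
As before, on a Unix system these steps might look like the following sketch:

causal-cluster$ mv neo4j-enterprise-3.3.5 replica-01
causal-cluster$ # edit replica-01/conf/neo4j.conf as described below, then:
causal-cluster$ cp -R replica-01 replica-02
causal-cluster$ cp -R replica-01 replica-03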

Read Replica instances do not participate in quorum decisions, so their configuration is simpler than that of the Cores. All that a Read Replica needs to know is the addresses of Core Servers to which it can bind in order to discover the cluster. See Section 4.2.2.1, “Discovery protocol” for details. Once it has completed the initial discovery, the Read Replica becomes aware of the currently available Core Servers and can choose an appropriate one from which to catch up. See Section 4.2.2.5, “Catchup protocol” for details.

B.1.6.1. Minimum configuration

The minimum configuration for a Read Replica instance requires setting the following:

dbms.mode
The operating mode of this instance. Uncomment this setting and give it the value READ_REPLICA.
causal_clustering.initial_discovery_members
The network addresses of Core cluster members to be used to discover the cluster when this instance joins. Uncomment this setting and give it the value localhost:5000,localhost:5001,localhost:5002.
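
The minimum configuration in replica-01/conf/neo4j.conf thus contains lines like the following. This is a sketch showing only the settings named above; the same lines also appear in replica-02 and replica-03:

dbms.mode=READ_REPLICA
causal_clustering.initial_discovery_members=localhost:5000,localhost:5001,localhost:5002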

B.1.6.2. Additional configuration

As with the Core instances, we have to do some additional configuration to enable all the Read Replicas to run on the same machine. We configure the following:

causal_clustering.transaction_listen_address
The internal transaction communication address. Uncomment this setting and give it the value :6003. On the other two instances, give them the values :6004 and :6005, respectively.
dbms.connector.bolt.listen_address
The Bolt connector address. Uncomment this setting and give it the value :7690. On the other two instances, give them the values :7691 and :7692, respectively.
dbms.connector.http.listen_address
The HTTP connector address. Uncomment this setting and give it the value :7477. On the other two instances, give them the values :7478 and :7479, respectively.
dbms.connector.https.listen_address
The HTTPS connector address. Uncomment this setting and give it the value :6477. On the other two instances, give them the values :6478 and :6479, respectively.
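
For the first Read Replica, the single-machine additions in replica-01/conf/neo4j.conf look like this. As described above, replica-02 and replica-03 use the next port in each series:

causal_clustering.transaction_listen_address=:6003
dbms.connector.bolt.listen_address=:7690
dbms.connector.http.listen_address=:7477
dbms.connector.https.listen_address=:6477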

B.1.7. Test the cluster with Read Replicas

Connect to any of the instances and run CALL dbms.cluster.overview() to see the new overview of the cluster. With the Read Replicas added, the overview will look similar to the following:

Table B.2. Cluster overview with dbms.cluster.overview()
id                                   | addresses                                                               | role
-------------------------------------+-------------------------------------------------------------------------+-------------
08eb9305-53b9-4394-9237-0f0d63bb05d5 | [bolt://localhost:7687, http://localhost:7474, https://localhost:6474] | LEADER
cb0c729d-233c-452f-8f06-f2553e08f149 | [bolt://localhost:7688, http://localhost:7475, https://localhost:6475] | FOLLOWER
ded9eed2-dd3a-4574-bc08-6a569f91ec5c | [bolt://localhost:7689, http://localhost:7476, https://localhost:6476] | FOLLOWER
00000000-0000-0000-0000-000000000000 | [bolt://localhost:7690, http://localhost:7477, https://localhost:6477] | READ_REPLICA
00000000-0000-0000-0000-000000000000 | [bolt://localhost:7691, http://localhost:7478, https://localhost:6478] | READ_REPLICA
00000000-0000-0000-0000-000000000000 | [bolt://localhost:7692, http://localhost:7479, https://localhost:6479] | READ_REPLICA

To test that the Read Replicas have successfully caught up with the cluster, use :sysinfo and click the Open action to connect to one of the Read Replicas. Issue the same query as before:

MATCH path = (person:Person)-[:FRIENDS]-(friend)
RETURN path
LIMIT 10