B.1. Set up a local Causal Cluster

This tutorial walks through the basics of setting up a Neo4j Causal Cluster. The result is a local cluster of six instances: three Cores and three Read Replicas.

Introduction

In this tutorial we will learn how to deploy a Causal Cluster locally, on a single machine. This is a useful learning exercise, and will get you started quickly with developing an application against a Neo4j Causal Cluster. Please note that in practice, a cluster on a single machine has no fault tolerance and is therefore not suitable for production use.

We will begin by configuring and starting a cluster of three Core instances. This is the minimal deployment for a fault-tolerant Causal Cluster (for a discussion on the number of servers required for a Causal Cluster, see Section 5.1.2.1, “Core Servers”). The Core instances are responsible for keeping the data safe.

After the Core of the cluster is operational we will add three Read Replicas. The Read Replicas are responsible for scaling the capacity of the cluster.

The Core of the Causal Cluster is intended to remain stable over time. The roles within the Core will change as needed, but the Core itself is long-lived and stable. Read Replicas live at the edge of the cluster and can be brought up and taken down without affecting the Core. They can be added as needed to increase the operational capacity of the cluster as a whole.

In this tutorial we will be running all instances in the cluster on a single machine. Many of the default configuration settings work well out of the box in a production deployment, with multiple machines. Some of these we have to change when deploying multiple instances on a single machine, so that instances do not try to use the same network ports. We call out the settings that are specific to this scenario as we go along.

Download Neo4j and configure the Core instances

  1. Create a local working directory.
  2. Download a copy of Neo4j Enterprise Edition from the Neo4j download site.
  3. Unpack Neo4j in the working directory.
  4. Make a copy of the neo4j-enterprise-3.5.5 directory, and name it core-01. We will need to keep the original directory to use when setting up the Read Replicas later. The core-01 directory will contain the first Core instance.
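
Steps 1, 3 and 4 above can be sketched in shell as follows. This is a minimal sketch that assumes the archive has already been downloaded. The function name setup_core01 and both of its arguments are hypothetical and are not part of Neo4j.

```shell
# setup_core01 TARBALL WORKDIR   (hypothetical helper, not part of Neo4j)
# Unpacks the Neo4j archive into WORKDIR and copies the unpacked
# neo4j-enterprise-3.5.5 directory to core-01. The original directory is
# kept so that it can be reused for the Read Replicas later.
setup_core01() {
  tarball=$1
  workdir=$2
  mkdir -p "$workdir" || return 1
  tar -xzf "$tarball" -C "$workdir" || return 1
  cp -R "$workdir/neo4j-enterprise-3.5.5" "$workdir/core-01"
}
```

The function would be called with the path of the downloaded archive and the path of the working directory. Both depend on your system, so no concrete values are assumed here.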

Configure the first Core instance

All configuration that we will do takes place in the Neo4j configuration file, conf/neo4j.conf. If you used a different package than in the download instructions above, see Section 4.2, “File locations” to locate the configuration file.

The first settings we will change represent the minimum configuration for a Core instance:

  1. Locate and uncomment the setting dbms.mode=CORE.
  2. Locate and uncomment the setting causal_clustering.minimum_core_cluster_size_at_formation=3.
  3. Locate and uncomment the setting causal_clustering.minimum_core_cluster_size_at_runtime=3.
  4. Locate and uncomment the setting causal_clustering.initial_discovery_members=localhost:5000,localhost:5001,localhost:5002.

Since we are setting up the Causal Cluster to run on a single machine, we must do some additional configuration. Please note that these steps would not be necessary if the instances were running on different servers.

  1. Locate and uncomment the setting causal_clustering.discovery_listen_address=:5000.
  2. Locate and uncomment the setting causal_clustering.transaction_listen_address=:6000.
  3. Locate and uncomment the setting causal_clustering.raft_listen_address=:7000.
  4. Locate and uncomment the setting dbms.connector.bolt.listen_address=:7687.
  5. Locate and uncomment the setting dbms.connector.http.listen_address=:7474.
  6. Locate the dbms.connector.https.listen_address setting and change the value to :6474.
  7. Locate the dbms.backup.address setting and change the value to :6362.
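
The edits to the core-01 configuration listed above can also be applied from a script. The following is a minimal sketch rather than an official tool: set_conf and configure_core01 are hypothetical helper names, and the sketch assumes that each setting appears in neo4j.conf at most once, either commented out (#key=value) or active (key=value).

```shell
# set_conf KEY VALUE FILE   (hypothetical helper)
# Replaces the first line of FILE that sets KEY, whether or not it is
# commented out, with KEY=VALUE. Appends KEY=VALUE if no such line exists.
set_conf() {
  awk -v k="$1" -v v="$2" '
    {
      line = $0
      sub(/^#[ \t]*/, "", line)            # ignore a leading comment marker
      if (!done && index(line, k "=") == 1) {
        print k "=" v                      # uncomment and set the value
        done = 1
      } else {
        print $0                           # leave every other line untouched
      }
    }
    END { if (!done) print k "=" v }       # setting was absent: append it
  ' "$3" > "$3.tmp" && mv "$3.tmp" "$3"
}

# Applies the core-01 values from the two lists above to the given file.
configure_core01() {
  conf=$1
  set_conf dbms.mode CORE "$conf"
  set_conf causal_clustering.minimum_core_cluster_size_at_formation 3 "$conf"
  set_conf causal_clustering.minimum_core_cluster_size_at_runtime 3 "$conf"
  set_conf causal_clustering.initial_discovery_members \
    localhost:5000,localhost:5001,localhost:5002 "$conf"
  set_conf causal_clustering.discovery_listen_address :5000 "$conf"
  set_conf causal_clustering.transaction_listen_address :6000 "$conf"
  set_conf causal_clustering.raft_listen_address :7000 "$conf"
  set_conf dbms.connector.bolt.listen_address :7687 "$conf"
  set_conf dbms.connector.http.listen_address :7474 "$conf"
  set_conf dbms.connector.https.listen_address :6474 "$conf"
  set_conf dbms.backup.address :6362 "$conf"
}
```

Run from the working directory, the call would be configure_core01 core-01/conf/neo4j.conf. It is worth opening the file afterwards to confirm that the result matches the lists above.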

Configure the second Core instance

We can now create the second Core instance.

As mentioned above, we also need to amend some of the additional values in the neo4j.conf file so that our cluster can run on a single machine:

  1. Make a copy of the core-01 directory and rename it core-02.
  2. Open the neo4j.conf file in the new core-02 directory.

    1. Locate the causal_clustering.discovery_listen_address setting and change the value to :5001.
    2. Locate the causal_clustering.transaction_listen_address setting and change the value to :6001.
    3. Locate the causal_clustering.raft_listen_address setting and change the value to :7001.
    4. Locate the dbms.connector.bolt.listen_address setting and change the value to :7688.
    5. Locate the dbms.connector.http.listen_address setting and change the value to :7475.
    6. Locate the dbms.connector.https.listen_address setting and change the value to :6475.
    7. Locate the dbms.backup.address setting and change the value to :6363.

Configure the third Core instance

We can now create the third and final Core instance.

Again, we also need to amend some of the additional values in the neo4j.conf file so that our cluster can run on a single machine:

  1. Make a copy of the core-02 directory and rename it core-03.
  2. Open the neo4j.conf file in the new core-03 directory.

    1. Locate the causal_clustering.discovery_listen_address setting and change the value to :5002.
    2. Locate the causal_clustering.transaction_listen_address setting and change the value to :6002.
    3. Locate the causal_clustering.raft_listen_address setting and change the value to :7002.
    4. Locate the dbms.connector.bolt.listen_address setting and change the value to :7689.
    5. Locate the dbms.connector.http.listen_address setting and change the value to :7476.
    6. Locate the dbms.connector.https.listen_address setting and change the value to :6476.
    7. Locate the dbms.backup.address setting and change the value to :6364.
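
The per-instance values in the three Core sections follow a simple pattern: every port is the core-01 value plus the position of the instance in the cluster. The sketch below uses a hypothetical function name, core_settings, and prints the single-machine settings for each Core so that they can be compared with the files edited above. It only prints text and does not modify any file.

```shell
# core_settings N   (hypothetical helper)
# Prints the single-machine address settings for the Core with zero-based
# index N: 0 for core-01, 1 for core-02, 2 for core-03.
core_settings() {
  n=$1
  echo "causal_clustering.discovery_listen_address=:$((5000 + n))"
  echo "causal_clustering.transaction_listen_address=:$((6000 + n))"
  echo "causal_clustering.raft_listen_address=:$((7000 + n))"
  echo "dbms.connector.bolt.listen_address=:$((7687 + n))"
  echo "dbms.connector.http.listen_address=:$((7474 + n))"
  echo "dbms.connector.https.listen_address=:$((6474 + n))"
  echo "dbms.backup.address=:$((6362 + n))"
}

# Print one labelled block of settings per Core instance.
for i in 0 1 2; do
  echo "# core-0$((i + 1))"
  core_settings "$i"
done
```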

Start the Core servers

In any order, we can now start each of the Neo4j instances:

core-01$ ./bin/neo4j start
core-02$ ./bin/neo4j start
core-03$ ./bin/neo4j start
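
The three commands above can be wrapped in a loop. This is a sketch only: start_instances is a hypothetical helper name, and it assumes that it is given the instance directories created earlier in this tutorial.

```shell
# start_instances DIR...   (hypothetical helper)
# Runs "./bin/neo4j start" inside each given instance directory and stops
# at the first directory that has no executable bin/neo4j or fails to start.
start_instances() {
  for dir in "$@"; do
    if [ ! -x "$dir/bin/neo4j" ]; then
      echo "no executable bin/neo4j in $dir" >&2
      return 1
    fi
    # Run from inside the instance directory, as in the commands above.
    (cd "$dir" && ./bin/neo4j start) || return 1
  done
}
```

From the working directory, the call would be start_instances core-01 core-02 core-03.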

Startup Time

To follow the startup of a server, watch the messages in logs/neo4j.log:

  • On a Unix system issue the command tail -f logs/neo4j.log.
  • On Windows Server run Get-Content .\logs\neo4j.log -Tail 10 -Wait.

While an instance is joining the cluster, the server may appear unavailable. If an instance is joining a cluster that holds a lot of data, it may take several minutes for the new instance to download the data from the cluster and become available.

Check the status of the cluster

Now the minimal cluster of three Core Servers is operational and is ready to serve requests.

  1. Connect to any of the three Core instances to check the cluster status. For example, for core-01 point your web browser to http://localhost:7474.
  2. Authenticate with the default neo4j/neo4j credentials, and set a new password when prompted. These credentials are not shared between cluster members. A new password must be set on each instance when connecting for the first time.
  3. Check the status of the cluster by running the following in Neo4j Browser:

    :sysinfo

    Example B.1. Example of a Cluster overview, achieved by running :sysinfo

    The Causal Cluster Members table shows the status of instances in the cluster. The table below is an example of a test cluster:

    Roles      Addresses                                                              Actions
    LEADER     bolt://localhost:7687, http://localhost:7474, https://localhost:6474   Open
    FOLLOWER   bolt://localhost:7688, http://localhost:7475, https://localhost:6475   Open
    FOLLOWER   bolt://localhost:7689, http://localhost:7476, https://localhost:6476   Open

    The three Core instances in this cluster are now operational.

    Now you can run queries to create nodes and relationships, and see that the data gets replicated in the cluster.

  4. Click the Open action on the instance that has the LEADER role. This will open a new Neo4j Browser session against the Leader of the cluster.
  5. Authenticate and set a new password, as before.
  6. Run the following query to create nodes and relationships:

    UNWIND range(0, 100) AS value
    MERGE (person1:Person {id: value})
    MERGE (person2:Person {id: toInteger(100.0 * rand())})
    MERGE (person1)-[:FRIENDS]->(person2)
  7. When the query has executed, choose an instance with the FOLLOWER role from the sysinfo view. Click the Open action to connect.
  8. Run the following query to see that the data has been replicated:

    MATCH path = (person:Person)-[:FRIENDS]-(friend)
    RETURN path
    LIMIT 10
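
Replication can also be checked from the command line by asking each Core for its Person count over Bolt. This is a sketch under assumptions: it relies on the cypher-shell tool shipped in the bin directory of each instance being on the PATH, person_count is a hypothetical helper name, and NEO4J_PASSWORD is an assumed environment variable that holds the password set for the instance being queried. Since credentials are not shared between cluster members, querying several instances this way only works if the same password was set on each of them.

```shell
# person_count BOLT_ADDRESS   (hypothetical helper)
# Asks the instance at BOLT_ADDRESS how many Person nodes it holds.
# NEO4J_PASSWORD is an assumed environment variable, not a Neo4j setting.
person_count() {
  cypher-shell -a "$1" -u neo4j -p "$NEO4J_PASSWORD" --format plain \
    "MATCH (p:Person) RETURN count(p) AS people"
}
```

Calling person_count bolt://localhost:7687, then the same for ports 7688 and 7689, should report the same count on all three Cores once the Followers have caught up with the Leader.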

Configure the Read Replicas

Read Replica instances do not participate in quorum decisions, so their configuration is simpler than that of Core Servers: there are fewer settings to amend.

All that a Read Replica needs to know is the addresses of the Core Servers to which it can bind in order to discover the cluster. See Section C.1.2, “Discovery protocol” for the details of how this works. Once it has completed the initial discovery, the Read Replica becomes aware of the currently available Core Servers and can choose an appropriate one from which to catch up. See Section C.1.6, “Catchup protocol” for the details of how this works.

Configure the first Read Replica

  1. In your working directory, make a copy of the neo4j-enterprise-3.5.5 directory and name it replica-01.
  2. Open the neo4j.conf file in the new replica-01 directory. The first settings we will change represent the minimum configuration for a Read Replica:

    1. Locate and uncomment the dbms.mode setting and change the value to READ_REPLICA.
    2. Locate and uncomment the setting causal_clustering.initial_discovery_members=localhost:5000,localhost:5001,localhost:5002.
  3. Since we are setting up the Causal Cluster to run on a single machine, we must do some additional configuration. Please note that the following steps would not be necessary if the instances were running on different servers:

    1. Locate and uncomment the causal_clustering.transaction_listen_address setting and change the value to :6003.
    2. Locate and uncomment the dbms.connector.bolt.listen_address setting and change the value to :7690.
    3. Locate and uncomment the dbms.connector.http.listen_address setting and change the value to :7477.
    4. Locate and uncomment the dbms.connector.https.listen_address setting and change the value to :6477.
    5. Locate and uncomment the dbms.backup.address setting and change the value to :6365.

Configure the second Read Replica

We can now create the second Read Replica.

As mentioned above, we also need to amend some of the additional values in the neo4j.conf file so that our cluster can run on a single machine:

  1. Make a copy of the replica-01 directory and rename it replica-02.
  2. Open the neo4j.conf file in the new replica-02 directory.

    1. Locate the causal_clustering.transaction_listen_address setting and change the value to :6004.
    2. Locate the dbms.connector.bolt.listen_address setting and change the value to :7691.
    3. Locate the dbms.connector.http.listen_address setting and change the value to :7478.
    4. Locate the dbms.connector.https.listen_address setting and change the value to :6478.
    5. Locate the dbms.backup.address setting and change the value to :6366.

Configure the third Read Replica

We can now create the third and final Read Replica.

Again, we also need to amend some of the additional values in the neo4j.conf file so that our cluster can run on a single machine:

  1. Make a copy of the replica-02 directory and rename it replica-03.
  2. Open the neo4j.conf file in the new replica-03 directory.

    1. Locate the causal_clustering.transaction_listen_address setting and change the value to :6005.
    2. Locate the dbms.connector.bolt.listen_address setting and change the value to :7692.
    3. Locate the dbms.connector.http.listen_address setting and change the value to :7479.
    4. Locate the dbms.connector.https.listen_address setting and change the value to :6479.
    5. Locate the dbms.backup.address setting and change the value to :6367.
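
Because every instance runs on the same machine, a port that is configured for two instances will typically stop one of them from starting. One way to catch such a mistake before starting the Read Replicas is to list any port that appears more than once across the configuration files. The sketch below uses a hypothetical helper, find_port_clashes, and only inspects active lines for the seven address settings changed in this tutorial. Settings that are still commented out fall back to their defaults and are not checked.

```shell
# find_port_clashes FILE...   (hypothetical helper)
# Collects the active address settings from the given neo4j.conf files,
# keeps only the port (the text after the last colon), and prints every
# port that occurs more than once. No output means no clash was found.
find_port_clashes() {
  grep -h -E '^(causal_clustering\.(discovery|transaction|raft)_listen_address|dbms\.connector\.(bolt|http|https)\.listen_address|dbms\.backup\.address)=' "$@" |
    sed 's/.*://' |
    sort |
    uniq -d
}
```

From the working directory, the call would be find_port_clashes core-0*/conf/neo4j.conf replica-0*/conf/neo4j.conf.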

Start the Read Replicas

In any order, we can now start the Read Replica instances:

replica-01$ ./bin/neo4j start
replica-02$ ./bin/neo4j start
replica-03$ ./bin/neo4j start

Test the cluster with Read Replicas

To test the status of the cluster now that the Read Replicas are running, we will repeat the steps from earlier, but via a Read Replica:

  1. Connect to any of the three Read Replica instances. For example, for replica-01 point your web browser to http://localhost:7477.
  2. Authenticate with the default neo4j/neo4j credentials. Once again, you will need to set a new password when prompted.
  3. Check the status of the cluster by running the following in Neo4j Browser:

    :sysinfo

    Example B.2. Example of a cluster with both Core instances and Read Replicas, achieved by running :sysinfo

    The following table shows the status of a test cluster which now includes Read Replicas:

    Roles          Addresses                                                              Actions
    LEADER         bolt://localhost:7687, http://localhost:7474, https://localhost:6474   Open
    FOLLOWER       bolt://localhost:7688, http://localhost:7475, https://localhost:6475   Open
    FOLLOWER       bolt://localhost:7689, http://localhost:7476, https://localhost:6476   Open
    READ_REPLICA   bolt://localhost:7690, http://localhost:7477, https://localhost:6477   Open
    READ_REPLICA   bolt://localhost:7691, http://localhost:7478, https://localhost:6478   Open
    READ_REPLICA   bolt://localhost:7692, http://localhost:7479, https://localhost:6479   Open

  4. Click the Open action to connect to any of the Read Replicas.
  5. Run the same query as before:

    MATCH path = (person:Person)-[:FRIENDS]-(friend)
    RETURN path
    LIMIT 10