4.2.4. Seed a Causal Cluster

This section describes how to seed a new Neo4j Causal Cluster with existing data.

Section 4.2.3, “Create a new Causal Cluster” described how to create a cluster with an empty store. However, whether we are just experimenting with Neo4j or setting up a production environment, it is likely that we have existing data that we wish to transfer into our cluster. This section describes three methods for seeding a cluster: from an online backup, from an offline backup, and using the import tool.

Copying and pasting the internal data directory in order to transfer and seed databases is not supported. If you have an existing Neo4j database whose data you wish to use for a new cluster, follow one of the processes described in this section.

This section outlines how to create a Causal Cluster containing data that is either seeded from an existing online or offline Neo4j database, or imported from another data source using the import tool. The general steps to seed a cluster follow the same pattern, regardless of the format of our data:

  1. Create a new Neo4j Core-only cluster
  2. Seed the cluster
  3. Start the cluster

4.2.4.1. Seed from an online backup

For this example, it is assumed that we already have a healthy backup of an existing Neo4j database as a result of an online backup from a running Neo4j instance (for details on online backups, please refer to Chapter 6, Backup). This could be a standalone Neo4j instance, a Neo4j Highly Available cluster, or a running instance of another Neo4j Causal Cluster. The process described here can also be used to seed a new Causal Cluster from an existing Read Replica, or from a Core server which has been unbound from its cluster. This can be useful, for example, in disaster recovery where some servers have retained operability during a catastrophic event.

Create a new Neo4j Core-only cluster
Create a new Neo4j Core-only cluster, as described in Section 4.2.3.1, “Configure a Core-only cluster”. Do not yet start the instances that make up the cluster, as we must first restore the database contents.
Seed the cluster
Use the restore command of neo4j-admin to restore the seeding store from the backed-up database on all the Core instances in the cluster. Since all instances are seeded with the store, the cluster will be fully available right away once the instances are started.
Example 4.2. Restore the backup to seed all Core members

This example assumes that the database name is the default graph.db and that we have a valid backup residing under the seed-dir directory. If you have a different setup, change the command line arguments accordingly.

neo4j-01$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db
neo4j-02$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db
neo4j-03$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db
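
Because the restore must happen before an instance is started, it can be useful to wrap the command in a small guard. The following is a minimal sketch and not part of the Neo4j distribution: the function name seed_core_from_backup and the variables NEO4J_HOME, SEED_DIR and DB_NAME are hypothetical, and the sketch assumes that neo4j status exits with status 0 only while the instance is running.

```shell
#!/bin/sh
# Sketch: restore a seed backup on one Core instance, refusing to run
# if the instance has already been started.
# NEO4J_HOME, SEED_DIR and DB_NAME are illustrative variables of this
# sketch, not Neo4j settings.
seed_core_from_backup() {
    neo4j_home="${NEO4J_HOME:-.}"
    seed_dir="${SEED_DIR:-seed-dir}"
    db_name="${DB_NAME:-graph.db}"

    # Assumption: `neo4j status` exits with 0 only while the instance runs.
    if "$neo4j_home/bin/neo4j" status >/dev/null 2>&1; then
        echo "Instance is running; stop it before restoring." >&2
        return 1
    fi

    # Same command as in the example above, with the paths parameterized.
    "$neo4j_home/bin/neo4j-admin" restore --from="$seed_dir" --database="$db_name"
}
```

On each Core instance, the function would be called from the Neo4j installation directory as seed_core_from_backup, with the variables set only if the defaults do not match the setup.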
Start the cluster
At this point, all of the instances in the Core cluster have the store that contains our graph data. Between them the Core servers have everything necessary to form a cluster. We are ready to start all instances. The cluster will form and data will be replicated between the instances.
Example 4.3. Start each of the Core instances
neo4j-01$ ./bin/neo4j start
neo4j-02$ ./bin/neo4j start
neo4j-03$ ./bin/neo4j start

4.2.4.2. Seed from an offline backup

There are cases where we may want to seed a database in an offline fashion, for example if we are upgrading from Neo4j Community to Enterprise, or if we choose to transplant a database from one Enterprise site to another. To handle offline backups, we use the dump and load commands of neo4j-admin. For more detailed instructions on these, please refer to Section 10.3, “Dump and load databases”.

The overall process for seeding from an offline backup is the same as for an online backup, with the difference that the dump file is loaded onto all the Core members using the load command, rather than restored from a backup using the restore command.

Create a new Neo4j Core-only cluster
Create a new Neo4j Core-only cluster, as described in Section 4.2.3.1, “Configure a Core-only cluster”. Do not yet start the instances that make up the cluster, as we must first load the database contents.
Seed the cluster
Seed the cluster by loading the dump file into each of the newly created Core members using the load command of neo4j-admin.
Example 4.4. Load the database into each Cluster member

In this example we assume that we have an offline backup of our Neo4j database as a result of using the dump command of neo4j-admin. The database name is the default graph.db and we have a dump file called graph.dump in the seed-dir directory. If you have a different setup, change the command line arguments accordingly.

neo4j-01$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db
neo4j-02$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db
neo4j-03$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db
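
For completeness, the dump file assumed in Example 4.4 could be produced on the stopped source instance roughly as follows. This is a sketch under stated assumptions: the function name dump_for_seeding and the variable NEO4J_HOME are hypothetical, and the check that refuses to overwrite an existing file is a precaution of this sketch rather than documented tool behavior.

```shell
#!/bin/sh
# Sketch: create the dump file that is later loaded into each Core member.
# Run on the source instance while its database is shut down.
# Arguments (both optional): dump file path, database name.
dump_for_seeding() {
    neo4j_home="${NEO4J_HOME:-.}"
    dump_file="${1:-seed-dir/graph.dump}"
    db_name="${2:-graph.db}"

    # Precaution of this sketch: never overwrite an earlier dump.
    if [ -e "$dump_file" ]; then
        echo "Refusing to overwrite $dump_file" >&2
        return 1
    fi

    # Make sure the target directory exists before dumping into it.
    mkdir -p "$(dirname "$dump_file")"
    "$neo4j_home/bin/neo4j-admin" dump --database="$db_name" --to="$dump_file"
}
```

The resulting file must then be made available to every Core member, for example by copying it to the seed-dir directory on each machine.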
Start the cluster
At this point, all the instances of the Core cluster have the store that contains our graph data. We are ready to start all instances in the same way as illustrated in Example 4.3, “Start each of the Core instances”.

4.2.4.3. Seed using the import tool

In the case that we wish to create a cluster based on imported data, we follow a procedure that is very similar to that of seeding from an offline backup.

Create a new Neo4j Core-only cluster
Create a new Neo4j Core-only cluster, as described in Section 4.2.3.1, “Configure a Core-only cluster”. Do not yet start the instances that make up the cluster, as we must first import the database contents.
Seed the cluster
Seed the cluster by importing the data into each of the newly created Core members using the import tool.
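
As an illustration, an import from CSV files might look like the sketch below, run identically on every Core member. The function name import_seed_data, the variable NEO4J_HOME and the file names used in the usage note are hypothetical; the options shown are those of the import command of neo4j-admin in CSV mode, and a real import will usually need further options that depend on the data.

```shell
#!/bin/sh
# Sketch: import the same CSV files into one Core member.
# Arguments: nodes CSV file, relationships CSV file, optional database name.
import_seed_data() {
    neo4j_home="${NEO4J_HOME:-.}"
    nodes_csv="$1"
    rels_csv="$2"
    db_name="${3:-graph.db}"

    # Fail early if an input file is missing, before the import tool runs.
    for f in "$nodes_csv" "$rels_csv"; do
        if [ ! -r "$f" ]; then
            echo "Cannot read input file: $f" >&2
            return 1
        fi
    done

    "$neo4j_home/bin/neo4j-admin" import --mode=csv --database="$db_name" \
        --nodes="$nodes_csv" --relationships="$rels_csv"
}
```

A call such as import_seed_data import/nodes.csv import/rels.csv would then be repeated on neo4j-01, neo4j-02 and neo4j-03 with the same input files.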
Start the cluster
At this point, all the instances of the Core cluster have the store that contains our graph data. We are ready to start all instances in the same way as illustrated in Example 4.3, “Start each of the Core instances”.

If the cluster does not form as expected, the logs will contain sufficient information for the operator to determine the problem.
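
A first pass over the logs can be scripted. The sketch below is only an illustration: the function name show_log_problems is hypothetical, and it assumes that the instance writes its log to logs/debug.log under the installation directory and marks entries with the levels WARN and ERROR surrounded by spaces.

```shell
#!/bin/sh
# Sketch: print the most recent warnings and errors from a Neo4j log file.
# Arguments (both optional): log file path, number of lines to show.
show_log_problems() {
    log_file="${1:-logs/debug.log}"   # assumed default location
    max_lines="${2:-20}"

    if [ ! -r "$log_file" ]; then
        echo "Cannot read $log_file" >&2
        return 1
    fi

    # Keep only WARN and ERROR entries, then show the latest ones.
    grep -E ' (WARN|ERROR) ' "$log_file" | tail -n "$max_lines"
}
```

Running show_log_problems on each Core instance gives a quick overview before reading the full log.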