4.4. Seed a cluster

This section describes how to seed a new Neo4j Causal Cluster with existing data.

In Section 4.3, “Create a new cluster” we learned how to create a cluster with an empty store. However, regardless of whether we are just playing around with Neo4j or setting up a production environment, it is likely that we have some existing data that we wish to transfer into our cluster.

In this section we describe three methods for seeding a cluster: from an online backup of an existing Neo4j database, from an offline backup, and from some other data source using the import tool. The general steps to seed a cluster follow the same pattern, regardless of which format our data is in:

  1. Create a new Neo4j Core-only cluster.
  2. Seed the cluster.
  3. Start the cluster.

The database used to seed the cluster must be of the same Neo4j version as the cluster itself.
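
As a quick sanity check before seeding, the version reported by the source installation can be compared with the version reported by a new Core instance. The following is a minimal sketch rather than an official procedure: the helper names versions_match and neo4j_version_of and the installation paths are hypothetical, and it assumes that the version command of the neo4j script prints the version of the installation it belongs to.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both version strings are non-empty and identical.
versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical helper: print the version string reported by one installation.
# Assumption: "<home>/bin/neo4j version" prints the version of that installation.
neo4j_version_of() {
  "$1/bin/neo4j" version
}

# Example usage (both paths are placeholders for the source and target installations):
#   src="$(neo4j_version_of /path/to/source-neo4j)"
#   dst="$(neo4j_version_of /path/to/neo4j-01)"
#   versions_match "$src" "$dst" || echo "Neo4j versions differ: '$src' vs '$dst'" >&2
```

Comparing the full version string keeps the check strict, which matches the requirement that the versions be the same.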

4.4.1. Seed from an online backup

For this example, it is assumed that we already have a healthy backup of an existing Neo4j database as a result of an online backup from a running Neo4j instance (for details on online backups, please refer to Chapter 6, Backup). This could be a standalone Neo4j instance, a Neo4j Highly Available cluster, or a running instance of another Neo4j Causal Cluster. The process described here can also be used to seed a new Causal Cluster from an existing Read Replica. This can be useful, for example, in disaster recovery where some servers have retained operability during a catastrophic event.

  1. Create a new Neo4j Core-only cluster.

    Follow the instructions in Section 4.3.1, “Configure a Core-only cluster” to create a new Neo4j Core-only cluster. If you start the cluster in order to test that it works, you have to subsequently stop it and perform neo4j-admin unbind on each of the instances.

  2. Seed the cluster.

    Use the restore command of neo4j-admin to restore the seeding store from the backed-up database on all the Core instances in the cluster. Since all instances are seeded with the store, the cluster will be fully available right away once the instances are started.

    Example 4.2. Restore the backup to seed all Core members.

    This example assumes that the database name is the default graph.db and that we have a valid backup residing under the seed-dir directory. If you have a different setup, change the command line arguments accordingly.

    neo4j-01$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db
    neo4j-02$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db
    neo4j-03$ ./bin/neo4j-admin restore --from=seed-dir --database=graph.db

  3. Start the cluster.

    At this point, all of the instances in the Core cluster have the store that contains our graph data. Between them the Core Servers have everything necessary to form a cluster. We are ready to start all instances. The cluster will form and data will be replicated between the instances.

    Example 4.3. Start each of the Core instances.

    neo4j-01$ ./bin/neo4j start
    neo4j-02$ ./bin/neo4j start
    neo4j-03$ ./bin/neo4j start
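
If the cluster was started for testing in step 1, each instance has to be stopped and unbound before it is seeded. The following is a minimal sketch of that reset for a single instance. The function name stop_and_unbind is hypothetical, and the installation directory is passed in as a placeholder argument.

```shell
#!/bin/sh
# Stop one Core instance and remove its cluster state so that it can be seeded.
# $1 is the installation directory of the instance (a placeholder in this sketch).
stop_and_unbind() {
  neo4j_home="$1"
  # Stop the instance first; unbind is only attempted if the stop command succeeds.
  "$neo4j_home/bin/neo4j" stop && "$neo4j_home/bin/neo4j-admin" unbind
}

# Example usage, run from the installation directory on each of
# neo4j-01, neo4j-02 and neo4j-03:
#   stop_and_unbind .
```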

4.4.2. Seed from an offline backup

There are cases where we may want to seed a database in an offline fashion, for example if we are upgrading from Neo4j Community to Enterprise, or if we choose to transplant a database from one Enterprise site to another. To handle offline backups, we use the dump and load commands of neo4j-admin. For more detailed instructions on these, please refer to Section 11.7, “Dump and load databases”.

The overall process for seeding from an offline backup is the same as for an online backup, with the difference that the dump file is loaded onto all the Core members using the load command instead of the restore command.

  1. Create a new Neo4j Core-only cluster.

    Follow the instructions in Section 4.3.1, “Configure a Core-only cluster” to create a new Neo4j Core-only cluster. If you start the cluster in order to test that it works, you have to subsequently stop it and perform neo4j-admin unbind on each of the instances.

  2. Seed the cluster.

    Seed the cluster by loading the dump file into each of the newly created Core members using the load command of neo4j-admin.

    Example 4.4. Load the database into each Cluster member.

    In this example we assume that we have an offline backup of our Neo4j database, created using the dump command of neo4j-admin. The database name is the default graph.db and we have a dump file called graph.dump in the seed-dir directory. If you have a different setup, change the command line arguments accordingly.

    neo4j-01$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db
    neo4j-02$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db
    neo4j-03$ ./bin/neo4j-admin load --from=seed-dir/graph.dump --database=graph.db

  3. Start the cluster.

    At this point, all the instances of the Core cluster have the store that contains our graph data. We are ready to start all instances in the same way as illustrated in Example 4.3, “Start each of the Core instances”.
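
The dump file assumed in Example 4.4 has to be created on the source installation first. A minimal sketch of that step follows. The function name dump_seed is hypothetical, the paths are placeholders, and the sketch assumes that the source database has been stopped before the dump is taken. Please refer to Section 11.7, “Dump and load databases” for the authoritative options.

```shell
#!/bin/sh
# Dump a stopped database into a single archive file that can later be
# loaded on each Core member.
# $1: installation directory of the source instance (placeholder)
# $2: database name, for example graph.db
# $3: destination file, for example seed-dir/graph.dump
dump_seed() {
  "$1/bin/neo4j-admin" dump --database="$2" --to="$3"
}

# Example usage on the source machine, from the installation directory:
#   ./bin/neo4j stop
#   dump_seed . graph.db seed-dir/graph.dump
```

The resulting file then needs to be made available on every Core member, for example by copying it into the seed-dir directory of each one.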

4.4.3. Seed using the import tool

In the case that we wish to create a cluster based on imported data, we follow a procedure that is very similar to that of seeding from an offline backup.

  1. Create a new Neo4j Core-only cluster.

    Follow the instructions in Section 4.3.1, “Configure a Core-only cluster” to create a new Neo4j Core-only cluster. If you start the cluster in order to test that it works, you have to subsequently stop it and perform neo4j-admin unbind on each of the instances.

  2. Seed the cluster.

    Seed the cluster by loading imported data into each of the newly created Core members using the import tool.

  3. Start the cluster.

    At this point, all the instances of the Core cluster have the store that contains our graph data. We are ready to start all instances in the same way as illustrated in Example 4.3, “Start each of the Core instances”.
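
The import step on a single Core member could look roughly like the sketch below. It assumes CSV input and the default database name graph.db. The function name import_seed and the CSV file names are hypothetical, and the options shown are those we would expect from the import command of neo4j-admin, so please verify them against the import tool documentation for your version.

```shell
#!/bin/sh
# Import CSV data into the database of one Core member.
# $1: installation directory of the Core instance (placeholder)
# $2: node CSV file, $3: relationship CSV file (hypothetical file names)
import_seed() {
  "$1/bin/neo4j-admin" import --mode=csv --database=graph.db \
    --nodes="$2" --relationships="$3"
}

# Example usage, repeated with the same input files on each of
# neo4j-01, neo4j-02 and neo4j-03, from the installation directory:
#   import_seed . import/nodes.csv import/relationships.csv
```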

Using copy-and-paste to move the internal data directory in order to transfer and seed databases is not supported. If you have an existing Neo4j database whose data you wish to use for a new cluster, follow one of the processes described in this section.