7.3. Seed a cluster

This section describes how to seed a new Neo4j Causal Cluster with existing data.

7.3.1. Introduction

In Section 7.2, “Deploy a cluster”, we learned how to create a cluster with empty databases. However, whether we are just experimenting with Neo4j or setting up a production environment, we are likely to have existing data that we wish to transfer into our cluster.

This section outlines how to create a Causal Cluster containing data either seeded from an existing online or offline Neo4j database, or imported from some other data source using the import tool. The general steps to seed a cluster follow the same pattern, regardless of the format of our data:

  1. Create a new Neo4j Core-only cluster.
  2. Seed the cluster.
  3. Start the cluster.

The databases used to seed the cluster must be of the same Neo4j version as the cluster itself.

7.3.2. Seed from backups

For this section, it is assumed that we already have healthy backups of an existing Neo4j deployment. These could be online or offline backups from a standalone Neo4j instance or a Neo4j Causal Cluster. For details on performing online backups, please refer to Chapter 9, Backup.

Moving files and directories manually into or out of a Neo4j installation is not recommended and is considered unsupported usage. If you have an existing Neo4j database which you wish to use for a new cluster, use neo4j-admin dump to create an offline backup.

The process described here can also be used to seed a new Causal Cluster from an existing Read Replica. This can be useful, for example, in disaster recovery where some servers have retained their data during a catastrophic event.

  1. Create a new Neo4j Core-only cluster.

    Follow the instructions in Section 7.2.2, “Configure a Core-only cluster” to create a new Neo4j Core-only cluster.

    You could start the cluster now in order to test that everything is correctly configured, but this will create default databases as part of cluster formation. If you do so, you must subsequently stop every instance, unbind each one from the cluster using neo4j-admin unbind, and remove those default databases so that the correct seeds can be used instead.

  2. Seed the cluster.

    Use neo4j-admin restore or neo4j-admin load to seed all the Core instances in the cluster.

    The examples assume that we are restoring one user database with the default name of neo4j in addition to the system database which contains replicated configuration state. Modify the command line arguments to match your exact setup.

    Example 7.4. Seed using neo4j-admin restore.
    neo4j-01$ ./bin/neo4j-admin restore --from=/path/to/system-backup-dir --database=system
    neo4j-01$ ./bin/neo4j-admin restore --from=/path/to/neo4j-backup-dir --database=neo4j
    neo4j-02$ ./bin/neo4j-admin restore --from=/path/to/system-backup-dir --database=system
    neo4j-02$ ./bin/neo4j-admin restore --from=/path/to/neo4j-backup-dir --database=neo4j
    neo4j-03$ ./bin/neo4j-admin restore --from=/path/to/system-backup-dir --database=system
    neo4j-03$ ./bin/neo4j-admin restore --from=/path/to/neo4j-backup-dir --database=neo4j

    Example 7.5. Seed using neo4j-admin load.
    neo4j-01$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
    neo4j-01$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j
    neo4j-02$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
    neo4j-02$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j
    neo4j-03$ ./bin/neo4j-admin load --from=/path/to/system.dump --database=system
    neo4j-03$ ./bin/neo4j-admin load --from=/path/to/neo4j.dump --database=neo4j

  3. Start the cluster.

    At this point, all of the instances in the Core cluster have been seeded. Between them, the Core Servers have everything necessary to form a cluster. We are ready to start all instances. The cluster will form and the replicated Neo4j DBMS deployment will come online.

    Example 7.6. Start each of the Core instances.
    neo4j-01$ ./bin/neo4j start
    neo4j-02$ ./bin/neo4j start
    neo4j-03$ ./bin/neo4j start
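
The seeding step above repeats one restore command per database on every Core instance. As a minimal sketch, the commands for a single instance could be generated with a small shell helper. The function names `reset_commands` and `seed_commands`, and the `<backup_root>/<database>-backup-dir` directory layout, are assumptions made for illustration and are not part of Neo4j. The script only prints commands, so nothing is modified until the output has been reviewed and run by hand.

```shell
#!/bin/sh
# Dry run only: these helpers print commands, they do not execute them.

# Commands for an instance that was already started to test its configuration:
# stop it and unbind it from the cluster. Removing the default databases that
# were created during cluster formation is left to the operator.
reset_commands() {
    printf '%s\n' './bin/neo4j stop' './bin/neo4j-admin unbind'
}

# Print one restore command per database for a single Core instance.
# Assumption (hypothetical layout): backups live at
# <backup_root>/<database>-backup-dir.
seed_commands() {
    backup_root=$1
    shift
    for db in "$@"; do
        printf './bin/neo4j-admin restore --from=%s/%s-backup-dir --database=%s\n' \
            "$backup_root" "$db" "$db"
    done
}

# Print the commands for the system database and the default user database.
seed_commands /path/to system neo4j
```

Run the same helper on each Core instance, adjusting the backup root and the list of databases to match the actual setup.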

7.3.3. Seed using the import tool

To create a cluster based on imported data, it is recommended to first import the data into a standalone Neo4j DBMS and then use an offline backup to seed the cluster.

  1. Import the data.

    1. Deploy a standalone Neo4j DBMS.
    2. Import the data using the import tool.
  2. Use neo4j-admin dump to create an offline backup of the neo4j database.
  3. Seed a new cluster using the instructions in Section 7.3.2, “Seed from backups”.

    Skip the system database in this scenario since it is not needed.
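
The offline-backup route above can be sketched as a dump on the standalone instance followed by a load on each Core instance. The helper name `import_seed_commands` and the dump file path are hypothetical, and the `--to` option of neo4j-admin dump is assumed to be available in the Neo4j version in use; confirm it against the command's built-in help before relying on it. As a dry run, the helper only prints the commands.

```shell
#!/bin/sh
# Dry run only: print the dump and load commands for the neo4j database.
import_seed_commands() {
    dump_file=$1
    # On the standalone instance, after the import has finished.
    # Assumption: the dump command accepts --to for the destination file.
    printf './bin/neo4j-admin dump --database=neo4j --to=%s\n' "$dump_file"
    # On every Core instance of the new cluster; the system database is skipped.
    printf './bin/neo4j-admin load --from=%s --database=neo4j\n' "$dump_file"
}

import_seed_commands /path/to/neo4j.dump
```

The dump file has to be copied to each Core instance before the load command is run there.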