Bulk Import and Restore into Running Neo4j 4.0 Instances

Neo4j 4.0 can run multiple databases in a single instance. You can use neo4j-admin import or neo4j-admin restore to load data into a new database on a running 4.0 instance, and then register that database with CREATE DATABASE.

Neo4j-Admin Import Standalone Server

  1. Run neo4j-admin import to create the store for the new database. This example runs a bulk import into a new database called dataload1. All imported nodes get the label STORE.

    $ ./bin/neo4j-admin import --database=dataload1 --nodes=:STORE="/home/ubuntu/tmp/header.csv,/home/ubuntu/tmp/nodes.csv" --skip-duplicate-nodes=true --high-io=true

  2. Log into Neo4j 4.0, create the database from the system database, and check that the imported nodes are there.

    neo4j> :use system
    neo4j> CREATE DATABASE dataload1;
    neo4j> SHOW DATABASES;
    neo4j> :use dataload1
    neo4j> MATCH (n:STORE) RETURN n LIMIT 3;

  3. Run a second neo4j-admin import to create another database. This example runs a bulk import into a new database called dataload2. All imported nodes get the label STORE.

    $ ./bin/neo4j-admin import --database=dataload2 --nodes=:STORE="/home/ubuntu/tmp/header.csv,/home/ubuntu/tmp/nodes.csv" --skip-duplicate-nodes=true --high-io=true

  4. Log into Neo4j 4.0, create the second database, and check it in the same way.

    neo4j> :use system
    neo4j> CREATE DATABASE dataload2;
    neo4j> SHOW DATABASES;
    neo4j> :use dataload2
    neo4j> MATCH (n:STORE) RETURN n LIMIT 3;
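
The standalone steps above can be sketched as a small shell wrapper. This is a minimal sketch rather than an official tool: DRY_RUN, NEO4J_HOME, CSV_DIR, run and load_database are names assumed for illustration, cypher-shell credentials are left out, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, NEO4J_HOME, CSV_DIR, run and load_database are
# illustrative names, not part of Neo4j.
DRY_RUN="${DRY_RUN:-1}"            # 1 = print the commands, 0 = execute them
NEO4J_HOME="${NEO4J_HOME:-.}"
CSV_DIR="${CSV_DIR:-/home/ubuntu/tmp}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

load_database() {
  db="$1"
  # Step 1: bulk import into a store named after the new database.
  run "$NEO4J_HOME/bin/neo4j-admin" import --database="$db" \
    --nodes=:STORE="$CSV_DIR/header.csv,$CSV_DIR/nodes.csv" \
    --skip-duplicate-nodes=true --high-io=true
  # Step 2: register the imported store; the statement runs against system.
  run "$NEO4J_HOME/bin/cypher-shell" -d system "CREATE DATABASE $db;"
}

load_database dataload1
load_database dataload2
```

Setting DRY_RUN=0 would execute the commands instead of printing them. The sketch does not verify the result; running SHOW DATABASES afterwards, as in the steps above, is still advisable.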

Neo4j-Admin Import Cluster

  1. Run neo4j-admin import to create the store for the new database. This example runs a bulk import into a new database called dataload1. All imported nodes get the label STORE. You do not need to stop the cluster to run the import tool, but the tool consumes memory and CPU, which may be scarce on a server that is already running a Neo4j instance.

    $ ./bin/neo4j-admin import --database=dataload1 --nodes=:STORE="/home/ubuntu/tmp/header.csv,/home/ubuntu/tmp/nodes.csv" --skip-duplicate-nodes=true --high-io=true
  2. Copy the store created by the import tool to all core instances. You need to copy the contents of both <neo4j-home>/data/databases/dataload1 and <neo4j-home>/data/transactions/dataload1.

  3. Run CREATE DATABASE dataload1 against the system database. On Neo4j 4.0, this command must be run on the instance that is currently the leader for the system database. Once the database is created on the leader, it is propagated to the other cluster members.

    neo4j> :use system
    neo4j> CREATE DATABASE dataload1;
    neo4j> SHOW DATABASES;
    neo4j> :use dataload1
    neo4j> MATCH (n:STORE) RETURN n LIMIT 3;
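
The copy in step 2 could be scripted along the following lines. This is a minimal sketch, assuming SSH access from the import host, rsync installed on every host, and the same Neo4j home directory on each core. CORE_HOSTS, the host names core2 and core3, the /opt/neo4j default, DRY_RUN, run and copy_store are all hypothetical.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative, not part of Neo4j.
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print the commands, 0 = execute them
NEO4J_HOME="${NEO4J_HOME:-/opt/neo4j}"     # assumed to be identical on all cores
CORE_HOSTS="${CORE_HOSTS:-core2 core3}"    # hypothetical names of the other cores

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

copy_store() {
  db="$1"
  for host in $CORE_HOSTS; do
    # Both the store files and the transaction logs have to be copied.
    for dir in databases transactions; do
      run rsync -a "$NEO4J_HOME/data/$dir/$db/" "$host:$NEO4J_HOME/data/$dir/$db/"
    done
  done
}

copy_store dataload1
```

File ownership and permissions on the target hosts are not handled here and may need adjusting for the user that runs Neo4j.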

Neo4j-Admin Restore

In this example, we back up the dataload1 database and use the backup to create a new database called dataload3.

$ ./bin/neo4j-admin backup --backup-dir=/home/ubuntu/tmp/backups --database=dataload1
$ ./bin/neo4j-admin restore --from=/home/ubuntu/tmp/backups/dataload1 --database=dataload3
neo4j> :use system
neo4j> CREATE DATABASE dataload3;
neo4j> SHOW DATABASES;
neo4j> :use dataload3
neo4j> MATCH (n:STORE) RETURN n LIMIT 3;
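
The backup, restore and create sequence above can be wrapped in one function that clones a database under a new name. This is a minimal sketch: clone_database, run, DRY_RUN, NEO4J_HOME and BACKUP_DIR are names assumed for illustration, cypher-shell credentials are left out, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative, not part of Neo4j.
DRY_RUN="${DRY_RUN:-1}"            # 1 = print the commands, 0 = execute them
NEO4J_HOME="${NEO4J_HOME:-.}"
BACKUP_DIR="${BACKUP_DIR:-/home/ubuntu/tmp/backups}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

clone_database() {
  src="$1"
  dst="$2"
  # Back up the source database; the backup lands in $BACKUP_DIR/<src>.
  run "$NEO4J_HOME/bin/neo4j-admin" backup --backup-dir="$BACKUP_DIR" --database="$src"
  # Restore that backup under the new database name.
  run "$NEO4J_HOME/bin/neo4j-admin" restore --from="$BACKUP_DIR/$src" --database="$dst"
  # Register the restored store with the running instance.
  run "$NEO4J_HOME/bin/cypher-shell" -d system "CREATE DATABASE $dst;"
}

clone_database dataload1 dataload3
```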

To replace an existing 4.0 database using neo4j-admin restore, stop the database, restore over it with --force, and start it again:

neo4j> :use system
neo4j> STOP DATABASE dataload2;
$ ./bin/neo4j-admin restore --from=/home/ubuntu/tmp/backups/dataload1 --database=dataload2 --force
neo4j> START DATABASE dataload2;
neo4j> :use dataload2
neo4j> MATCH (n:STORE) RETURN n LIMIT 3;
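
The order of operations matters when replacing a database: the restore should only run if the stop succeeded. A minimal sketch of that ordering follows, with replace_database, run, DRY_RUN and NEO4J_HOME as names assumed for illustration and cypher-shell credentials left out.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative, not part of Neo4j.
DRY_RUN="${DRY_RUN:-1}"            # 1 = print the commands, 0 = execute them
NEO4J_HOME="${NEO4J_HOME:-.}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

replace_database() {
  backup="$1"
  db="$2"
  # Stop, restore with --force, then start. The && chain aborts at the
  # first failing command, so a failed stop never leads to a restore.
  run "$NEO4J_HOME/bin/cypher-shell" -d system "STOP DATABASE $db;" &&
    run "$NEO4J_HOME/bin/neo4j-admin" restore --from="$backup" --database="$db" --force &&
    run "$NEO4J_HOME/bin/cypher-shell" -d system "START DATABASE $db;"
}

replace_database /home/ubuntu/tmp/backups/dataload1 dataload2
```

The sketch does not check that the database has actually reached the stopped state before restoring; confirming this with SHOW DATABASES first would be a sensible addition.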