9.2. Perform a backup

This section describes how to perform an online backup of a Neo4j database.

Remember to back up all of your created databases, including the system database.

9.2.1. Backup command

A Neo4j database can be backed up in online mode using the backup command of neo4j-admin. The machine that runs the backup command must have Neo4j installed, but does not need to run a Neo4j server. Note that it is not recommended to use an NFS mount for backup purposes as this is likely to corrupt and slow down the backup.

Syntax

neo4j-admin backup --backup-dir=<path>
                   [--verbose]
                   [--from=<host:port>]
                   [--database=<database>]
                   [--fallback-to-full]
                   [--pagecache=<size>]
                   [--check-consistency]
                   [--check-graph=<true/false>]
                   [--check-indexes=<true/false>]
                   [--check-label-scan-store=<true/false>]
                   [--check-property-owners=<true/false>]
                   [--report-dir=<path>]
                   [--additional-config=<path>]

Options

Option                    Default         Description
--backup-dir                              Directory to place backup in.
--verbose                 false           Enable verbose output.
--from                    localhost:6362  Host and port of Neo4j.
--database                neo4j           Name of the database to back up. If a backup of the
                                          specified database exists in the target directory,
                                          then an incremental backup will be attempted.
--fallback-to-full        true            If an incremental backup fails, the old backup is
                                          moved to <name>.err.<N> and a full backup is
                                          performed instead.
--pagecache               8M              The size of the page cache to use for the backup
                                          process.
--check-consistency       true            Whether to run a consistency check.
--check-graph             true            Perform checks between nodes, relationships,
                                          properties, types and tokens.
--check-indexes           true            Perform checks on indexes.
--check-label-scan-store  true            Perform checks on the label scan store.
--check-property-owners   false           Perform additional checks on property ownership.
                                          This check is very expensive in time and memory.
--report-dir              .               Directory where consistency report will be written.
--additional-config                       Configuration file to supply additional
                                          configuration in.

Exit codes

neo4j-admin backup will exit with different codes depending on success or error. In the case of error, this includes details of what error was encountered.

Table 9.3. Neo4j Admin backup exit codes
Code  Description
0     Success.
1     Backup failed.
2     Backup succeeded but consistency check failed.
3     Backup succeeded but consistency check found inconsistencies.
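
A wrapper script can branch on these codes, for example to alert differently when the backup itself failed and when only the consistency check reported problems. The following is a minimal sketch, not part of neo4j-admin: the function name describe_backup_exit_code is hypothetical, and its messages simply mirror Table 9.3.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a neo4j-admin backup exit code
# (see Table 9.3) into a short message for logs or alerts.
describe_backup_exit_code() {
  case "$1" in
    0) echo "Success." ;;
    1) echo "Backup failed." ;;
    2) echo "Backup succeeded but consistency check failed." ;;
    3) echo "Backup succeeded but consistency check found inconsistencies." ;;
    *) echo "Unexpected exit code: $1" ;;
  esac
}

# Usage (commented out so that sourcing this snippet does not run a backup):
#   bin/neo4j-admin backup --from=192.168.1.34 --backup-dir=/mnt/backups/neo4j --database=neo4j
#   describe_backup_exit_code "$?"
```

Capturing "$?" immediately after the backup command matters, because any intervening command overwrites it.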

9.2.2. Backup process

The backup client operates in one of two modes: full backup or incremental backup. A full backup is always required for the very first backup into a target location. Subsequent backups attempt the incremental mode, in which only the delta of the transaction logs since the last backup is transferred and applied to the target location. If the required transaction logs are not available on the backup server, the backup client falls back to performing a full backup instead, unless --fallback-to-full is disabled.

After the backup has completed successfully, the consistency checker is invoked by default. Checking the consistency of the backup is a major operation that can consume significant computational resources, such as memory, CPU, and I/O.

Running the backup client on a live Neo4j server is strongly discouraged, especially together with a consistency check, because doing so can adversely affect the server.

To avoid adversely affecting a running server with the resource demands of the backup client, it is recommended to take the backup and perform the consistency check on a dedicated machine that has sufficient free resources. An alternative is to decouple the backup operation from the consistency check and schedule the check to run later on a dedicated machine. The value of consistency checking a backup should not be underestimated, as it is vital for safeguarding and ensuring the quality of your data.
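
The decoupled workflow described above could look roughly like the following sketch. It rests on two assumptions that you should verify against the help output of your Neo4j version: that --check-consistency accepts an explicit false value, and that a neo4j-admin check-consistency command with a --backup option is available. The NEO4J_ADMIN variable and both function names are hypothetical conveniences.

```shell
#!/usr/bin/env bash
# Path to neo4j-admin; override NEO4J_ADMIN if it lives elsewhere.
NEO4J_ADMIN="${NEO4J_ADMIN:-bin/neo4j-admin}"

# Step 1: take the backup only, skipping the consistency check.
# Assumption: --check-consistency accepts an explicit "false".
backup_without_check() {
  local from="$1" backup_dir="$2" database="$3"
  "$NEO4J_ADMIN" backup \
    --from="$from" \
    --backup-dir="$backup_dir" \
    --database="$database" \
    --check-consistency=false
}

# Step 2: later, on a dedicated machine, check the backup that was taken.
# Assumption: "check-consistency --backup=<path>" exists in this version.
check_backup_later() {
  local backup_path="$1" report_dir="$2"
  "$NEO4J_ADMIN" check-consistency \
    --backup="$backup_path" \
    --report-dir="$report_dir"
}
```

The two functions are meant to be called at different times, for instance the first from a nightly job and the second from a separately scheduled job on the machine that holds the backup.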

The transaction log files in the backup are rotated and pruned based on the provided configuration. For example, setting dbms.tx_log.rotation.retention_policy=3 files will keep 3 transaction log files in the backup. You can use the --additional-config parameter to override this configuration.
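
As a sketch of how such an override might be supplied, the snippet below writes a small configuration file containing the retention setting mentioned above and shows, in a comment, how the file could be passed to the backup command. The function name write_backup_config and the path /tmp/backup.conf are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a config file that keeps 3 transaction
# log files in the backup, for use with --additional-config.
write_backup_config() {
  local config_file="$1"
  cat > "$config_file" <<'EOF'
dbms.tx_log.rotation.retention_policy=3 files
EOF
}

# Usage (commented out so that sourcing this snippet does not run a backup):
#   write_backup_config /tmp/backup.conf
#   bin/neo4j-admin backup --from=192.168.1.34 --backup-dir=/mnt/backups/neo4j \
#     --database=neo4j --additional-config=/tmp/backup.conf
```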

Example 9.1. Full backup

In this example, set environment variables in order to control memory usage.

The page cache is defined by using the command line option --pagecache. Further, the HEAP_SIZE environment variable will specify the maximum heap size allocated to the backup process.

Now you can perform a full backup:

$neo4j-home> export HEAP_SIZE=2G
$neo4j-home> mkdir /mnt/backups
$neo4j-home> bin/neo4j-admin backup --from=192.168.1.34 --backup-dir=/mnt/backups/neo4j --database=neo4j --pagecache=4G
Doing full backup...
2017-02-01 14:09:09.510+0000 INFO  [o.n.c.s.StoreCopyClient] Copying neostore.nodestore.db.labels
2017-02-01 14:09:09.537+0000 INFO  [o.n.c.s.StoreCopyClient] Copied neostore.nodestore.db.labels 8.00 kB
2017-02-01 14:09:09.538+0000 INFO  [o.n.c.s.StoreCopyClient] Copying neostore.nodestore.db
2017-02-01 14:09:09.540+0000 INFO  [o.n.c.s.StoreCopyClient] Copied neostore.nodestore.db 16.00 kB
...
...
...

If you do a directory listing of /mnt/backups you will now see that you have a backup in a directory called neo4j.

Example 9.2. Incremental backup

This example assumes that you have performed a full backup as per the previous example. In the same way as before, make sure to control the memory usage.

To perform an incremental backup you need to specify the location of your previous backup:

$neo4j-home> export HEAP_SIZE=2G
$neo4j-home> bin/neo4j-admin backup --from=192.168.1.34 --backup-dir=/mnt/backups/neo4j --database=neo4j --pagecache=4G
Destination is not empty, doing incremental backup...
Backup complete.

9.2.3. Memory configuration

The following options are available for configuring the memory allocated to the backup client:

Configure heap size for the backup
    Set the environment variable HEAP_SIZE before starting the backup program. HEAP_SIZE configures the maximum heap size allocated to the backup process. If HEAP_SIZE is not set, the Java Virtual Machine will choose a value based on server resources.
Configure page cache for the backup
    Set the page cache size for the backup program with the --pagecache option of the neo4j-admin backup command. If not explicitly defined, the page cache defaults to 8MB.