This section introduces how to prepare for backing up a Neo4j database.
Designing an appropriate backup strategy for your Neo4j database is a fundamental part of database operations. The backup strategy should take into account elements such as:
The backup strategy will answer questions such as:
With what frequency should we perform backups;
If using online backups:
Online backups are typically required for production environments, but it is also possible to perform offline backups.
Offline backups are a more limited method for backing up a database. For example:
For more details about offline backups, see Section 12.7, “Dump and load databases”.
The remainder of this chapter is dedicated to describing online backups.
For any backup it is important that you store your data separately from the production system, where there are no common dependencies, and preferably off-site. If you are running Neo4j in the cloud, you should use a different availability zone within the same cloud, or use a separate cloud for backups.
Since backups are kept for a long time, the longevity of archival storage should be considered as part of backup planning.
You may also want to override the settings used for pruning and rotating transaction log files. The transaction log files are files that keep track of recent changes. Recovery from backups with the same transaction log files as the source server can be helpful, but it isn’t always necessary. Please note that removing transactions manually can result in a broken backup.
Recovered servers do not need all of the transaction log files that have already been applied, so it is possible to reduce storage size even further by reducing the size of the files to the bare minimum.
This can be done by setting dbms.tx_log.rotation.retention_policy=3 files in either the default backup configuration ($NEO4J_HOME/conf/neo4j.conf), or in the $NEO4J_CONF config file.
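To make the effect of a "3 files" retention policy concrete, here is a minimal conceptual sketch. It is not Neo4j's implementation: the function name prune_transaction_logs and the example file names are assumptions used only for illustration, and the sketch simply models "keep the newest N files" on a list of names.

```python
def prune_transaction_logs(log_files, keep=3):
    """Model a 'keep the newest N files' retention policy.

    log_files: names ending in a numeric version suffix (illustrative only).
    Returns a (kept, pruned) pair of lists, oldest first.
    """
    # Order by numeric suffix so that "...db.10" sorts after "...db.9".
    ordered = sorted(log_files, key=lambda name: int(name.rsplit(".", 1)[1]))
    if keep <= 0:
        return [], ordered
    # The newest `keep` files are retained; everything older is pruned.
    return ordered[-keep:], ordered[:-keep]
```

Under this model, five log files with suffixes 0, 1, 2, 9 and 10 would leave the files with suffixes 2, 9 and 10 in place.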
Alternatively you can use the
The backup client can use two different protocols:
Since the backup client does not know ahead of time what type of server it will run the backup against, it first attempts the catchup protocol. If that does not succeed, it tries the common backup protocol. If you want to control this behavior, you can use the --protocol option when performing a backup.
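The fallback order described above can be sketched as follows. This is an illustration of the decision logic only, not the backup client's code: run_backup is a hypothetical callable that attempts a backup with the named protocol and returns True on success, and the value "any" for "no preference" is an assumption.

```python
def backup_with_fallback(run_backup, protocol="any"):
    """Try the catchup protocol first, then the common backup protocol.

    run_backup: hypothetical callable, run_backup(name) -> bool.
    protocol:   "any" (assumed default) tries both in order; naming a
                single protocol restricts the attempt to that protocol.
    Returns the name of the protocol that succeeded.
    """
    order = ["catchup", "common"] if protocol == "any" else [protocol]
    for name in order:
        if run_backup(name):
            return name
    raise RuntimeError("backup failed with protocol(s): " + ", ".join(order))
```

Pinning the protocol skips the second attempt entirely, so a failure surfaces immediately instead of after a fallback.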
The following options are available for controlling memory allocation to the backup client:
HEAP_SIZE configures the maximum heap size allocated for the backup process. Set the HEAP_SIZE environment variable before starting the backup program. If not specified by HEAP_SIZE, the Java Virtual Machine will choose a value based on server resources.
Use the --pagecache option to the neo4j-admin backup command to set the page cache size for the backup. If not explicitly defined, the page cache will default to 8MB.
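As a minimal sketch of how the two memory settings might be combined when scripting a backup, the helper below assembles the command line and environment without running anything. The function name build_backup_invocation, the --backup-dir and --name options, and the example sizes are assumptions not taken from this section; check them against the command reference for your Neo4j version.

```python
import os


def build_backup_invocation(backup_dir, name, heap_size=None,
                            pagecache=None, base_env=None):
    """Return (argv, env) for a neo4j-admin backup run.

    heap_size: value for the HEAP_SIZE environment variable, e.g. "2G".
    pagecache: value for the --pagecache option, e.g. "4G".
    base_env:  environment to extend; defaults to the current process env.
    """
    env = dict(os.environ if base_env is None else base_env)
    if heap_size is not None:
        # Must be set before the backup program starts.
        env["HEAP_SIZE"] = heap_size
    # --backup-dir and --name are assumed option names (see lead-in).
    argv = ["neo4j-admin", "backup",
            f"--backup-dir={backup_dir}", f"--name={name}"]
    if pagecache is not None:
        argv.append(f"--pagecache={pagecache}")
    return argv, env
```

The returned pair could then be passed to subprocess.run(argv, env=env, check=True). Leaving both settings unset keeps the defaults described above.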
The files listed below are not included in online and offline backups. Make sure to back them up separately.
If you have a cluster, it may be relevant to back up this file on each server.