For operators of Neo4j looking to implement regular hot backups of a Neo4j database using automated job scheduling, applying this wrapper around neo4j-admin backup will make your job easier. Simply read this guide, customize according to your environment, and set your job scheduler to run the backupDatabase.sh wrapper script.

DISCLAIMER: This backup script will work with Neo4j version 3.2.9 and above. For Neo4j versions less than version 3.2.9, please contact Neo4j Customer Support for information on how to implement the backup script.

First, download the script here.

To setup the backup wrapper script on Linux systems, follow these steps:

As the neo4j user:

  1. Copy the backupDatabase.sh script to $NEO4J_HOME/bin
  2. Review and modify the script User and Mail variables as required.
  3. The default log location is $NEO4J_HOME/logs/backupDatabase_$timestamp.log. The location and timestamped naming convention can be modified as required in the script under the logFile= variable.
  4. Run chmod 750 $NEO4J_HOME/bin/backupDatabase.sh to set execute permissions on the script
  5. Show options by running $NEO4J_HOME/bin/backupDatabase.sh -h:
    -d Backup destination directory to save files to (Mandatory)"
    -b Backup name : Default graph.db-backup (Optional)"
    -f Set From Host : Default localhost (Optional)"
    -c Run Consistency Check [true] : Default false (Optional)"
    -m Set HEAP_SIZE : Default 2G (Optional)"
    -p Set pache cache : Default 4G (Optional)"
    -h  Print help message"

To run a backup to the backup location directory:

$ ./backupDatabase.sh -d /<full_path_to_backup_dir>/backup

NOTE 1: Neo4j will automatically determine whether to run a full or incremental backup, by default the script is designed to run incremental backups to the mandatory backup location provided by the -d flag. To perform an incremental backup you need to specify the same location of your previous backup.

If a full backup is desired, the script can be modified to comment out the following line in the script as shown:

#bname=graph.db-backup #POSSIBLY CHANGE ME, USED FOR INCREMENTAL BACKUPS

And uncomment the following line in the script as shown:

bname=graph.db-backup-$timestamp #POSSIBLY CHANGE ME, USED FOR FULL BACKUPS

This will force the backup to use a new timestamped naming convention for the backup; if this is run daily, then a full backup will occur each day into the respective timestamped directory.

NOTE 2: The consistency check will be run offline by default on the backup once it is completed. The consistency check can be changed to run online during the backup on the active database by specifying the -c true flag, although this is not recommended for large databases.

NOTE 3: The script will verify whether the Neo4j database is online before proceeding and will exit if the database is not running.

NOTE 4: The default setting for --fallback-to-full is set to true. If an incremental backup fails, the script will move the old backup to <name>.err.<N> and fallback to a full backup instead.

To run a backup against a remote server:

$ ./backupDatabase.sh -d /<full_path_to_backup_dir>/backup -f 192.168.0.10:6362

NOTE 1: The default for the -f setting is localhost:6362. You can specify an alternate remote host IP or IP:Port.

To run a backup to the backup location directory with a defined backup name:

$ ./backupDatabase.sh -d /<full_path_to_backup_dir>/backup -b graph.db-TEST

NOTE 1: Neo4j will automatically determine whether to run a full or incremental backup, by default the script is designed to run incremental backup. If the graph.db-TEST directory in this example does NOT exist, then a full backup will occur.

To run a backup to the backup location directory with a defined pagecache setting:

$ ./backupDatabase.sh -d /<full_path_to_backup_dir>/backup -p 15G

NOTE 1: The default pagecache setting is 4G

To run a backup to the backup location directory with a defined HEAP and pagecache setting:

$ ./backupDatabase.sh -d /<full_path_to_backup_dir>/backup -m 8G -p 15G

NOTE 1: The default HEAP size setting is 2G and the default pagecache setting is 4G

Details


Author:
Shawn Tozeski
Applicable versions:
3.2.9, 3.3, 3.4
Keywords:
backupneo4j-3.2.9neo4j-3.3neo4j-3.4