Migrate a cluster (Package manager)

This example shows how to migrate a v4.4 cluster to v5.x using a package manager.
Migration to Neo4j 5 is supported only from v4.4.

It is recommended to read the following pages before continuing:

The following example steps assume that the environment has three cluster members running v4.4 and that the Neo4j DBMS is running as a systemd service. They use the RPM package manager with Yum as a front end. However, the steps for other package and service managers are very similar.

For more information on how to configure the Neo4j Yum repository, see Operations Manual → Deploy Neo4j using the Neo4j RPM package. For more information about the supported package and service managers, see Operations Manual → Linux installation.

In an RPM setup, a package update only replaces the binaries located in /usr/share/neo4j. The remaining files, such as the Neo4j config, database store files, state files, certificates, and plugins, are left unchanged. The Neo4j /bin directory must be on the user's PATH (the default for the RPM package).

Prepare the v4.4 cluster for migration

Prepare the v4.4 cluster for migration by recreating the indexes and the index-backed constraints to match the new index types, and by backing up each of your databases.

Recreate indexes and index-backed constraints

In v5.0, the BTREE index type is no longer available. Therefore, it is recommended to recreate all your BTREE indexes, and all index-backed constraints that use the index providers native-btree-1.0 or lucene+native-3.0, before switching to v5.x. During the migration, v5.x checks whether each BTREE index and index-backed constraint has an equivalent index type and provider, and then drops the BTREE versions.

What type of index to use instead of BTREE?

In most cases, RANGE indexes can replace BTREE. However, there might be occasions when a different index type is more suitable, such as:

  • Use POINT indexes if the property value type is point and distance or bounding box queries are used for the property.

  • Use TEXT indexes if the property value type is text and the values can be larger than 8 KB.

  • Use TEXT indexes if the property value type is text and CONTAINS and ENDS WITH are used in queries for the property.

If more than one of the conditions is true, it is possible to have multiple indexes of different index types on the same schema. For more information on each index type, see Operations Manual v5.0 → Index configuration.

Steps

  1. Recreate each of your BTREE indexes on the same schema but using the new type (RANGE, POINT, or TEXT) as per your use case. The following example creates a range index on a single property for all nodes with a particular label:

    CREATE RANGE INDEX range_index_name FOR (n:Label) ON (n.prop1)
  2. Recreate each of your index-backed constraints that use the index provider native-btree-1.0 or lucene+native-3.0 on the same schema but with the new provider. The following example creates a unique node property constraint on a single property for all nodes with a particular label. The backing index is of type RANGE with the range-1.0 index provider.

    CREATE CONSTRAINT constraint_with_provider FOR (n:Label) REQUIRE (n.prop1) IS UNIQUE OPTIONS {indexProvider: 'range-1.0'}
  3. Run SHOW INDEXES to verify that the indexes have been populated and constraints have been created with the correct index provider.

For more information about creating indexes, see Cypher Manual → Creating indexes.
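
When many BTREE indexes need replacing, the CREATE statements from step 1 can be generated rather than typed by hand. The following is a minimal shell sketch under stated assumptions: the Label:property pairs and the range_<Label>_<property> naming pattern are hypothetical and do not come from Neo4j or from any real schema. The script only prints Cypher, so the output can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: print one CREATE RANGE INDEX statement per "Label:property" pair.
# The pairs and the range_<Label>_<property> name pattern are illustrative
# assumptions; replace them with your own schema and naming rules.

range_index_stmt() {
  label=${1%%:*}   # text before the first colon
  prop=${1#*:}     # text after the first colon
  printf 'CREATE RANGE INDEX range_%s_%s FOR (n:%s) ON (n.%s);\n' \
    "$label" "$prop" "$label" "$prop"
}

# Hypothetical schema entries used only for demonstration.
for pair in "Person:name" "Movie:title"; do
  range_index_stmt "$pair"
done
```

The printed statements follow the same form as the CREATE RANGE INDEX example in step 1. POINT and TEXT indexes would need their own statement templates.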

Back up each of your databases

You can back up all databases from a single v4.4 cluster member.

To ensure the databases do not get updated during the backup, put them into read-only mode using the Cypher command ALTER DATABASE <databasename> SET ACCESS READ ONLY. Run the Cypher command SHOW DATABASES YIELD * and choose the member that is up-to-date with the last committed transaction on all databases as a backup source.

On an up-to-date member of the v4.4 cluster

  1. Create a directory to store backups. These steps use /migration-backups.

  2. Run the neo4j-admin backup command to back up each of your databases.

    • All databases that you want to back up must be online.

    • The command must be invoked as the neo4j user to ensure the appropriate file permissions.

    • Use the option --include-metadata=all to include all roles and users associated with each of your databases.

    /usr/bin/neo4j-admin backup --database=<databasename> --backup-dir=/migration-backups --include-metadata=all
  3. Ensure that you have successfully backed up all your databases. The result is a folder for each database, called <databasename> and located in the /migration-backups folder, and a metadata script for each database, located in /migration-backups/<databasename>/tools/metadata_script.cypher. For more information about the neo4j-admin backup command, see Operations Manual 4.4 → Back up an online database.
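
If there are several databases, the backup command from step 2 can be repeated in a loop. The sketch below assumes two database names (neo4j and a hypothetical sales database) and introduces a RUN variable that is purely a convention of this script: by default the commands are only printed, and with RUN=1 they are executed as the neo4j user. The flags are the ones shown in step 2.

```shell
#!/bin/sh
# Sketch: back up several databases in one pass.
# BACKUP_DIR defaults to the directory used in this guide.
BACKUP_DIR="${BACKUP_DIR:-/migration-backups}"

# Print the backup command for one database (flags copied from step 2).
backup_cmd() {
  printf '/usr/bin/neo4j-admin backup --database=%s --backup-dir=%s --include-metadata=all\n' \
    "$1" "$BACKUP_DIR"
}

# Dry run by default. RUN=1 (a convention of this sketch, not a Neo4j
# setting) executes each command as the neo4j user and stops on failure.
backup_all() {
  for db in "$@"; do
    if [ "${RUN:-0}" = "1" ]; then
      sudo -u neo4j sh -c "$(backup_cmd "$db")" || return 1
    else
      backup_cmd "$db"
    fi
  done
}

# "sales" is a hypothetical database name used only for illustration.
backup_all neo4j sales
```

Printing first makes it easy to confirm the database names and target directory before anything is written to disk.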

On each member of the v4.4 cluster

  1. Stop the Neo4j DBMS.

    sudo systemctl stop neo4j

    It is important to verify that the database shutdown process has finished successfully and that the database is cleanly shut down. You can check the neo4j.log using journalctl -e -u neo4j.

  2. Delete all the content in /var/lib/neo4j/data/ to clear the old state of the databases.

Set up the v5.x cluster

Prepare the v5.x cluster for the migration

On each server

  1. Make sure that you are using Java 17.

  2. (Optional) If you are using custom plugins, ensure they are updated and compatible with Java 17.

  3. Update the Neo4j DBMS:

    sudo yum update neo4j
  4. Migrate the v4.4 configuration file to a 5.x-compatible format.

    The Neo4j Admin commands must be invoked as the same user that Neo4j runs as. By default, this user is called neo4j. This guarantees that Neo4j has full rights to start and work with the database files you use.

    /usr/bin/neo4j-admin server migrate-configuration
  5. Verify that the configuration file looks as expected.

    It is strongly recommended to read up on the new 5.x settings to fully take advantage of the 5.x features.

    Refer to Operations Manual → Set up a cluster and update the settings accordingly. In particular:

    • Make sure all the settings within the dbms and database namespaces are the same across all servers. The server settings are the only settings that may vary.

    • It is required to explicitly set a port for all advertised addresses (except server.default_advertised_address) as they will no longer default to the corresponding listen address port.

  6. Start each v5.x cluster server:

    sudo systemctl start neo4j
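
The requirement in step 5 that every advertised address carries an explicit port can be checked mechanically before starting the servers. The following is a rough sketch, assuming that the relevant settings all end in advertised_address and use the key=value form. It reads configuration lines on standard input and prints the names of the settings whose value has no trailing :port. It is a plain text filter, not a Neo4j tool, and it does not handle unbracketed IPv6 literals.

```shell
#!/bin/sh
# Sketch: list advertised-address settings that lack an explicit port.
# Assumes key=value lines; server.default_advertised_address is skipped
# because it is the one setting that does not need a port.

missing_ports() {
  awk -F= '
    /^[ \t]*#/ { next }                                # skip comment lines
    $1 ~ /advertised_address[ \t]*$/ &&
    $1 !~ /server\.default_advertised_address/ &&
    $2 !~ /:[0-9]+[ \t]*$/ {
      gsub(/[ \t]/, "", $1)                            # trim the key
      print $1
    }
  '
}

# Demonstration input with hypothetical host names.
missing_ports <<'EOF'
server.bolt.advertised_address=server01.example.com
server.http.advertised_address=server01.example.com:7474
server.default_advertised_address=server01.example.com
EOF
```

To check a real file, redirect it into the function, for example missing_ports < /etc/neo4j/neo4j.conf, assuming the configuration file is in that location on your system. Empty output means every advertised address found has a port.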

Migrate the database backups

Migrate the database backups to only one of the new servers.

On one v5.x server (server a)

  1. Use the neo4j-admin database restore command to restore each of your databases except the system database.

    If neo4j is the default database and you want to restore the 4.4 neo4j database backup, first run DROP DATABASE neo4j IF EXISTS.

    /usr/bin/neo4j-admin database restore <databasename> --from-path=/migration-backups/<databasename>
  2. Migrate the <databasename> database:

    /usr/bin/neo4j-admin database migrate <databasename>
  3. Run SHOW SERVERS YIELD address, serverId to retrieve the server ID for server a.

  4. Recreate each of your migrated databases using one of the options:

    • If you have set the number of primaries and secondaries for your databases using initial.dbms.default_primaries_count (default is 1) and initial.dbms.default_secondaries_count (default is 0) settings, run the command:

      CREATE DATABASE <databasename> OPTIONS {existingData: 'use', existingDataSeedInstance: '[ServerId for a]'}
    • If you have not specified the number of primaries and secondaries for your databases, or if you wish to override the default values, specify the topology when recreating the databases:

      CREATE DATABASE <databasename> TOPOLOGY [desired number of primaries] PRIMARIES [desired number of secondaries] SECONDARIES OPTIONS {existingData: 'use', existingDataSeedInstance: '[ServerId for a]'}

      If you do not specify the number of primaries and secondaries, all databases will be on the same server.

  5. (Optional) Restore the roles and privileges associated with each of your databases by running the respective metadata script /usr/data/scripts/databasename/restore_metadata.cypher, which the neo4j-admin database restore command outputs, using Cypher Shell:

    Using cat (UNIX)
    cat data/scripts/databasename/restore_metadata.cypher | bin/cypher-shell -u user -p password -a ip_address:port -d system --param "database => 'databasename'"
  6. Run REALLOCATE DATABASES to evenly distribute your databases among the servers.
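
With more than one database, steps 1, 2, and 4 repeat for each name. The sketch below prints the restore command, the migrate command, and the CREATE DATABASE statement for each database so they can be reviewed and run in order. It assumes two database names (neo4j and a hypothetical sales database), a SEED_ID variable that you set to the server ID returned by SHOW SERVERS, and the OPTIONS form of CREATE DATABASE used in the topology variant above. Nothing is executed.

```shell
#!/bin/sh
# Sketch: print the per-database commands for restore, migrate, and recreate.
BACKUP_DIR="${BACKUP_DIR:-/migration-backups}"
# SEED_ID is a variable of this sketch; set it to the server ID of server a.
SEED_ID="${SEED_ID:-[ServerId for a]}"

restore_cmd() {
  printf '/usr/bin/neo4j-admin database restore %s --from-path=%s/%s\n' \
    "$1" "$BACKUP_DIR" "$1"
}

migrate_cmd() {
  printf '/usr/bin/neo4j-admin database migrate %s\n' "$1"
}

# $1 = database name, $2 = server ID of the server holding the restored store
create_stmt() {
  printf "CREATE DATABASE %s OPTIONS {existingData: 'use', existingDataSeedInstance: '%s'};\n" \
    "$1" "$2"
}

# "sales" is a hypothetical database name used only for illustration.
for db in neo4j sales; do
  restore_cmd "$db"
  migrate_cmd "$db"
  create_stmt "$db" "$SEED_ID"
done
```

The two neo4j-admin lines are shell commands to run as the neo4j user, while the CREATE DATABASE line is Cypher to run against the system database. If you need a specific topology, add the TOPOLOGY clause from the second option in step 4 to the printed statement.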

Monitor the logs

The neo4j.log file contains information on how many steps the migration will involve and how far it has progressed. You can monitor this log using journalctl -e -u neo4j.