Migrate a Causal Cluster to 4.0

This chapter describes the necessary steps to migrate a Causal Cluster from Neo4j 3.5 to 4.0.

The migration of a Causal Cluster from Neo4j 3.5 to 4.0 requires downtime. Therefore, it is recommended to perform a test migration in a production-like environment to get information on the duration of the downtime.

To migrate from 3.5.latest to a version beyond 4.0, the cluster must first be migrated to 4.0 and thereafter upgraded to the desired version. For more information, see Supported upgrade and migration paths.

The prerequisites and the migration steps must be completed for each cluster member.

Prerequisites

Ensure that you have completed all tasks on the Migration checklist.

Prepare for the migration

The strategy for migrating a cluster deployment is to complete a single migration on a single instance, as a standalone instance, and then use the migrated store to seed the remaining members of the cluster.

Remember, migration is a single event. Do not perform independent migrations on each of your instances. There must be a single migration event, and the migrated store becomes the source of truth for all other instances in the cluster. This is important because Neo4j generates a random store ID during migration; if each instance migrates independently, the cluster ends up with as many store IDs as it has instances, and Neo4j will fail to start. For this reason, some of the cluster migration steps are performed on a single instance while others are performed on all instances. Each step tells you where to perform the necessary actions.

At this stage, elect one instance to work on. This is the instance where the migration actually happens. The following steps state whether to perform them on the elected instance, on the remaining instances, or on all instances.

On each cluster member
  1. Verify that you have shut down all cluster members (Cores and Read Replicas). You can check the neo4j.log.

  2. Perform neo4j-admin unbind on each cluster member to remove cluster state data.
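
    For example, run the following from <neo4j-home> on each member (this removes the cluster state but leaves the database store intact):

    bin/neo4j-admin unbind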

  3. Install the Neo4j version that you want to migrate to on each instance. For more information on how to install the distribution that you are using, see Operations Manual 4.0 → Installation.

  4. Replace the neo4j.conf file with the one that you have prepared for each instance in section Prepare a new neo4j.conf file to be used by the new installation.

  5. Copy all the files used for encryption, such as private key, public certificate, and the contents of the trusted and revoked directories (located in <neo4j-home>/certificates/).
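
    For example, something like the following (the paths are illustrative; adjust them to your own 3.5 and 4.0 installations):

    cp -r /path/to/neo4j-3.5/certificates/. <neo4j-home>/certificates/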

  6. If you are using custom plugins, make sure they are updated and compatible with the new version, and place them in the /plugins directory.

On the elected instance
  1. Open the neo4j.conf file of the new installation and configure the following settings:

    • Uncomment dbms.allow_upgrade=true to allow automatic store migration. Neo4j will fail to start without this configuration.

    • Set dbms.mode=SINGLE. You need to do this because a migration is a single event that needs to happen on a standalone server.
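
    With both changes applied, the relevant lines in neo4j.conf on the elected instance should read:

    dbms.mode=SINGLE
    dbms.allow_upgrade=true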

Migrate the data

On the elected instance

Before migrating the data, you need to move the backup file to the data directory of Neo4j 4.0.

This step is not applicable if you have dbms.directories.data pointing to a directory outside of <neo4j-home>.

  1. Move the backup file into the new installation by running the neo4j-admin load command from <neo4j-home>:

    You must use this command; copying database files directly on the file system is not supported and may result in unwanted behaviour. If you are running a Debian/RPM distribution, you can skip this step.

    $NEO4J_HOME/bin/neo4j-admin load --from=$BACKUP_DESTINATION/<db_name>.dump --database=<db_name> --force

    The migration of users and roles from 3.5 to 4.0 is done automatically. Therefore, you do not have to move the data/dbms/ directory and contents to the new installation. The files in 3.5 will be parsed and the content added to the system database on the first startup of the Neo4j 4.0 DBMS.

  2. With the backup in place, initiate the migration by starting the elected instance:

    bin/neo4j start

    or

    systemctl start neo4j

    The migration takes place during startup. Your indexes are also automatically migrated to the most recent index provider during startup. However, the indexes must be repopulated after the migration. Queries cannot use an index while it is populating, so you may see a temporary performance hit. If your queries depend heavily on indexes, account for index population as part of the migration downtime.

    The neo4j.log file contains valuable information on how many steps the migration involves and how far it has progressed. Index populations are tracked in debug.log. For large migrations, it is a good idea to monitor these logs continuously.
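
    One way to follow index population is to run the following from Cypher Shell or Neo4j Browser; the state and populationPercent columns show how far each index has progressed:

    CALL db.indexes();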

  3. When the migration finishes, stop the server.

    bin/neo4j stop

    or

    systemctl stop neo4j

Prepare for seeding the cluster

On the elected instance
  1. Revert the neo4j.conf changes made before the migration: set dbms.mode back to CORE, and set dbms.allow_upgrade=false or comment it out.
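
    After the revert, the relevant lines on the elected instance should read:

    dbms.mode=CORE
    dbms.allow_upgrade=false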

  2. Take an offline backup of your newly migrated database and its transactions, along with the system database, using neo4j-admin dump. This backup will be used to seed the remaining instances of the cluster.

    bin/neo4j-admin dump --database=<db_name> --to=$BACKUP_DESTINATION/<db_name>.dump
    
    bin/neo4j-admin dump --database=system --to=$BACKUP_DESTINATION/system.dump

    Be aware that after you migrate, Neo4j Admin commands can differ slightly because Neo4j now supports multiple databases.

  3. Do not yet start the server.

Seed the cluster

On each of the remaining instances
  1. Copy the dumps created in Prepare for seeding the cluster to each of the remaining instances.
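
    How you transfer the dumps is up to you; for example, with scp (the host name is a placeholder):

    scp $BACKUP_DESTINATION/<db_name>.dump $BACKUP_DESTINATION/system.dump <remote-host>:$BACKUP_DESTINATION/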

  2. Once this is complete, use neo4j-admin load --from=<archive-path> --database=<db_name> --force to replace each of your databases, including the system database, with the ones migrated on the elected instance.

    bin/neo4j-admin load --from=$BACKUP_DESTINATION/<db_name>.dump --database=<db_name> --force
    
    bin/neo4j-admin load --from=$BACKUP_DESTINATION/system.dump --database=system --force

Start the cluster

On each cluster member, including the elected instance

Before continuing, make sure the following activities happened and were completed successfully:

  • The content of neo4j.conf is correct, and the required changes were applied on all instances.

  • A single migration event occurred on the elected instance.

  • A backup (via neo4j-admin dump) of the migrated store was performed on the elected instance.

  • The backup of the migrated store was transferred to the remaining instances.

  • The store was loaded on the remaining instances (via neo4j-admin load).

  • dbms.mode=CORE and dbms.allow_upgrade=false (or commented out) are set in the neo4j.conf of the elected instance.
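
A quick, illustrative way to confirm the configuration on the elected instance before starting:

    grep -E '^dbms\.(mode|allow_upgrade)' <neo4j-home>/conf/neo4j.conf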

  1. If everything on the list was successful, you can go ahead and start all instances of the cluster.

    bin/neo4j start

    or

    systemctl start neo4j
  2. If the migrated database is the default database, it is started automatically when the instance starts, and this step is not required. If the migrated database is not the default database, it is not started automatically. On one of the cluster members, run the following command in Neo4j Browser or Cypher® Shell to create and start it:

    CREATE DATABASE <db_name>;
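
    You can then verify that the database is up on all cluster members with:

    SHOW DATABASES;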

For each Read Replica
  3. Start the Read Replica and wait for it to catch up with the rest of the cluster members.

    (Optional) While an empty Read Replica will eventually get a full copy of all data from the other cluster members, catching up may take some time. To speed up the process, you can load the data first by using neo4j-admin load --from=<archive-path> --database=<db_name> --force to replace each of your databases with the migrated one.

  4. Verify that the Read Replicas have joined the cluster.
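
    One way to verify is to run the following from Cypher Shell or Neo4j Browser and confirm that the Read Replicas are listed:

    CALL dbms.cluster.overview();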

Post-migration

It is recommended to perform a full backup, using an empty target directory.
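
For example, a minimal invocation of the 4.0 online backup (the target directory and database name are illustrative, and the backup service must be enabled on the instance):

    bin/neo4j-admin backup --backup-dir=$BACKUP_DESTINATION/post-migration --database=<db_name>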