Upgrade a Causal Cluster

This section describes how to upgrade a Neo4j Causal Cluster.

You can upgrade your existing Neo4j Causal Cluster either by performing a rolling upgrade or by upgrading it offline.

The prerequisites and the upgrade steps must be completed for each cluster member.

1. Offline upgrade

This variant is suitable for cases where a rolling upgrade is not possible.

It is recommended to perform a test upgrade in a production-like environment to estimate the duration of the downtime.

1.1. Prerequisites

  1. Verify that you have installed Java 11.

  2. Review the improvements and fixes that have been carried out in the version that you want to upgrade to. See the Neo4j 4.0 Change log.

  3. Ensure that you have completed all tasks on the Upgrade checklist.

1.2. Prepare for the upgrade

  1. Shut down all the cluster members (Cores and Read Replicas).

  2. Perform neo4j-admin unbind on each cluster member to remove cluster state data.

  3. Install the Neo4j version that you want to upgrade to on each instance. For more information on how to install the distribution that you are using, see Operations Manual v4.0 → Installation.

  4. Update the neo4j.conf file as per the notes that you have prepared for each instance in section Prepare a new neo4j.conf file to be used by the new installation.

  5. Copy the files used for encryption from the old installation to the new one.

  6. Restore each of your databases and transactions in the new installation, including the system database, by using either neo4j-admin restore (for online backups) or neo4j-admin load (for offline dumps), depending on your backup approach.

    If your old installation has modified configurations starting with dbms.directories.* or the setting dbms.default_database, verify that the new neo4j.conf file is configured properly to find these directories.

  7. If using custom plugins, make sure they are updated and compatible with the new version, and place them in the /plugins directory.
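Step 6 can be sketched as a small shell helper that prints the command for each database so that the list can be reviewed before anything is run. This is a minimal sketch, not a definitive procedure. It assumes that offline dumps are named <database>.dump, that online backups are kept in one sub-directory per database, and that neo4j-admin is on the PATH. The function name and the /mnt/dumps path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the command that brings each database into the new
# installation. Assumptions: dumps are named <database>.dump, backups live in
# one sub-directory per database, and neo4j-admin is on the PATH.

print_restore_commands() {
  local mode="$1" source_dir="$2"
  shift 2
  local db
  for db in "$@"; do
    case "$mode" in
      # Offline approach: load a dump created with neo4j-admin dump.
      load)    echo "neo4j-admin load --from=$source_dir/$db.dump --database=$db" ;;
      # Online approach: restore a backup created with neo4j-admin backup.
      restore) echo "neo4j-admin restore --from=$source_dir/$db --database=$db" ;;
      *)       echo "unknown mode: $mode (expected load or restore)" >&2; return 1 ;;
    esac
  done
}

# Example: a dump-based approach with the system database and one user database.
print_restore_commands load /mnt/dumps system neo4j
```

The printed commands omit --force. Add it only if an existing database in the new installation has to be overwritten.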

1.3. Upgrade your cluster

On one of the Cores
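  1. Open the neo4j.conf file of the new installation and configure the following settings:

    • Set dbms.allow_upgrade=true to allow the automatic store upgrade.
    • Set dbms.mode=SINGLE, so that the instance starts as a standalone server while it performs the upgrade.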

  2. Start Neo4j by running the following command from <neo4j-home>:

    bin/neo4j start

    The upgrade takes place during startup.

  3. Monitor the neo4j.log file for information on how many steps the upgrade will involve and how far it has progressed.

  4. When the upgrade finishes, stop the server.

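  5. Open the neo4j.conf file and configure the following settings:

    • Set dbms.allow_upgrade=false, or comment it out.
    • Set dbms.mode=CORE to re-enable Causal Clustering.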

  6. Use neo4j-admin dump to make a dump of each of your databases and transactions, including the system database.

  7. Do not yet start the server.
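The dump in step 6 has to be repeated for every database, so it may help to generate the commands from a list. The following is a minimal sketch that only prints the commands. It assumes that neo4j-admin is on the PATH and that the server is stopped, as required by step 4. The function name and the /mnt/upgraded-dumps directory are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) one neo4j-admin dump command per database.
# Assumptions: neo4j-admin is on the PATH; the target directory exists.

print_dump_commands() {
  local dump_dir="$1"
  shift
  local db
  for db in "$@"; do
    # One archive per database, named after the database.
    echo "neo4j-admin dump --database=$db --to=$dump_dir/$db.dump"
  done
}

# Example: the system database plus one user database.
print_dump_commands /mnt/upgraded-dumps system neo4j
```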

On each of the other Cores
  1. Copy the database dumps you created on the first Core server to each of the other Cores.

  2. Use neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases, including the system database, with the ones you upgraded on the first Core server.

  3. Start each of the Core servers, including the first one, and verify that they form a cluster.
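Steps 1 and 2 can be sketched as a helper that prints the copy and load commands for each remaining Core. This is an illustration under stated assumptions: the hosts are reachable with scp and ssh, the dumps are named <database>.dump, and the dump directory exists at the same path on every host. The host names and the function name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the copy and load commands for each remaining Core.
# Assumptions: scp/ssh access, dumps named <database>.dump, and the same dump
# directory path on every host. Host names are hypothetical.

print_core_sync_commands() {
  local dump_dir="$1" databases="$2"
  shift 2
  local host db
  for host in "$@"; do
    for db in $databases; do
      # Step 1: copy the dump created on the first Core.
      echo "scp $dump_dir/$db.dump $host:$dump_dir/"
      # Step 2: replace the database on this Core with the upgraded one.
      echo "ssh $host neo4j-admin load --from=$dump_dir/$db.dump --database=$db --force"
    done
  done
}

print_core_sync_commands /mnt/upgraded-dumps "system neo4j" core2.example.com core3.example.com
```

Printing the commands first keeps the destructive --force step under the operator's control. The output can be run line by line once it has been checked.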

For each Read Replica

Start the Read Replica and wait for it to catch up with the rest of the cluster members.

(Optional) While an empty Read Replica will eventually get a full copy of all data from the other members of your cluster, this catching up may take some time. In order to speed up the process, you can load the data first by using neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases with the upgraded ones, including the system database.

Verify that the Read Replicas join the cluster.

1.4. Post-upgrade

It is recommended to perform a full backup, using an empty target directory.
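The empty-directory requirement can be checked before the backup is started. The following is a minimal sketch that refuses to continue when the target directory is missing or not empty, and otherwise prints the backup command for review. It assumes that neo4j-admin backup is available on the PATH and that the backup address is supplied by the operator. The function name is hypothetical, and localhost:6362 is only an example address.

```shell
#!/usr/bin/env bash
# Sketch: check that the backup target is empty, then print the backup command.
# Assumptions: neo4j-admin backup is on the PATH; the address is an example.

print_full_backup_command() {
  local backup_dir="$1" from_address="$2" database="$3"
  if [ ! -d "$backup_dir" ]; then
    echo "backup directory does not exist: $backup_dir" >&2
    return 1
  fi
  # An existing backup in the directory could lead to an incremental backup
  # instead of a full one, so require the directory to be empty.
  if [ -n "$(ls -A "$backup_dir")" ]; then
    echo "backup directory is not empty: $backup_dir" >&2
    return 1
  fi
  echo "neo4j-admin backup --backup-dir=$backup_dir --from=$from_address --database=$database"
}

# Example with a freshly created, and therefore empty, directory.
target="$(mktemp -d)"
print_full_backup_command "$target" localhost:6362 neo4j
rmdir "$target"
```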

2. Rolling upgrade

Rolling upgrade is a zero-downtime method for upgrading a Causal Cluster. You upgrade one member at a time, while the rest of the members are running. However, if during a rolling upgrade the cluster loses quorum and cannot be recovered, then downtime may be required to do a disaster recovery.

Recommendations
  • The critical point during the upgrade is knowing when it is safe to switch off the original member.
    It is highly recommended to monitor the status endpoint before each removal, in order to decide which member to switch off and when it is safe to do so.

  • To reduce the risk of failure during a rolling upgrade, make sure the cluster is not under heavy load during the upgrade. If possible, the safest option is to disable writes entirely.

  • There should be no changes to database administration during a rolling upgrade. For more information, see Operations Manual v4.0 → Manage databases.

2.1. Rolling upgrade for a fixed number of servers

This variant is suitable for deployments where there is a fixed number of servers and they have to be updated in-place.

When performing a rolling upgrade for a fixed number of servers, it is not possible to increase the cluster size. Therefore, the cluster fault tolerance level will be reduced while replacing the members.

2.1.1. Prerequisites

  1. Verify that you have installed Java 11.

  2. Review the improvements and fixes that have been carried out in the version that you want to upgrade to. See the Neo4j 4.0 Change log.

  3. Ensure that you have completed all tasks on the Upgrade checklist.

  4. Verify that all databases are online by running SHOW DATABASES in Cypher Shell or Neo4j Browser. Offline databases can be started using START DATABASE [database-name].

    All databases must be started before you start a rolling upgrade. If you have to keep a database inaccessible during the rolling upgrade, you can disable access to it by using the following command:

    DENY ACCESS ON DATABASE [database-name] TO [role1],[role2]

    All available roles can be queried with SHOW ROLES.

  5. Ensure that the databases cannot be stopped during the rolling upgrade by using the following command:

    DENY STOP ON DATABASE * TO admin

    This must be done for the admin role and all other roles that have the privilege to stop databases. For more information about listing privileges, see Cypher Manual v4.0 → Graph and sub-graph access control.
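Because the DENY STOP statement has to be issued once per role, the statements can be generated from the list of roles returned by SHOW ROLES. The following is a minimal sketch that prints the statements for review. The same helper also produces the REVOKE DENY STOP form used after the upgrade. The function name and the ops_team role are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: generate the Cypher statements that deny (before the upgrade) or
# restore (after the upgrade) the privilege to stop databases.
# Assumption: the role names passed in come from SHOW ROLES.

stop_privilege_statements() {
  local action="$1"
  shift
  local role
  for role in "$@"; do
    case "$action" in
      deny)   echo "DENY STOP ON DATABASE * TO $role;" ;;
      revoke) echo "REVOKE DENY STOP ON DATABASE * FROM $role;" ;;
      *)      echo "unknown action: $action (expected deny or revoke)" >&2; return 1 ;;
    esac
  done
}

# Example with the admin role and a hypothetical custom role.
stop_privilege_statements deny admin ops_team
```

The output can be pasted into Cypher Shell or Neo4j Browser, or piped into Cypher Shell connected to the system database, for example with stop_privilege_statements deny admin | cypher-shell -u neo4j -d system, assuming cypher-shell is on the PATH.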

2.1.2. Upgrade the cluster

You upgrade one cluster member at a time, while the rest of the members are running.

If during a rolling upgrade the cluster loses quorum and cannot be recovered, then downtime may be required to do a disaster recovery.

For each cluster member
  1. (Recommended) Use the status endpoint to evaluate whether it is safe to remove the old instance.

  2. Shut down the instance.

  3. Install the Neo4j version that you want to upgrade to. For more information on how to install the distribution that you are using, see Operations Manual v4.0 → Installation.

  4. Update the neo4j.conf file as per the notes that you have prepared for this instance in Prepare a new neo4j.conf file to be used by the new installation.

  5. Start the new instance and wait for it to catch up with the rest of the cluster members.

  6. Verify that the new instance has successfully joined the cluster and caught up with the rest of the members, by using the status endpoint.

Because Read Replicas are not part of the cluster consensus group, their replacement during an upgrade does not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.
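The catch-up check in steps 5 and 6 can be sketched by comparing the lastAppliedRaftIndex value reported by the status endpoint of the new member with that of a member that stayed online. This is a minimal sketch, not a definitive safety check. It assumes that the status responses have already been fetched as JSON, and the lag threshold is the operator's own choice. The function names and the sample values are hypothetical, and the URL in the comment is an assumption that may differ in your deployment.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a member has caught up by comparing
# lastAppliedRaftIndex values taken from two status responses.
# Assumption: the JSON was fetched beforehand, for example with
#   curl -s http://<host>:7474/db/<database>/cluster/status
# (the URL is an assumption and may differ in your deployment).

# Extract lastAppliedRaftIndex from a status JSON document read on stdin.
raft_index() {
  sed -n 's/.*"lastAppliedRaftIndex"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed when the member is at most max_lag entries behind the reference.
caught_up() {
  local reference_json="$1" member_json="$2" max_lag="${3:-0}"
  local ref idx
  ref="$(printf '%s\n' "$reference_json" | raft_index)"
  idx="$(printf '%s\n' "$member_json" | raft_index)"
  [ -n "$ref" ] && [ -n "$idx" ] && [ $((ref - idx)) -le "$max_lag" ]
}

# Example with hypothetical responses, trimmed to the relevant field.
reference='{"lastAppliedRaftIndex":120}'
upgraded='{"lastAppliedRaftIndex":118}'
if caught_up "$reference" "$upgraded" 5; then
  echo "caught up"
else
  echo "still behind"
fi
```

A missing or unparsable index is treated as "not caught up", which errs on the side of not shutting down the next member too early.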

2.1.3. Post-upgrade steps

The following steps must be performed after a rolling upgrade.

  1. Restore the privilege of the admin role to stop databases.

    REVOKE DENY STOP ON DATABASE * FROM admin

    This must be done for all roles for which the privilege to stop databases has been denied (see step 5 of the Prerequisites for a rolling upgrade for a fixed number of servers). For more information about listing privileges, see Cypher Manual v4.0 → Graph and sub-graph access control.

  2. (Optional) If you have started offline databases and denied some access rights during the preparation phase for a rolling upgrade, you should also restore them to the original state:

    1. Stop each of the databases by running the following command:

      STOP DATABASE [database-name]

    2. Re-enable access to the databases by running the following command:

      REVOKE DENY ACCESS ON DATABASE [database-name] FROM [role1],[role2]

2.2. Rolling upgrade for cloud infrastructure

This variant is suitable for deployments that use replaceable cloud or container resources. It follows the same steps as for the fixed number of servers, but you can add the new members before you shut down the old ones, thus preserving the cluster fault tolerance level.

Because Read Replicas are not part of the cluster consensus group, their replacement during the upgrade will not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.