Upgrade a Causal Cluster

This section describes how to upgrade a Neo4j Causal Cluster.

You can upgrade an existing Neo4j Causal Cluster either by performing a rolling upgrade or by upgrading it offline.

The prerequisites and the upgrade steps must be completed for each cluster member.

1. Offline upgrade

This variant is suitable for cases where a rolling upgrade is not possible.

It is recommended to perform a test upgrade on a production-like environment in order to estimate the duration of the downtime.

1.1. Prerequisites

  1. Verify that you have installed Java 11.

  2. Review the improvements and fixes that have been carried out in the version that you want to upgrade to. See the Neo4j 4.1 Change log.

  3. Ensure that you have completed all tasks on the Upgrade checklist.

1.2. Prepare for the upgrade

  1. Shut down all the cluster members (Cores and Read Replicas).

  2. Perform neo4j-admin unbind on each cluster member to remove cluster state data.

  3. Install the Neo4j version that you want to upgrade to on each instance. For more information on how to install the distribution that you are using, see Operations Manual v4.1 → Installation.

  4. Update the neo4j.conf file as per the notes that you have prepared for each instance in section Prepare a new neo4j.conf file to be used by the new installation.

  5. Copy the files used for encryption from the old installation to the new one.

  6. Restore each of your databases and transactions in the new installation, including the system database, by either using neo4j-admin restore (online) or neo4j-admin load (offline), depending on your backup approach.

    If your old installation has modified configurations starting with dbms.directories.* or the setting dbms.default_database, verify that the new neo4j.conf file is configured properly to find these directories.

  7. If you use custom plugins, make sure that they are updated and compatible with the new version, and place them in the plugins directory of the new installation.
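The preparation steps above can be sketched as a sequence of commands run on each cluster member. The paths and database names below are placeholders for your own environment, and the choice between neo4j-admin restore and neo4j-admin load depends on whether your backups are online backups or offline dumps.

```shell
# Run from <neo4j-home> after the member has been shut down.
# All paths and database names are examples.

# Step 2: remove the cluster state data from the old installation.
bin/neo4j-admin unbind

# Step 6: in the NEW installation, restore each database, including system.
# From an online backup:
bin/neo4j-admin restore --from=/backups/system --database=system
bin/neo4j-admin restore --from=/backups/neo4j --database=neo4j
# Or from an offline dump:
bin/neo4j-admin load --from=/dumps/system.dump --database=system --force
bin/neo4j-admin load --from=/dumps/neo4j.dump --database=neo4j --force
```

Repeat the restore or load step for every database in your deployment.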

1.3. Upgrade your cluster

On one of the Cores
  1. Open the neo4j.conf file of the new installation and set dbms.mode=SINGLE, so that the instance temporarily starts as a standalone server:

  2. Start Neo4j by running the following command from <neo4j-home>:

    bin/neo4j start

    The upgrade takes place during startup.

  3. Monitor the neo4j.log file for information on how many steps the upgrade will involve and how far it has progressed.

  4. When the upgrade finishes, stop the server.

  5. Open the neo4j.conf file and revert dbms.mode to CORE, restoring the original cluster configuration:

  6. Use neo4j-admin dump to make a dump of each of your databases and transactions, including the system database.

  7. Do not yet start the server.
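The dump step above can look as follows, run from <neo4j-home> of the new installation while the server is stopped. The database names and the target directory are examples; repeat the command for every database, including system.

```shell
# Dump each database, including system, to a directory that can be
# copied to the other Cores. Paths and database names are placeholders.
bin/neo4j-admin dump --database=system --to=/dumps/system.dump
bin/neo4j-admin dump --database=neo4j --to=/dumps/neo4j.dump
```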

On each of the other Cores
  1. Copy the database dumps you created on the first Core server to each of the other Cores.

  2. Use neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases, including the system database, with the ones you upgraded on the first Core server.

  3. Start each of the core servers, including the first one, and verify that they join in a cluster.
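For example, assuming the dumps created on the first Core have been copied to /dumps on each of the other Cores (paths and database names are placeholders):

```shell
# Replace each database, including system, with the upgraded dumps.
bin/neo4j-admin load --from=/dumps/system.dump --database=system --force
bin/neo4j-admin load --from=/dumps/neo4j.dump --database=neo4j --force

# Then start every Core, including the first one.
bin/neo4j start
```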

For each Read Replica

Start the Read Replica and wait for it to catch up with the rest of the cluster members.

(Optional) Although an empty Read Replica will eventually receive a full copy of all data from the other cluster members, catching up can take some time. To speed up the process, you can seed the data first by using neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases, including the system database, with the upgraded ones.

Verify that the Read Replicas join the cluster.

1.4. Post-upgrade

It is recommended to perform a full backup, using an empty target directory.
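For example, using the Enterprise online backup tool against a running instance (the host configuration, backup directory, and database name are placeholders, and the backup service must be enabled on the instance):

```shell
# Create an empty target directory and take a full backup into it.
mkdir -p /backups/post-upgrade
bin/neo4j-admin backup --backup-dir=/backups/post-upgrade --database=neo4j
```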

2. Rolling upgrade

Rolling upgrade is a zero-downtime method for upgrading a Causal Cluster. You upgrade one member at a time, while the rest of the members are running. However, if during a rolling upgrade the cluster loses quorum and cannot be recovered, then downtime may be required to do a disaster recovery.

Recommendations
  • The critical point during the upgrade is knowing when it is safe to switch off the original member.
    It is highly recommended to monitor the status endpoint before each removal, in order to decide which member to switch off and when it is safe to do so.

  • To reduce the risk of failure during a rolling upgrade, make sure that the cluster is not under heavy load while the upgrade is in progress. If possible, the safest approach is to disable writes entirely.

  • There should be no changes to database administration during a rolling upgrade. For more information, see Operations Manual v4.1 → Manage databases.

2.1. Rolling upgrade for a fixed number of servers

This variant is suitable for deployments where there is a fixed number of servers and they have to be updated in-place.

When performing a rolling upgrade for a fixed number of servers, it is not possible to increase the cluster size. Therefore, the cluster fault tolerance level will be reduced while replacing the members.

2.1.1. Prerequisites

  1. Verify that you have installed Java 11.

  2. Review the improvements and fixes that have been carried out in the version that you want to upgrade to. See the Neo4j 4.1 Change log.

  3. Ensure that you have completed all tasks on the Upgrade checklist.

  4. Verify that all databases are online by running SHOW DATABASES in Cypher Shell or Neo4j Browser. Offline databases can be started using START DATABASE [database-name].

    All databases must be started before you start a rolling upgrade. If you have to keep a database inaccessible during the rolling upgrade, you can disable access to it by using the following command:

    DENY ACCESS ON DATABASE [database-name] TO PUBLIC

    When upgrading from Neo4j 4.0.x, you have to deny access for each role that has access to that particular database, as the PUBLIC role does not yet exist:

    DENY ACCESS ON DATABASE [database-name] TO [role1],[role2]

    All available roles can be queried with SHOW ROLES.

  5. Ensure that databases cannot be stopped, created, or dropped during the rolling upgrade by using the following commands:

    DENY STOP ON DATABASE * TO PUBLIC
    DENY DATABASE MANAGEMENT ON DBMS TO PUBLIC

    When upgrading from Neo4j 4.0.x, you can only disable the ability to stop databases:

    DENY STOP ON DATABASE * TO admin

    This must be done for the admin role and all other roles that have the privilege to stop databases. For more information about listing privileges, see Cypher Manual v4.1 → Graph and sub-graph access control.
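To find out which roles hold a given privilege, you can inspect privileges per role in Cypher Shell or Neo4j Browser. A brief example (the role name is an illustration):

```cypher
// List the privileges of a single role.
SHOW ROLE admin PRIVILEGES;

// Or list all privileges across all roles.
SHOW PRIVILEGES;
```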

2.1.2. Upgrade the cluster

You upgrade one cluster member at a time, while the rest of the members are running.

If during a rolling upgrade the cluster loses quorum and cannot be recovered, then downtime may be required to do a disaster recovery.

For each cluster member
  1. (Recommended) Use the process described for the status endpoint to evaluate whether it is safe to remove the old instance.

  2. Shut down the instance.

  3. Install the Neo4j version that you want to upgrade to. For more information on how to install the distribution that you are using, see Operations Manual v4.1 → Installation.

  4. Update the neo4j.conf file as per the notes that you have prepared for this instance in Prepare a new neo4j.conf file to be used by the new installation.

  5. Start the new instance and wait for it to catch up with the rest of the cluster members.

  6. Verify that the new instance has successfully joined the cluster and caught up with the rest of the members, by using the status endpoint.
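A minimal sketch of such a status check, assuming the 4.1 clustering status endpoint at http://<host>:7474/db/<database>/cluster/status. The JSON payload below is a hypothetical response inlined for illustration, and the exact field names should be verified against the status endpoint documentation.

```shell
# In practice, fetch the status of a member, e.g.:
#   curl -s http://localhost:7474/db/neo4j/cluster/status
# A hypothetical response payload, inlined here for illustration:
status='{"core":true,"participatingInRaftGroup":true,"isHealthy":true,"millisSinceLastLeaderMessage":120}'

# Crude check without a JSON parser: proceed only when the member reports
# itself as healthy and participating in the Raft group.
if printf '%s' "$status" | grep -q '"isHealthy":true' \
   && printf '%s' "$status" | grep -q '"participatingInRaftGroup":true'; then
  echo "member looks healthy - safe to continue"
else
  echo "do not shut down the next member yet"
fi
```

In a real rolling upgrade you would run this check against every remaining member before each shutdown, not only the one being replaced.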

Because Read Replicas are not part of the cluster consensus group, their replacement during an upgrade does not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.

2.1.3. Upgrade the system database

In 4.x versions, Neo4j uses a shared system database, which holds information such as the security configuration for users, roles, and their privileges. The structure of the graph contained in the system database changes with each new version of Neo4j as the capabilities of the DBMS grow. Therefore, each time a Neo4j deployment is upgraded, the contents of the system database, i.e. its schema, must be transformed as well. When performing an offline upgrade of a single deployment or a Causal Cluster, these changes happen automatically, as a consequence of configuring dbms.mode=SINGLE (see Prepare for the upgrade and Upgrade your cluster). However, when performing a rolling upgrade, you never start instances with the configuration value dbms.mode=SINGLE, which means that the system database cannot be updated automatically.

Compatibility and synchronization

With a causal cluster of many instances, while upgrading each instance in turn, there is a period during which the cluster is composed of some old and some new instances. A single system database is consistently replicated across the entire cluster. As a result, it is not possible to have its schema structured according to the needs of the new Neo4j version on some instances and the old version on others.

When the system database is not up-to-date with the Neo4j version of a given instance, that instance runs in compatibility mode. This means that capabilities common to both Neo4j versions continue to work, but features that require the new schema are disabled. For example, if you attempt to grant a new privilege that is not supported in the old schema, you will receive an error and the grant will fail. Therefore, when the rolling upgrade finishes, you must manually upgrade the system database schema in order to access all new features.

If the system database’s schema is too old to allow compatibility mode, the server will not start. For more information, see Troubleshooting.

Manually trigger an upgrade of the system database
  1. Determine whether an upgrade is necessary by calling the procedure dbms.upgradeStatus() against the system database:

    CALL dbms.upgradeStatus();
    +-------------------------------------------------------------------------------------------------------------------------+
    | status             | description                                                                | resolution            |
    +-------------------------------------------------------------------------------------------------------------------------+
    | "REQUIRES_UPGRADE" | "The sub-graph is supported, but is an older version and requires upgrade" | "CALL dbms.upgrade()" |
    +-------------------------------------------------------------------------------------------------------------------------+

    For the full list of possible status values, see Status codes for dbms.upgradeStatus.

  2. Perform the upgrade by calling the procedure dbms.upgrade() against the system database.

    CALL dbms.upgrade();
    +---------------------------+
    | status    | upgradeResult |
    +---------------------------+
    | "CURRENT" | "Success"     |
    +---------------------------+

    If the upgrade fails for some reason, the status will not change, and the upgradeResult field will describe which components have failed to upgrade.

2.1.4. Post-upgrade steps

The following steps must be performed after a rolling upgrade.

  1. Restore the privilege of the PUBLIC role to stop databases:

    REVOKE DENY STOP ON DATABASE * FROM PUBLIC
  2. Restore the privilege of the PUBLIC role to create and drop databases:

    REVOKE DENY DATABASE MANAGEMENT ON DBMS FROM PUBLIC
  3. (Optional) If you have started offline databases and denied some access rights during the preparation phase for a rolling upgrade, you should also restore them to the original state:

    1. Stop each of the databases by running the following command:

      STOP DATABASE [database-name]
    2. Re-enable access to the databases by running the following command:

      REVOKE DENY ACCESS ON DATABASE [database-name] FROM [role1],[role2]
  4. (Recommended) Perform a full backup, using an empty target directory.

2.2. Rolling upgrade for cloud infrastructure

This variant is suitable for deployments that use replaceable cloud or container resources. It follows the same steps as for the fixed number of servers, but you can add the new members before you shut down the old ones, thus preserving the cluster fault tolerance level. Because Read Replicas are not part of the cluster consensus group, their replacement during the upgrade will not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.