Upgrade a Causal Cluster to 4.4

This section describes how to upgrade a Neo4j Causal Cluster.

You can upgrade an existing Neo4j Causal Cluster either by performing a rolling upgrade or by upgrading it offline.

The prerequisites and the upgrade steps must be completed for each cluster member.

Offline upgrade

This variant is suitable for cases where a rolling upgrade is not possible.

It is recommended to perform a test upgrade in a production-like environment to estimate the duration of the downtime.

Prerequisites

  1. Ensure that you have completed all tasks on the Upgrade checklist.

    When performing a rolling upgrade to Neo4j 4.4, ensure that the version you are upgrading from is the latest release of 4.3; otherwise, the upgrade may fail.

Prepare for the upgrade

  1. Shut down all the cluster members (Cores and Read Replicas).

  2. Run neo4j-admin unbind on each cluster member to remove cluster state data.

  3. Install the Neo4j version that you want to upgrade to on each instance. For more information on how to install the distribution that you are using, see Operations Manual 4.4 → Installation.

  4. Replace the neo4j.conf file with the one that you have prepared for each instance in the section Prepare a new neo4j.conf file to be used by the new installation.

  5. Copy all the files used for encryption, such as the private key, the public certificate, and the contents of the trusted and revoked directories (located in <neo4j-home>/certificates/), to the new installation.

  6. Restore each of your databases and transactions in the new installation, including the system database, by either using neo4j-admin restore (online) or neo4j-admin load (offline), depending on your backup approach. If you are running a Debian/RPM distribution, you can skip this step.

    If your old installation has modified configurations starting with dbms.directories.* or the setting dbms.default_database, verify that the new neo4j.conf file is configured properly to find these directories.

  7. If you are using custom plugins, make sure they are updated and compatible with the new version, and place them in the /plugins directory.
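Steps 1 and 2 above can be sketched as a small shell helper to run on each cluster member. This is only a sketch under stated assumptions: the variables NEO4J_HOME and DRY_RUN and the function names run and stop_and_unbind are introduced here for illustration and are not part of Neo4j. By default the helper only prints the commands so that they can be reviewed first.

```shell
# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Stop the local cluster member, then remove its cluster state data.
# NEO4J_HOME (assumed variable) must point at the installation directory.
stop_and_unbind() {
  run "$NEO4J_HOME/bin/neo4j" stop &&
  run "$NEO4J_HOME/bin/neo4j-admin" unbind
}

# Example:
#   NEO4J_HOME=/var/lib/neo4j
#   stop_and_unbind               # prints the two commands
#   DRY_RUN=0; stop_and_unbind    # runs them
```

Printing before running is a deliberate choice here: unbind discards cluster state, so it is worth confirming the target installation before executing it.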

Upgrade your cluster

On one of the Cores

  1. Open the neo4j.conf file of the new installation and configure the following settings:

  2. Start Neo4j by running the following command from <neo4j-home>:

    bin/neo4j start

    The upgrade takes place during startup.

  3. Monitor the neo4j.log file for information on how many steps the upgrade will involve and how far it has progressed.

  4. When the upgrade finishes, stop the server.

  5. Open the neo4j.conf file and configure the following settings:

  6. Use neo4j-admin dump to make a dump of each of your databases and transactions, including the system database.

  7. Do not yet start the server.
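Steps 1 and 5 edit neo4j.conf before and after the store upgrade. The sketch below shows one way such edits could be scripted. Which settings to change is an assumption inferred from the rest of this page (dbms.allow_upgrade appears in the rolling upgrade steps, and dbms.mode=SINGLE is mentioned in Upgrade the system database); restoring dbms.mode=CORE afterwards is likewise an assumption, so check the values against the neo4j.conf you prepared. The function name set_conf is hypothetical.

```shell
# Hypothetical helper: set key=value in a neo4j.conf-style file.
# Any existing line for the key (active or commented out) is dropped,
# then the new value is appended at the end of the file.
set_conf() {
  key=$1
  value=$2
  file=$3
  # Escape dots so that the key is matched literally by grep -E.
  pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
  tmp=$(mktemp) || return 1
  grep -v -E "^[#[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so that the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Assumed usage for step 1 (before the first start):
#   set_conf dbms.allow_upgrade true   conf/neo4j.conf
#   set_conf dbms.mode          SINGLE conf/neo4j.conf
# Assumed usage for step 5 (after the upgrade has finished):
#   set_conf dbms.allow_upgrade false conf/neo4j.conf
#   set_conf dbms.mode          CORE  conf/neo4j.conf
```

Dropping commented-out copies of the key keeps the file free of stale duplicates, at the cost of removing any example lines shipped in the default configuration.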

On each of the other Cores

  1. Copy the database dumps you created on the first Core server to each of the other Cores.

  2. Use neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases, including the system database, with the ones you upgraded on the first Core server.

  3. Start each of the Core servers, including the first one, and verify that they join the cluster.
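The dump on the first Core and the load on the other Cores repeat the same command for every database, so the commands can be generated rather than typed. The following is a minimal sketch that prints them for review. DUMP_DIR, the archive naming scheme <database>.dump, and the function names are assumptions made for illustration; the load flags are the ones given above, and dump is assumed to accept --database and --to.

```shell
# Print one `neo4j-admin dump` command per database name (for the first Core).
print_dump_commands() {
  for db in "$@"; do
    echo "bin/neo4j-admin dump --database=$db --to=${DUMP_DIR:?}/$db.dump"
  done
}

# Print one `neo4j-admin load` command per database name (for the other Cores).
print_load_commands() {
  for db in "$@"; do
    echo "bin/neo4j-admin load --from=${DUMP_DIR:?}/$db.dump --database=$db --force"
  done
}

# Example, from <neo4j-home> with the server stopped:
#   DUMP_DIR=/tmp/dumps
#   print_dump_commands system neo4j        # review the commands
#   print_dump_commands system neo4j | sh   # execute them
```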

For each Read Replica

  1. Start the Read Replica and wait for it to catch up with the rest of the cluster members.

    (Optional) An empty Read Replica eventually gets a full copy of all data from the other cluster members, but catching up may take some time. To speed up the process, load the data before starting the Read Replica by using neo4j-admin load --from=<archive-path> --database=<database> --force to replace each of your databases, including the system database, with the upgraded ones.

  2. Verify that the Read Replicas join the cluster.

Post-upgrade

It is recommended to perform a full backup, using an empty target directory.

Rolling upgrade

A rolling upgrade is a zero-downtime method for upgrading a Causal Cluster. You upgrade one member at a time, while the rest of the members are running. However, if the cluster loses quorum during a rolling upgrade and cannot be recovered, downtime may be required to perform disaster recovery.

Recommendations

  • The critical point during the upgrade is knowing when it is safe to switch off the original member.
    It is highly recommended to monitor the status endpoint before each removal, in order to decide which member to switch off and when it is safe to do so.

  • To reduce the risk of failure during a rolling upgrade, make sure the cluster is not under heavy load during the upgrade. If possible, disable writes entirely, which is the safest option.

  • There should be no changes to database administration during a rolling upgrade. For more information, see Operations Manual 4.4 → Manage databases.

  • Remove dbms.record_format from neo4j.conf to avoid any accidental cross-format migration.
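The first recommendation, monitoring the status endpoint, can be partly scripted. The sketch below checks a status document for two fields before a member is considered ready. It rests on assumptions that should be verified against the Operations Manual and your own server output: that the endpoint returns JSON containing the fields healthy and participatingInRaftGroup, and that it is reachable at a URL like the one in the usage comment. The function name member_looks_ready is hypothetical, and the check is a plain text match rather than a JSON parser, so it does not replace a manual review of the full status output.

```shell
# Succeeds when a status document (read from stdin) reports both
# "healthy": true and "participatingInRaftGroup": true.
# The field names are assumptions; verify them against your server's output.
member_looks_ready() {
  json=$(cat)
  printf '%s' "$json" | grep -Eq '"healthy"[[:space:]]*:[[:space:]]*true' &&
  printf '%s' "$json" | grep -Eq '"participatingInRaftGroup"[[:space:]]*:[[:space:]]*true'
}

# Assumed usage; the host, port, and URL path are placeholders to adapt:
#   curl -s http://core1:7474/db/neo4j/cluster/status | member_looks_ready \
#     && echo "core1 is healthy and part of the Raft group"
```

Running the check against every remaining member before shutting one down gives a simple gate, although deciding that removal is safe still depends on the cluster size and on how far each member has caught up.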

Rolling upgrade for a fixed number of servers

This variant is suitable for deployments where there is a fixed number of servers and they have to be updated in-place.

When performing a rolling upgrade for a fixed number of servers, it is not possible to increase the cluster size. Therefore, the cluster fault tolerance level will be reduced while replacing the members.

Prerequisites

  1. Ensure that you have completed all tasks on the Upgrade checklist.

    When performing a rolling upgrade to Neo4j 4.4, ensure that the version you are upgrading from is the latest release of 4.3; otherwise, the upgrade may fail.

  2. Verify that all databases are online by running SHOW DATABASES in Cypher® Shell or Neo4j Browser. Offline databases can be started using START DATABASE [database-name].

    All databases must be started before you start a rolling upgrade. If you have to keep a database inaccessible during the rolling upgrade, you can disable access to it by using the following command:

    DENY ACCESS ON DATABASE [database-name] TO PUBLIC

    You must never run DENY ACCESS ON DATABASE system TO PUBLIC or DENY ACCESS ON DATABASE * TO PUBLIC because you will lock yourself out of the system database. If you do lock yourself out, follow the disable authentication steps in the Operations Manual to recover and prevent outside access to the instance or cluster.

  3. Ensure that the databases cannot be stopped, created, or dropped during the rolling upgrade by using the following commands:

    DENY STOP ON DATABASE * TO PUBLIC
    DENY DATABASE MANAGEMENT ON DBMS TO PUBLIC

Upgrade the cluster

You upgrade one cluster member at a time, while the rest of the members are running.

If the cluster loses quorum during a rolling upgrade and cannot be recovered, downtime may be required to perform disaster recovery.

For each cluster member

  1. (Recommended) Use the process described in the status endpoint to evaluate whether it is safe to remove the old instance.

  2. Shut down the instance.

  3. Install the Neo4j version that you want to upgrade to. For more information on how to install the distribution that you are using, see Operations Manual 4.4 → Installation.

  4. Replace the neo4j.conf file with the one that you have prepared for this instance in the section Prepare a new neo4j.conf file to be used by the new installation. Then set dbms.allow_upgrade=true to allow the automatic store upgrade.

  5. Start the new instance and wait for it to catch up with the rest of the cluster members.

  6. Verify that the new instance has successfully joined the cluster and caught up with the rest of the members, by using the status endpoint.

  7. In the neo4j.conf file, configure the following settings:

    1. Set dbms.allow_upgrade=false to disable the automatic store upgrade.

    2. Restore any custom value of dbms.record_format if it was previously removed.

Because Read Replicas are not part of the cluster consensus group, their replacement during an upgrade does not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.

Upgrade the system database

In 4.x versions, Neo4j uses a shared system database, which holds complex information such as the security configuration for users, roles, and their privileges. The structure of the graph contained in the system database changes with each new version of Neo4j as the capabilities of the DBMS grow. Therefore, each time a Neo4j deployment is upgraded, the contents, or schema, of the system database must be transformed as well.

When performing an offline upgrade of a single deployment or a Causal Cluster, these changes happen automatically as a consequence of configuring dbms.mode=SINGLE (see Prepare for the upgrade and Upgrade your cluster). However, during a rolling upgrade you never start instances with dbms.mode=SINGLE, so the system database cannot be updated automatically.

Any specific metrics that you want to enable must be specified in the metrics.filter setting.
For more information, see Operations Manual 4.4 → Enable metrics logging.

Compatibility and synchronization

In a Causal Cluster of many instances, while you upgrade each instance in turn, there is a period during which the cluster is composed of some old and some new instances. A single system database is consistently replicated across the entire cluster. As a result, it is not possible to have its schema structured according to the needs of the new Neo4j version on some instances and the old version on others.

When the system database is not up-to-date with the Neo4j version of a given instance, that instance runs in compatibility mode. This means that capabilities common to both Neo4j versions continue to work, but the features that require the new schema are disabled. For example, if you attempt to grant a new privilege not supported in the old schema, you receive an error and the grant fails. Therefore, when the rolling upgrade finishes, you must manually upgrade the system database schema in order to access all new features.

If the system database’s schema is too old to allow compatibility mode, the server will not start. For more information, see Troubleshooting.

Manually trigger an upgrade of the system database

  1. On one of the cluster members, call the procedure dbms.upgradeStatus() to determine whether an upgrade is necessary:

    CALL dbms.upgradeStatus();
    +-------------------------------------------------------------------------------------------------------------------------+
    | status             | description                                                                | resolution            |
    +-------------------------------------------------------------------------------------------------------------------------+
    | "REQUIRES_UPGRADE" | "The sub-graph is supported, but is an older version and requires upgrade" | "CALL dbms.upgrade()" |
    +-------------------------------------------------------------------------------------------------------------------------+

    For the full list of possible status values, see Status codes for dbms.upgradeStatus.

  2. On one of the cluster members, perform the upgrade by calling the procedure dbms.upgrade() against the system database:

    CALL dbms.upgrade();
    +---------------------------+
    | status    | upgradeResult |
    +---------------------------+
    | "CURRENT" | "Success"     |
    +---------------------------+

    Since Neo4j uses a shared system database, the upgraded system database will replicate across the entire cluster. If the upgrade fails for some reason, the status will not change, and the upgradeResult field will describe which components have failed to upgrade.
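The two calls above can be combined so that dbms.upgrade() is only attempted when the status says it is needed. This is a sketch under assumptions: cypher-shell is on the PATH and accepts the -d and --format plain options, connection and authentication options are omitted, and the function name requires_upgrade is hypothetical.

```shell
# Succeeds when the output of CALL dbms.upgradeStatus() (read from stdin)
# contains the "REQUIRES_UPGRADE" status shown above.
requires_upgrade() {
  grep -q '"REQUIRES_UPGRADE"'
}

# Assumed usage (connection and authentication options omitted):
#   if cypher-shell -d system --format plain 'CALL dbms.upgradeStatus();' | requires_upgrade; then
#     cypher-shell -d system --format plain 'CALL dbms.upgrade();'
#   fi
```

Any other status is left for manual inspection, since the appropriate resolution differs per status value.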

Post-upgrade steps

The following steps must be performed after a rolling upgrade.

  1. Restore the privilege of the PUBLIC role to stop databases:

    REVOKE DENY STOP ON DATABASE * FROM PUBLIC

  2. Restore the privilege of the PUBLIC role to create and drop databases:

    REVOKE DENY DATABASE MANAGEMENT ON DBMS FROM PUBLIC

  3. (Optional) If you started offline databases during the preparation phase for the rolling upgrade, stop each of them to restore its original state:

    STOP DATABASE [database-name]

  4. (Recommended) Perform a full backup, using an empty target directory.
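The requirement that the backup target be empty is easy to overlook, so it can be checked before the backup command is issued. The following is a minimal sketch: the function names are hypothetical, and the backup command is assumed to accept --backup-dir and --database, with any connection options left out.

```shell
# Succeeds only when the given path is an existing, empty directory.
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Print a full-backup command for one database, refusing non-empty targets.
print_backup_command() {
  backup_dir=$1
  database=$2
  if ! dir_is_empty "$backup_dir"; then
    echo "refusing: $backup_dir is not an empty directory" >&2
    return 1
  fi
  echo "bin/neo4j-admin backup --backup-dir=$backup_dir --database=$database"
}

# Example:
#   mkdir -p /backups/post-upgrade
#   print_backup_command /backups/post-upgrade neo4j
```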

Rolling upgrade for cloud infrastructure

This variant is suitable for deployments that use replaceable cloud or container resources. It follows the same steps as the rolling upgrade for a fixed number of servers, but you can add the new members before you shut down the old ones, thus preserving the cluster fault tolerance level.

Because Read Replicas are not part of the cluster consensus group, their replacement during the upgrade does not affect the cluster availability and fault tolerance level. However, it is still recommended to incrementally add Read Replicas for a structured and maintainable upgrade process.