Rolling upgrade of Neo4j cluster using Helm chart across 2025–2026 releases

This example demonstrates how to perform a rolling upgrade of a Neo4j cluster between any two releases within the 2025–2026 cycle.

Architecture overview

Understanding how the Neo4j Helm chart deploys cluster members is essential for planning upgrades.

Each Neo4j cluster member (server) is deployed as a separate Helm release, each with its own StatefulSet containing a single replica (replicas: 1). This architecture differs from typical StatefulSet-based applications, where all replicas live in one StatefulSet.

For a 3-server cluster, this results in:

Helm Release: server-1  →  StatefulSet: server-1  →  Pod: server-1-0
Helm Release: server-2  →  StatefulSet: server-2  →  Pod: server-2-0
Helm Release: server-3  →  StatefulSet: server-3  →  Pod: server-3-0

This design enables per-server offline maintenance mode and independent lifecycle management, but it means:

  • helm upgrade must be run once per cluster server.

  • Each server’s PodDisruptionBudget (PDB) (if enabled) only covers its own single pod.

  • Cluster-wide disruption protection requires a separate, manually created PDB (see PDB section).
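Because each server is its own release, the upgrade command (shown later in the procedure) must be repeated per server. A minimal sketch that only prints the commands for a hypothetical 3-server cluster (the release names, chart version, and values file names are assumptions; adjust them to your deployment):

```shell
# Print (not run) one helm upgrade command per server release.
# Assumed names: server-1..server-3, chart version 2025.1.0, per-server values files.
CHART_VERSION="2025.1.0"
for release in server-1 server-2 server-3; do
  echo "helm upgrade ${release} neo4j/neo4j --version ${CHART_VERSION} -f ${release}-values.yaml"
done
```

In practice, run each command only after verifying cluster health, as described in the step-by-step procedure below.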

Rolling upgrade of a cluster

A rolling upgrade updates each cluster member (server) one at a time, keeping the cluster available throughout the process. You must ensure the cluster returns to a healthy state before upgrading the next server.

Prerequisites

  • All cluster servers must be running the same version before starting.

  • The cluster must be in a healthy state.

  • Update the Helm repo before starting: helm repo update neo4j.

Key settings that affect upgrades

image.customImage

  Default: unset (the chart uses the Neo4j image matching its appVersion).

  Overrides the Neo4j image. Use this when you need a specific image, for example from a private registry.

podSpec.terminationGracePeriodSeconds

  Default: 3600

  Time allowed for Neo4j to shut down gracefully. Neo4j flushes in-memory data to disk on shutdown, so do not reduce this without understanding the implications.

neo4j.offlineMaintenanceModeEnabled

  Default: false

  When true, restarts the pod without running Neo4j, allowing offline maintenance tasks such as neo4j-admin database dump. Set it back to false and run helm upgrade to resume normal operation.
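These settings can appear together in a server's values file. A minimal sketch (the values shown are the documented defaults, except customImage, which is a hypothetical private-registry image):

```yaml
# Hypothetical values fragment for one cluster server.
image:
  # Assumption: replace with the image you actually need to pin.
  customImage: my-registry.example.com/neo4j:2025.01.0
podSpec:
  terminationGracePeriodSeconds: 3600   # default; time for graceful shutdown
neo4j:
  offlineMaintenanceModeEnabled: false  # default; set true only for offline maintenance
```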

Step-by-step procedure

Repeat the following steps for each cluster member, one at a time.

  1. Check the cluster is healthy.

    Before upgrading a server, verify the cluster is healthy. Check that all servers are hosting their assigned databases (the query should return no results):

    kubectl exec <any-running-server>-0 -- cypher-shell -u neo4j -p <password> \
      "SHOW SERVERS YIELD name, hosting, requestedHosting, serverId WHERE requestedHosting <> hosting"

    Check that all databases are in their expected state (the query should return no results):

    kubectl exec <any-running-server>-0 -- cypher-shell -u neo4j -p <password> \
      "SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessage WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus, statusMessage"
  2. Upgrade a server.

    Upgrade to the target chart’s version (which includes the matching Neo4j image):

    helm upgrade <server-release-name> neo4j/neo4j --version <chart-version> -f <server-values>.yaml

    If you need to use a specific Neo4j image instead of the chart’s default, set image.customImage in the member’s values.yaml before running helm upgrade.

  3. Monitor the rollout.

    kubectl rollout status --watch --timeout=600s statefulset/<server-release-name>
  4. Verify the server has rejoined the cluster.

    Wait for the pod to be ready, then check the server’s state (should show Enabled and Available):

    kubectl exec <server-release-name>-0 -- cypher-shell -u neo4j -p <password> \
      "SHOW SERVERS"

    Confirm all databases on this server are in their expected state:

    kubectl exec <server-release-name>-0 -- cypher-shell -u neo4j -p <password> \
      "SHOW DATABASES YIELD name, address, currentStatus, requestedStatus WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus"
  5. Move to the next server.

    Once the cluster is healthy again, repeat from step 1 for the next server.

Monitor the logs

After each server restarts, monitor the logs for errors or warnings:

kubectl logs <server-release-name>-0 --follow
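To surface only problems, the log stream can be filtered with grep. A sketch against sample lines (the log messages here are illustrative, not real Neo4j output); in practice, pipe kubectl logs into the same grep:

```shell
# Keep only WARN/ERROR lines; the here-doc stands in for
# `kubectl logs <server-release-name>-0`.
grep -E "WARN|ERROR" <<'EOF'
2025-06-01 10:00:00.000+0000 INFO  Remote interface available.
2025-06-01 10:00:01.000+0000 WARN  Store size is approaching the disk quota.
2025-06-01 10:00:02.000+0000 ERROR Failed to join cluster member.
EOF
```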

Protecting cluster availability during worker node upgrades (PodDisruptionBudgets)

Kubernetes platforms that perform automated worker node upgrades (e.g., Kubermatic, GKE node auto-upgrades, EKS managed node groups) may drain and replace multiple nodes simultaneously. Without a PDB, all Neo4j pods can be evicted at once, causing cluster downtime.

Per-release PDB (built-in)

The Neo4j Helm chart includes an optional PDB that can be enabled per release:

podDisruptionBudget:
  enabled: true
  minAvailable: 1

However, because each cluster server is a separate Helm release with its own StatefulSet, each per-release PDB only protects a single pod. A minAvailable: 1 PDB on a single-pod StatefulSet effectively prevents that pod from being evicted at all, which blocks worker node draining entirely.

The per-release PDB is not suitable for controlling the pace of worker node rolling upgrades across a cluster.

Cluster-wide PDB (manual)

To ensure that at most one Neo4j pod is evicted at a time during worker node upgrades, create a single PDB that spans all cluster servers using a shared label.

All Neo4j pods in a cluster share the label helm.neo4j.com/neo4j.name: <neo4j.name>, where <neo4j.name> is the value set in neo4j.name in your values.yaml.
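For example, if neo4j.name is set to my-cluster (a hypothetical value), every pod in the cluster carries this label in its metadata:

```yaml
# Pod metadata as rendered by the chart (sketch; only the relevant label shown).
metadata:
  labels:
    helm.neo4j.com/neo4j.name: my-cluster
```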

Important considerations
  • Create the PDB before initiating worker node upgrades.

  • The PDB must be in the same namespace as the Neo4j pods.

  • If you add or remove cluster servers, review and update the PDB accordingly (especially if you use minAvailable).

  • This PDB is managed outside of the Helm chart lifecycle — you must create, update, and delete it manually or via your own automation.

Example: cluster-wide PDB for a 3-server cluster

If your cluster uses neo4j.name: my-cluster, create the following PDB:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: neo4j-cluster-pdb
  namespace: <namespace>
spec:
  minAvailable: 2
  selector:
    matchLabels:
      helm.neo4j.com/neo4j.name: "my-cluster"

This ensures that Kubernetes will only evict one Neo4j pod at a time, because at least two out of three must remain available. Worker node upgrade controllers (Kubermatic, GKE, etc.) respect this constraint and drain nodes sequentially.

Choosing minAvailable vs maxUnavailable

minAvailable: 2 (for a 3-server cluster)

  At least 2 servers must remain running at all times.

maxUnavailable: 1

  At most 1 server can be down at any time.

Both achieve the same result for a 3-server cluster. Use whichever is clearer for your operational model. Note that if you scale the cluster by adding more servers, a maxUnavailable: 1 PDB automatically adapts, while minAvailable: 2 needs to be updated.
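The maxUnavailable variant of the cluster-wide PDB from the earlier example (same hypothetical neo4j.name: my-cluster) would look like this:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: neo4j-cluster-pdb
  namespace: <namespace>
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      helm.neo4j.com/neo4j.name: "my-cluster"
```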