Configure a Neo4j Helm deployment

This section describes how to configure a Neo4j Helm deployment to run in a Kubernetes cluster.

Unlike package managers such as apt, yum, and npm, Helm not only installs applications but also allows rich configuration of them. Express the customized configuration declaratively in a YAML-formatted file and pass that file during installation.

For more information, see the official Helm documentation.

1. Create a custom values.yaml file

  1. Ensure your Neo4j Helm Chart repository is up to date and get the latest charts. For more information, see Configure the Neo4j Helm Chart repository.

  2. To see which options are configurable on the Neo4j Helm Chart that you want to deploy, run helm show values followed by the chart name, such as neo4j/neo4j-standalone, neo4j/neo4j-cluster-core, neo4j/neo4j-cluster-read-replica, neo4j/neo4j-cluster-headless-service, or neo4j/neo4j-cluster-loadbalancer. For example:

    helm show values neo4j/neo4j-standalone
    # Default values for Neo4j.
    # This is a YAML-formatted file.
    
    neo4j:
      # Name of your cluster
      name: ""
    
      # If the password is not set or empty a random password will be generated during installation
      password: ""
    
      # Neo4j Edition to use (community|enterprise)
      edition: "community"
      # set edition: "enterprise" to use Neo4j Enterprise Edition
      #
      # To use Neo4j Enterprise Edition you must have a Neo4j license agreement.
      #
      # More information is also available at: https://neo4j.com/licensing/
      # Email inquiries can be directed to: licensing@neo4j.com
      #
      # Set acceptLicenseAgreement: "yes" to confirm that you have a Neo4j license agreement.
      acceptLicenseAgreement: "no"
      #
      # set offlineMaintenanceModeEnabled: true to restart the StatefulSet without the Neo4j process running
      # this can be used to perform tasks that cannot be performed when Neo4j is running such as `neo4j-admin dump`
      offlineMaintenanceModeEnabled: false
      #
      # set resources for the Neo4j Container. The values set will be used for both "requests" and "limit".
      resources:
        cpu: "1000m"
        memory: "2Gi"
    
    # Volumes for Neo4j
    volumes:
      data:
        # REQUIRED: specify a volume mode to use for data
        # Valid values are share|selector|defaultStorageClass|volume|volumeClaimTemplate|dynamic
        # To get up and running quickly, for development or testing, use "defaultStorageClass" for a dynamically provisioned volume of the default storage class.
        mode: ""
    
        # Only used if the mode is set to "selector"
        # Will attach to existing volumes that match the selector
        selector:
          storageClassName: "manual"
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 100Gi
          # A helm template to generate a label selector to match existing volumes n.b. both storageClassName and label selector must match existing volumes
          selectorTemplate:
            matchLabels:
              app: "{{ .Values.neo4j.name }}"
              helm.neo4j.com/volume-role: "data"
    
        # Only used if mode is set to "defaultStorageClass"
        # Dynamic provisioning using the default storageClass
        defaultStorageClass:
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 10Gi
    
        # Only used if the mode is set to "dynamic"
        # Dynamic provisioning using the provided storageClass
        dynamic:
          storageClassName: "neo4j"
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 100Gi
    
        # Only used if mode is set to "volume"
        # Provide an explicit volume to use
        volume:
          # If set an init container (running as root) will be added that runs:
          #   `chown -R <securityContext.fsUser>:<securityContext.fsGroup>` AND `chmod -R g+rwx`
          # on the volume. This is useful for some file systems (e.g. NFS) where Kubernetes fsUser or fsGroup settings are not respected
          setOwnerAndGroupWritableFilePermissions: false
    
          # Example (using a specific Persistent Volume Claim)
          # persistentVolumeClaim:
          #   claimName: my-neo4j-pvc
    
        # Only used if mode is set to "volumeClaimTemplate"
        # Provide an explicit volumeClaimTemplate to use
        volumeClaimTemplate: {}
    
      # provide a volume to use for backups
      # n.b. backups will be written to /backups on the volume
      # any of the volume modes shown above for data can be used for backups
      backups:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for logs
      # n.b. logs will be written to /logs/$(POD_NAME) on the volume
      # any of the volume modes shown above for data can be used for logs
      logs:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for csv metrics (csv metrics are only available in Neo4j Enterprise Edition)
      # n.b. metrics will be written to /metrics/$(POD_NAME) on the volume
      # any of the volume modes shown above for data can be used for metrics
      metrics:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for import storage
      # n.b. import will be mounted to /import on the underlying volume
      # any of the volume modes shown above for data can be used for import
      import:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for licenses
      # n.b. licenses will be mounted to /licenses on the underlying volume
      # any of the volume modes shown above for data can be used for licenses
      licenses:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
    # Services for Neo4j
    services:
      # A ClusterIP service with the same name as the Helm Release name should be used for Neo4j Driver connections originating inside the
      # Kubernetes cluster.
      default:
        # Annotations for the K8s Service object
        annotations: { }
    
      # A LoadBalancer Service for external Neo4j driver applications and Neo4j Browser
      neo4j:
        enabled: true
    
        # Annotations for the K8s Service object
        annotations: { }
    
        spec:
          # Type of service.
          type: LoadBalancer
    
          # in most cloud environments LoadBalancer type will receive an ephemeral public IP address automatically. If you need to specify a static ip here use:
          # loadBalancerIP: ...
    
        # ports to include in neo4j service
        ports:
          http:
            enabled: true #Set this to false to remove HTTP from this service (this does not affect whether http is enabled for the neo4j process)
          https:
            enabled: true #Set this to false to remove HTTPS from this service (this does not affect whether https is enabled for the neo4j process)
          bolt:
            enabled: true #Set this to false to remove BOLT from this service (this does not affect whether https is enabled for the neo4j process)
    
      # A service for admin/ops tasks including taking backups
      # This service is available even if the deployment is not "ready"
      admin:
        enabled: true
        # Annotations for the admin service
        annotations: { }
        spec:
          type: ClusterIP
        # n.b. there is no ports object for this service. Ports are autogenerated based on the neo4j configuration
    
      # A "headless" service for admin/ops and Neo4j cluster-internal communications
      # This service is available even if the deployment is not "ready"
      internals:
        enabled: false
        # Annotations for the internals service
        annotations: { }
        # n.b. there is no ports object for this service. Ports are autogenerated based on the neo4j configuration
    
    # Neo4j Configuration (yaml format)
    config:
      dbms.config.strict_validation: "true"
    
    # securityContext defines privilege and access control settings for a Pod or Container. Making sure that you do not run Neo4j as root user.
    securityContext:
      runAsNonRoot: true
      runAsUser: 7474
      runAsGroup: 7474
      fsGroup: 7474
      fsGroupChangePolicy: "Always"
    
    # Readiness probes are set to know when a container is ready to be used.
    # Because Neo4j uses Java these values are large to distinguish between long Garbage Collection pauses (which don't require a restart) and an actual failure.
    # These values should mark Neo4j as not ready after at most 5 minutes of problems (20 attempts * max 15 seconds between probes)
    readinessProbe:
      failureThreshold: 20
      timeoutSeconds: 10
      periodSeconds: 5
    
    # Liveness probes are set to know when to restart a container.
    # Because Neo4j uses Java these values are large to distinguish between long Garbage Collection pauses (which don't require a restart) and an actual failure.
    # These values should trigger a restart after at most 10 minutes of problems (40 attempts * max 15 seconds between probes)
    livenessProbe:
      failureThreshold: 40
      timeoutSeconds: 10
      periodSeconds: 5
    
    # Startup probes are used to know when a container application has started.
    # If such a probe is configured, it disables liveness and readiness checks until it succeeds
    # When restoring Neo4j from a backup, it's important that the startup probe gives time for Neo4j to recover and/or upgrade store files
    # When using Neo4j clusters, it's important that the startup probe gives the Neo4j cluster time to form
    startupProbe:
      failureThreshold: 1000
      periodSeconds: 5
    
    # top level setting called ssl to match the "ssl" from "dbms.ssl.policy"
    ssl:
      # setting per "connector" matching neo4j config
      bolt:
        privateKey:
          secretName:  # we set up the template to grab `private.key` from this secret
          subPath:  # we specify the privateKey value name to get from the secret
        publicCertificate:
          secretName:  # we set up the template to grab `public.crt` from this secret
          subPath:  # we specify the publicCertificate value name to get from the secret
        trustedCerts:
          sources: [ ] # a sources array for a projected volume - this allows someone to (relatively) easily mount multiple public certs from multiple secrets for example.
        revokedCerts:
          sources: [ ]  # a sources array for a projected volume
      https:
        privateKey:
          secretName:
          subPath:
        publicCertificate:
          secretName:
          subPath:
        trustedCerts:
          sources: [ ]
        revokedCerts:
          sources: [ ]
    
    # Kubernetes cluster domain suffix
    clusterDomain: "cluster.local"
    
    # Override image settings in Neo4j pod
    image:
      imagePullPolicy: IfNotPresent
      # set a customImage if you want to use your own docker image
      # customImage: my-image:my-tag
    
    # additional environment variables for the Neo4j Container
    env: {}
    
    # Other K8s configuration to apply to the Neo4j pod
    podSpec:
      # Anti Affinity
      # If set to true then an anti-affinity rule is applied to prevent database pods with the same `neo4j.name` running on a single Kubernetes node.
      # If set to false then no anti-affinity rules are applied
      # If set to an object then that object is used for the Neo4j podAntiAffinity
      podAntiAffinity: true
    
      # Name of service account to use for the Neo4j Pod (optional)
      # this is useful if you want to use Workload Identity to grant permissions to access cloud resources e.g. cloud object storage (AWS S3 etc.)
      serviceAccountName: ""
    
      # How long the Neo4j pod is permitted to keep running after it has been signaled by Kubernetes to stop. Once this timeout elapses the Neo4j process is forcibly terminated.
      # A large value is used because Neo4j takes time to flush in-memory data to disk on shutdown.
      terminationGracePeriodSeconds: 3600
    
      # initContainers for the Neo4j pod
      initContainers: [ ]
    
      # additional runtime containers for the Neo4j pod
      containers: [ ]
    
    # print the neo4j user password set during install to the `helm install` log
    logInitialPassword: true
    
    # Jvm configuration for Neo4j
    jvm:
      # If true any additional arguments are added after the Neo4j default jvm arguments.
      # If false Neo4j default jvm arguments are not used.
      useNeo4jDefaultJvmArguments: true
      # additionalJvmArguments is a list of strings. Each jvm argument should be a separate element
      additionalJvmArguments: []
      # - "-XX:+HeapDumpOnOutOfMemoryError"
      # - "-XX:HeapDumpPath=/logs/neo4j.hprof"

    You can amend any of these settings in a custom values file. Passing that file during installation overrides the default Helm Chart configuration, both for the Neo4j installation on Kubernetes and for the Neo4j database itself.

  3. Create the neo4j-values.yaml file with your preferred configuration. For example:

    # neo4j-values.yaml
    
    neo4j:
      password: "my-password"
      resources:
        cpu: "2"
        memory: "5Gi"
    
    volumes:
      data:
        mode: "defaultStorageClass"
    
    # Neo4j configuration (yaml format)
    config:
      dbms.default_database: "neo4j"
      dbms.config.strict_validation: "true"

  4. Pass the neo4j-values.yaml file during installation:

    helm install <release-name> neo4j/neo4j-standalone -f neo4j-values.yaml

    To see the values that have been set for a given release, use helm get values <release-name>.

    Some examples of possible Kubernetes configurations:

    • Configure (or disable completely) the Kubernetes LoadBalancer that exposes Neo4j outside the Kubernetes cluster by modifying the services.neo4j object in the values.yaml file.

    • Set the securityContext used by Neo4j Pods by modifying the securityContext object in the values.yaml file.

    • Configure manual persistent volume provisioning or set the StorageClass to be used as the Neo4j persistent storage by modifying the volumes object in the values.yaml file.

    Some examples of possible Neo4j configurations:

    • All Neo4j configuration (neo4j.conf) settings can be set directly on the config object in the values.yaml file.

    • Neo4j can be configured to use SSL certificates contained in Kubernetes Secrets by modifying the ssl object in the values.yaml file.
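
As a rough sketch of how these options can sit together in one values file, the fragment below uses only keys that appear in the default values listed in step 2. The storage class name and volume size are placeholders copied from those defaults, and the behavior of each key is assumed to match the chart version you deploy.

```yaml
# neo4j-values.yaml (illustrative sketch, not a recommended production setup)

# Kubernetes: do not create the LoadBalancer service that exposes Neo4j externally
services:
  neo4j:
    enabled: false

# Kubernetes: run Neo4j as a non-root user (these values match the chart defaults)
securityContext:
  runAsNonRoot: true
  runAsUser: 7474
  runAsGroup: 7474
  fsGroup: 7474

# Kubernetes: provision the data volume dynamically from a named StorageClass
volumes:
  data:
    mode: "dynamic"
    dynamic:
      storageClassName: "neo4j"   # placeholder; assumed to exist in your cluster
      accessModes:
        - ReadWriteOnce
      requests:
        storage: 100Gi

# Neo4j: neo4j.conf settings go on the config object, as quoted strings
config:
  dbms.default_database: "neo4j"
```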

2. Set Neo4j configuration

The Neo4j Helm Chart does not use a neo4j.conf file. Instead, the Neo4j configuration is set in the Helm deployment’s values.yaml file under the config object.

The config object should contain a map of neo4j.conf setting names to string values. For example, this config object configures the Neo4j metrics:

# Neo4j configuration (yaml format)
config:
  metrics.enabled: "true"
  metrics.namespaces.enabled: "false"
  metrics.csv.interval: "10s"
  metrics.csv.rotation.keep_number: "2"
  metrics.csv.rotation.compression: "NONE"

All Neo4j config values must be YAML strings. It is important to put quotes around the values, such as "true", "false", and "2", so that they are handled correctly as strings.
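
To illustrate the quoting rule, the fragment below contrasts quoted and unquoted values. The comments describe standard YAML parsing only; how the chart reacts to a non-string value is not covered in this section, so that part is left open.

```yaml
config:
  # Quoted values are parsed as YAML strings, which is what the chart expects
  metrics.enabled: "true"
  metrics.csv.rotation.keep_number: "2"

  # Without quotes, a YAML parser reads these as a boolean and an integer:
  # metrics.enabled: true
  # metrics.csv.rotation.keep_number: 2
```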

All neo4j.conf settings are supported except for dbms.jvm.additional. Additional JVM settings can be set on the jvm object in the Helm deployment values.yaml file, as shown in the example:

# Jvm configuration for Neo4j
jvm:
  additionalJvmArguments:
  - "-XX:+HeapDumpOnOutOfMemoryError"
  - "-XX:HeapDumpPath=/logs/neo4j.hprof"

To find out more about configuring Neo4j and the neo4j.conf file, see Configuration and The neo4j.conf file.

3. Set an initial password

You can set an initial password for accessing Neo4j in the values.yaml file. If no initial password is set, the Neo4j Helm Chart automatically generates one. In cluster deployments, the same password must be set on all cluster members.

neo4j:
  # If not set or empty a random password will be generated
  password: ""

The password will be printed out in the Helm install output, unless --set logInitialPassword=false is used.
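
The same settings can also be kept in the values file instead of being passed with --set. This is a minimal sketch: it assumes that logInitialPassword is a top-level key, as in the default values listed in section 1, and my-password is a placeholder.

```yaml
# neo4j-values.yaml
neo4j:
  password: "my-password"   # placeholder; use the same value on all cluster members

# Do not print the initial password in the helm install output
logInitialPassword: false
```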

The initial Neo4j password is stored in a Kubernetes Secret. The password can be extracted from the Secret using this command:

kubectl get secret <release-name>-auth -oyaml | yq -r '.data.NEO4J_AUTH' | base64 -d

To change the initial password, follow the steps in Operations - Reset the Neo4j user password.

Once you change the password in Neo4j, the password stored in Kubernetes Secrets will still exist but will no longer be valid.

4. Configure SSL

The Neo4j SSL Framework can be used with Neo4j Helm Charts. The SSL public certificates and private keys used by a Neo4j Helm deployment must be stored in Kubernetes Secrets.

To enable Neo4j SSL policies, configure the ssl.<policy name> object in the Neo4j Helm deployment’s values.yaml file to reference the Kubernetes Secrets containing the SSL certificates and keys to use. This example shows how to configure the bolt SSL policy:

ssl:
  bolt:
    privateKey:
      secretName: bolt-cert
      subPath: private.key
    publicCertificate:
      secretName: bolt-cert
      subPath: public.crt

SSL policy objects can be specified for bolt, https, fabric, and backup.

When a private key is specified in the values.yaml file, the corresponding Neo4j SSL policy is enabled automatically. To disable a policy, add dbms.ssl.policy.<policy name>.enabled: "false" to the config object.

Unencrypted http is not disabled automatically when https is enabled. If https is enabled, add dbms.connector.http.enabled: "false" to the config object to disable http.
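
As a sketch of the https case, the following fragment enables the https SSL policy and turns off the unencrypted http connector. The Secret name https-cert is hypothetical, and the subPath values are assumed to be the key names used inside that Secret.

```yaml
ssl:
  https:
    privateKey:
      secretName: https-cert   # hypothetical Secret name
      subPath: private.key
    publicCertificate:
      secretName: https-cert
      subPath: public.crt

config:
  # http is not disabled automatically when https is enabled
  dbms.connector.http.enabled: "false"
```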

5. Configure resource allocation

CPU and memory

The resources (CPU, memory) for the Neo4j container are configured by setting the neo4j.resources object in the values.yaml file. Resource requests specify how much CPU and memory the Neo4j container needs. Resource limits cap how much of these resources the container can use if it tries to consume more than it requested.

neo4j:
  resources:
    requests:
      cpu: "1000m"
      memory: "2Gi"
    limits:
      cpu: "2000m"
      memory: "4Gi"

If no resource requests and resource limits are specified, the values set in the resources object are used for both the Neo4j container’s resource requests and resource limits.

neo4j:
  resources:
    cpu: "2"
    memory: "5Gi"

The minimum for a Neo4j instance is 0.5 CPU and 2GB memory.
If invalid values or values below the minimum are provided, Helm fails with an error, for example:

Error: template: neo4j-standalone/templates/_helpers.tpl:157:11: executing "neo4j.resources.evaluateCPU" at <fail (printf "Provided cpu value %s is less than minimum. \n %s" (.Values.neo4j.resources.cpu) (include "neo4j.resources.invalidCPUMessage" .))>: error calling fail: Provided cpu value 0.25 is less than minimum.
 cpu value cannot be less than 0.5 or 500m

JVM heap and page cache

To make Neo4j use the memory provided to the container, set the parameters dbms.memory.heap.max_size and dbms.memory.pagecache.size. Combined, they must not exceed the memory configured for the Neo4j container.
In Kubernetes, processes in the Neo4j container that exceed the configured memory limit are killed by the underlying operating system. Therefore, it is recommended to leave about 1GB of memory headroom, so that heap + page cache + 1GB does not exceed the available memory.

For example, a 5GB container could be configured like this:

neo4j:
  resources:
    cpu: "2"
    memory: "5Gi"

# Neo4j configuration (yaml format)
config:
  dbms.memory.heap.initial_size: "3G"
  dbms.memory.heap.max_size: "3G"
  dbms.memory.pagecache.size: "1G"

The settings shown above (dbms.memory.heap.initial_size, dbms.memory.heap.max_size, and dbms.memory.pagecache.size) are not the only ones available in Neo4j to manage memory usage. For full details of how to configure memory usage in Neo4j, see Performance - Memory Configuration.

6. Configure a service account

In some deployment situations, it may be desirable to assign a Kubernetes Service Account to the Neo4j pod, for example, when processes in the pod need to connect to services that require Service Account authorization. To configure the Neo4j pod to use a Kubernetes service account, set podSpec.serviceAccountName to the name of the service account to use.

For example:

# neo4j-values.yaml
neo4j:
  password: "my-password"

podSpec:
  serviceAccountName: "neo4j-service-account"

The service account must already exist. The Neo4j Helm Charts do not create or configure Service Accounts.

7. Configure a custom container image

The Helm Chart uses the official Neo4j Docker image that matches the version of the Helm Chart. To use a different container image, set the image.customImage property in the values.yaml file.

This can be necessary when public container repositories are not accessible for security reasons. For example, this values.yaml file configures Neo4j to use my-container-repository.io as the container repository:

# neo4j-values.yaml
neo4j:
  password: "my-password"

image:
  customImage: "my-container-repository.io/neo4j:4.4-enterprise"

8. Configure and install APOC core only

APOC core is shipped with Neo4j, but it is not installed in the Neo4j plugins directory. If APOC core is the only plugin that you want to add to Neo4j, it is not necessary to perform plugin installation as described in Install Plugins. Instead, you can configure the Helm deployment to use APOC core by upgrading the deployment with these additional settings in the values.yaml file:

  1. Configure APOC core:

    config:
      dbms.directories.plugins: "/var/lib/neo4j/labs"
      dbms.security.procedures.unrestricted: "apoc.*"

  2. Run helm upgrade to apply the changes:

    helm upgrade <release-name> neo4j/neo4j-standalone -f values.yaml

  3. After the Helm upgrade rollout is complete, check APOC core by running the following Cypher query using cypher-shell or Neo4j Browser:

    RETURN apoc.version()

9. Install Plugins

There are two recommended methods for adding Neo4j plugins to Neo4j Helm Chart deployments: you can use a custom container image or a plugins volume.

9.1. Add plugins using a custom container image

The best method for adding plugins to Neo4j running in Kubernetes is to create a new Docker container image that contains both Neo4j and the Neo4j plugins. This way, you can ensure when building the container that the correct plugin version for the Neo4j version of the container is used, and the resulting image encapsulates all Neo4j runtime dependencies.

The recommended method to use with the Neo4j Helm Chart is to build a Docker container image that is based on the official Neo4j Docker image and does not override the official image’s ENTRYPOINT and CMD, as shown in this example Dockerfile:

ARG NEO4J_VERSION
FROM neo4j:${NEO4J_VERSION}

# copy my-plugins into the Docker image
COPY my-plugins/ /var/lib/neo4j/plugins

# install the apoc core plugin that is shipped with Neo4j
RUN cp /var/lib/neo4j/labs/apoc-* /var/lib/neo4j/plugins

Build the Docker image and push it to a container repository that is accessible to your Kubernetes cluster:

CONTAINER_REPOSITORY="my-container-repository.io"
IMAGE_NAME="my-neo4j"

# export this so that it's accessible as a docker build arg
export NEO4J_VERSION=4.4.6-enterprise

docker build --build-arg NEO4J_VERSION --tag ${CONTAINER_REPOSITORY}/${IMAGE_NAME}:${NEO4J_VERSION} .
docker push ${CONTAINER_REPOSITORY}/${IMAGE_NAME}:${NEO4J_VERSION}

To use the image that you have created, set image.customImage in the Neo4j Helm deployment’s values.yaml file. For more details, see Configure a custom container image.

Many plugins require additional Neo4j configuration to work correctly. Plugin configuration should be set on the config object in the Helm deployment’s values.yaml file. In some cases, plugin configuration can cause Neo4j’s strict config validation to fail. Strict config validation can be disabled by setting dbms.config.strict_validation: "false".
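
A values file for such a plugin image might look like the sketch below. The image reference reuses the placeholder names from the build commands above, and disabling strict validation is shown only as the fallback mentioned in the text, not as a general recommendation.

```yaml
# neo4j-values.yaml
image:
  customImage: "my-container-repository.io/my-neo4j:4.4.6-enterprise"

config:
  # Allow procedures from the bundled APOC core plugin (same setting as in section 8)
  dbms.security.procedures.unrestricted: "apoc.*"
  # Only needed if plugin settings cause strict config validation to fail
  dbms.config.strict_validation: "false"
```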

9.2. Add plugins using a plugins volume

An alternative method for adding Neo4j plugins to a Neo4j Helm deployment uses a plugins volume mount. With this method, the plugin jar files are stored on a Persistent Volume that is mounted to the /plugins directory of the Neo4j container.

The simplest way to set up a persistent plugins volume is to share the Persistent Volume that is used for storing Neo4j data. This example shows how to configure that in the Neo4j Helm deployment values.yaml file:

# neo4j-values.yaml
volumes:
  data:
    # your data volume configuration
    ...

  plugins:
    mode: "share"
    share:
      name: "data"

Details of different ways to configure volume mounts are covered in Mapping volume mounts to persistent volumes.

The Neo4j container now has an empty /plugins directory backed by a persistent volume. Plugin jar files can be copied onto the volume using kubectl cp. Because it is backed by a persistent volume, plugin files will persist even if the Neo4j pod is restarted or moved.

Neo4j only loads plugins on startup. Therefore, you have to restart the Neo4j pod to load them once all plugins are in place. For example:

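# Copy plugin files into the Neo4j container
# (kubectl cp accepts a single source per call, so copy the files one at a time)
for plugin in my-plugins/*; do
  kubectl cp "${plugin}" <namespace>/<neo4j-pod-name>:/plugins/
done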

# Restart Neo4j
kubectl rollout restart statefulset/<neo4j-statefulset-name>

# Verify plugins are still present after restart
kubectl exec <neo4j-pod-name> -- ls /plugins