Configure and install Neo4j using Helm

This section describes how to configure Neo4j to run in a Kubernetes cluster using a customized Helm chart.

Helm differs from “package managers” such as apt, yum, and npm because, in addition to installing applications, it allows rich configuration of them. You express the custom configuration declaratively in a YAML-formatted file and pass that file to Helm during installation.

For more information, see the official Helm documentation.

1. Create a custom values.yaml file

  1. To see what options are configurable on the neo4j/neo4j-standalone chart, use helm show values:

    helm show values neo4j/neo4j-standalone
    # Default values for Neo4j.
    # This is a YAML-formatted file.
    
    neo4j:
      # Name of your cluster
      name: ""
    
      # If password is not set or empty a random password will be generated during installation
      password: ""
    
      # Neo4j Edition to use (community|enterprise)
      edition: "community"
      # set edition: "enterprise" to use Neo4j Enterprise Edition
      #
      # To use Neo4j Enterprise Edition you must have a Neo4j license agreement.
      #
      # More information is also available at: https://neo4j.com/licensing/
      # Email inquiries can be directed to: licensing@neo4j.com
      #
      # Set acceptLicenseAgreement: "yes" to confirm that you have a Neo4j license agreement.
      acceptLicenseAgreement: "no"
      #
      # set offlineMaintenanceModeEnabled: true to restart the StatefulSet without the Neo4j process running
      # this can be used to perform tasks that cannot be performed when Neo4j is running such as `neo4j-admin dump`
      offlineMaintenanceModeEnabled: false
      #
      # set resources for the Neo4j Container. The values set will be used for both "requests" and "limit".
      resources:
        cpu: "1000m"
        memory: "2Gi"
    
    # Volumes for Neo4j
    volumes:
      data:
        # REQUIRED: specify a volume mode to use for data
        # Valid values are share|selector|defaultStorageClass|volume|volumeClaimTemplate|dynamic
        # To get up-and-running quickly, for development or testing, use "defaultStorageClass" for a dynamically provisioned volume of the default storage class.
        mode: ""
    
        # Only used if mode is set to "selector"
        # Will attach to existing volumes that match the selector
        selector:
          storageClassName: "manual"
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 100Gi
          # A helm template to generate a label selector to match existing volumes n.b. both storageClassName and label selector must match existing volumes
          selectorTemplate:
            matchLabels:
              app: "{{ .Values.neo4j.name }}"
              helm.neo4j.com/volume-role: "data"
    
        # Only used if mode is set to "defaultStorageClass"
        # Dynamic provisioning using the default storageClass
        defaultStorageClass:
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 10Gi
    
        # Only used if mode is set to "dynamic"
        # Dynamic provisioning using the provided storageClass
        dynamic:
          storageClassName: "neo4j"
          accessModes:
            - ReadWriteOnce
          requests:
            storage: 100Gi
    
        # Only used if mode is set to "volume"
        # Provide an explicit volume to use
        volume:
          # If set an init container (running as root) will be added that runs:
          #   `chown -R <securityContext.fsUser>:<securityContext.fsGroup>` AND `chmod -R g+rwx`
          # on the volume. This is useful for some filesystems (e.g. NFS) where Kubernetes fsUser or fsGroup settings are not respected
          setOwnerAndGroupWritableFilePermissions: false
    
          # Example (using a specific Persistent Volume Claim)
          # persistentVolumeClaim:
          #   claimName: my-neo4j-pvc
    
        # Only used if mode is set to "volumeClaimTemplate"
        # Provide an explicit volumeClaimTemplate to use
        volumeClaimTemplate: {}
    
      # provide a volume to use for backups
      # n.b. backups will be written to /backups on the volume
      # any of the volume modes shown above for data can be used for backups
      backups:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for logs
      # n.b. logs will be written to /logs/$(POD_NAME) on the volume
      # any of the volume modes shown above for data can be used for logs
      logs:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for csv metrics (csv metrics are only available in Neo4j Enterprise Edition)
      # n.b. metrics will be written to /metrics/$(POD_NAME) on the volume
      # any of the volume modes shown above for data can be used for metrics
      metrics:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for import storage
      # n.b. import will be mounted to /import on the underlying volume
      # any of the volume modes shown above for data can be used for import
      import:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
      # provide a volume to use for licenses
      # n.b. licenses will be mounted to /licenses on the underlying volume
      # any of the volume modes shown above for data can be used for licenses
      licenses:
        mode: "share" # share an existing volume (e.g. the data volume)
        share:
          name: "data"
    
    # Services for Neo4j
    services:
      # A ClusterIP service with the same name as the Helm Release name should be used for Neo4j Driver connections originating inside the
      # Kubernetes cluster.
      default:
        # Annotations for the K8s Service object
        annotations: { }
    
      # A LoadBalancer Service for external Neo4j driver applications and Neo4j Browser
      neo4j:
        enabled: true
    
        # Annotations for the K8s Service object
        annotations: { }
    
        spec:
          # Type of service.
          type: LoadBalancer
    
          # in most cloud environments LoadBalancer type will receive an ephemeral public IP address automatically. If you need to specify a static ip here use:
          # loadBalancerIP: ...
    
        # ports to include in neo4j service
        ports:
          http:
            enabled: true #Set this to false to remove HTTP from this service (this does not affect whether http is enabled for the neo4j process)
          https:
            enabled: true #Set this to false to remove HTTPS from this service (this does not affect whether https is enabled for the neo4j process)
          bolt:
            enabled: true #Set this to false to remove BOLT from this service (this does not affect whether https is enabled for the neo4j process)
    
      # A service for admin/ops tasks including taking backups
      # This service is available even if the deployment is not "ready"
      admin:
        enabled: true
        # Annotations for the admin service
        annotations: { }
        spec:
          type: ClusterIP
        # n.b. there is no ports object for this service. Ports are autogenerated based on the neo4j configuration
    
      # A "headless" service for admin/ops and Neo4j cluster-internal communications
      # This service is available even if the deployment is not "ready"
      internals:
        enabled: false
        # Annotations for the internals service
        annotations: { }
        # n.b. there is no ports object for this service. Ports are autogenerated based on the neo4j configuration
    
    # Neo4j Configuration (yaml format)
    config:
      dbms.config.strict_validation: "true"
    
    # securityContext defines privilege and access control settings for a Pod or Container. Making sure that we dont run Neo4j as root user.
    securityContext:
      runAsNonRoot: true
      runAsUser: 7474
      runAsGroup: 7474
      fsGroup: 7474
      fsGroupChangePolicy: "Always"
    
    # Readiness probes are set to know when a container is ready to be used.
    # Because Neo4j uses Java these values are large to distinguish between long Garbage Collection pauses (which don't require a restart) and an actual failure.
    # These values should mark Neo4j as not ready after at most 5 minutes of problems (20 attempts * max 15 seconds between probes)
    readinessProbe:
      failureThreshold: 20
      timeoutSeconds: 10
      periodSeconds: 5
    
    # Liveness probes are set to know when to restart a container.
    # Because Neo4j uses Java these values are large to distinguish between long Garbage Collection pauses (which don't require a restart) and an actual failure.
    # These values should trigger a restart after at most 10 minutes of problems (40 attempts * max 15 seconds between probes)
    livenessProbe:
      failureThreshold: 40
      timeoutSeconds: 10
      periodSeconds: 5
    
    # Startup probes are used to know when a container application has started.
    # If such a probe is configured, it disables liveness and readiness checks until it succeeds
    # When restoring Neo4j from a backup it's important that startup probe gives time for Neo4j to recover and/or upgrade store files
    # When using Neo4j clusters it's important that startup probe give the Neo4j cluster time to form
    startupProbe:
      failureThreshold: 1000
      periodSeconds: 5
    
    # top level setting called ssl to match the "ssl" from "dbms.ssl.policy"
    ssl:
      # setting per "connector" matching neo4j config
      bolt:
        privateKey:
          secretName:  # we set up the template to grab `private.key` from this secret
          subPath:  # we specify the privateKey value name to get from the secret
        publicCertificate:
          secretName:  # we set up the template to grab `public.crt` from this secret
          subPath:  # we specify the publicCertificate value name to get from the secret
        trustedCerts:
          sources: [ ] # a sources array for a projected volume - this allows someone to (relatively) easily mount multiple public certs from multiple secrets for example.
        revokedCerts:
          sources: [ ]  # a sources array for a projected volume
      https:
        privateKey:
          secretName:
          subPath:
        publicCertificate:
          secretName:
          subPath:
        trustedCerts:
          sources: [ ]
        revokedCerts:
          sources: [ ]
    
    # Kubernetes cluster domain suffix
    clusterDomain: "cluster.local"
    
    # Override image settings in Neo4j pod
    image:
      imagePullPolicy: IfNotPresent
      # set a customImage if you want to use your own docker image
      # customImage: my-image:my-tag
    
    # additional environment variables for the Neo4j Container
    env: {}
    
    # Other K8s configuration to apply to the Neo4j pod
    podSpec:
      # Anti Affinity
      # If set to true then an anti-affinity rule is applied to prevent database pods with the same `neo4j.name` running on a single Kubernetes node.
      # If set to false then no anti-affinity rules are applied
      # If set to an object then that object is used for the Neo4j podAntiAffinity
      podAntiAffinity: true
    
      # Name of service account to use for the Neo4j Pod (optional)
      # this is useful if you want to use Workload Identity to grant permissions to access cloud resources e.g. cloud object storage (AWS S3 etc.)
      serviceAccountName: ""
    
      # How long the Neo4j pod is permitted to keep running after it has been signalled by Kubernetes to stop. Once this timeout elapses the Neo4j process is forcibly terminated.
      # A large value is used because Neo4j takes time to flush in-memory data to disk on shutdown.
      terminationGracePeriodSeconds: 3600
    
      # initContainers for the Neo4j pod
      initContainers: [ ]
    
      # additional runtime containers for the Neo4j pod
      containers: [ ]
    
    # print the neo4j user password set during install to the `helm install` log
    logInitialPassword: true
    
    # Jvm configuration for Neo4j
    jvm:
      # If true any additional arguments are added after the Neo4j default jvm arguments.
      # If false Neo4j default jvm arguments are not used.
      useNeo4jDefaultJvmArguments: true
      # additionalJvmArguments is a list of strings. Each jvm argument should be a separate element
      additionalJvmArguments: []
      # - "-XX:+HeapDumpOnOutOfMemoryError"
      # - "-XX:HeapDumpPath=/logs/neo4j.hprof"

    You can amend any of these settings in a values.yaml file. Passing that file during installation overrides the default Helm chart configuration of the Neo4j installation on Kubernetes and the configuration of the Neo4j database itself.

  2. Create the neo4j-values.yaml file with your preferred configuration. For example:

    # neo4j-values.yaml
    
    neo4j:
      password: "my-password"
      resources:
        cpu: "2"
        memory: "5Gi"
    
    volumes:
      data:
        mode: "defaultStorageClass"
    
    # Neo4j configuration (yaml format)
    config:
      dbms.default_database: "neo4j"
      dbms.config.strict_validation: "true"

  3. Pass the neo4j-values.yaml file during installation.

    helm install <release-name> neo4j/neo4j-standalone -f neo4j-values.yaml

    To see the values that have been set for a given release, use helm get values <release-name>.

    Some examples of possible K8s configurations
    • Configure (or disable completely) the Kubernetes LoadBalancer that exposes Neo4j outside the Kubernetes cluster by modifying the services.neo4j object in the values.yaml file.

    • Set the securityContext used by Neo4j Pods by modifying the securityContext object in the values.yaml file.

    • Configure manual persistent volume provisioning or set the StorageClass to be used as the Neo4j persistent storage.

    Some examples of possible Neo4j configurations
    • All Neo4j configuration (neo4j.conf) settings can be set directly on the config object in the values.yaml file.

    • Neo4j can be configured to use SSL certificates contained in Kubernetes Secrets by modifying the ssl object in the values file.
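
As a rough illustration of the Kubernetes-level options listed above, the following shell sketch writes a values file that turns off the LoadBalancer service and requests dynamically provisioned storage. The key names are taken from the helm show values output earlier in this section. The file name neo4j-k8s-values.yaml and the storage class name my-storage-class are placeholders, and this is only one possible combination, not a recommended setup.

```shell
# Write a values file that overrides only Kubernetes-related defaults.
# "neo4j-k8s-values.yaml" and "my-storage-class" are hypothetical names.
cat > neo4j-k8s-values.yaml <<'EOF'
volumes:
  data:
    # Dynamic provisioning using the named StorageClass
    mode: "dynamic"
    dynamic:
      storageClassName: "my-storage-class"

services:
  neo4j:
    # Do not create the LoadBalancer service for external access
    enabled: false

securityContext:
  runAsNonRoot: true
  runAsUser: 7474
  runAsGroup: 7474
  fsGroup: 7474
EOF

# Show the file, then pass it to helm install with -f, for example:
#   helm install <release-name> neo4j/neo4j-standalone -f neo4j-k8s-values.yaml
cat neo4j-k8s-values.yaml
```

Settings that are not listed in the file keep the chart defaults, so only the values that differ need to appear.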

2. Set Neo4j configuration

The Neo4j Helm chart does not use a neo4j.conf file. Instead, the Neo4j configuration is set in the Helm deployment’s values.yaml file under the config object.

The config object should contain a map from neo4j.conf setting names to string values. For example, this config object configures the Neo4j metrics:

config:
  metrics.enabled: "true"
  metrics.namespaces.enabled: "false"
  metrics.csv.interval: "10s"
  metrics.csv.rotation.keep_number: "2"
  metrics.csv.rotation.compression: "NONE"

All Neo4j config values must be YAML strings. It is important to put quotes around the values, such as "true", "false", and "2", so that they are handled correctly as strings.
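
One way to catch unquoted values before installing is a small pre-flight check. The sketch below is a hypothetical helper, not part of Helm or the Neo4j chart. It assumes the config object is a top-level key with one setting per line, and it only recognizes double-quoted values, as used in the examples in this section.

```shell
# check_config_quotes FILE
# Hypothetical helper: prints every entry under the top-level "config:" key
# whose value does not start with a double quote, and returns non-zero if any
# such entry is found.
check_config_quotes() {
  awk '
    /^config:/   { in_config = 1; next }  # start of the config block
    /^[^ \t#]/   { in_config = 0 }        # another top-level key ends it
    in_config && /^[ \t]+[^# \t][^:]*:[ \t]*[^" \t]/ {
      print; bad = 1                      # value is present but unquoted
    }
    END { exit (bad + 0) }
  ' "$1"
}

# Example: the second setting is an unquoted boolean and gets reported.
cat > sample-values.yaml <<'EOF'
config:
  metrics.enabled: "true"
  metrics.namespaces.enabled: false
EOF
check_config_quotes sample-values.yaml || echo "quote the values listed above"
```

A line-based check like this is deliberately simple. A YAML-aware tool would be more robust for files that use nested or multi-line values.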

All neo4j.conf settings are supported except for dbms.jvm.additional. Additional JVM settings can be set on the jvm object in the Helm deployment values.yaml file, as shown in the example:

# Jvm configuration for Neo4j
jvm:
  additionalJvmArguments:
  - "-XX:+HeapDumpOnOutOfMemoryError"
  - "-XX:HeapDumpPath=/logs/neo4j.hprof"

To find out more about configuring Neo4j and the neo4j.conf file, see Configuration and The neo4j.conf file.

3. Set an initial password

You can set an initial password for accessing Neo4j in the values.yaml file. If no initial password is set, the Neo4j Helm chart automatically generates one.

neo4j:
 # If not set or empty a random password will be generated
 password: ""

The password will be printed out in the Helm install output, unless --set logInitialPassword=false is used.

The initial Neo4j password is stored in a Kubernetes Secret. The password can be extracted from the Secret using this command:

kubectl get secret <release-name>-auth -oyaml | yq -r '.data.NEO4J_AUTH' | base64 -d

To change the initial password, follow the steps in Maintenance operations - Reset the Neo4j user password.

Once you change the password in Neo4j, the password stored in Kubernetes Secrets will still exist but will no longer be valid.
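
If yq is not available, the same lookup can be done with kubectl's built-in jsonpath output. The sketch below assumes that the decoded NEO4J_AUTH value has the form <user>/<password>, which is the convention of the Neo4j Docker image. Check this against your own deployment. The extract_password function is a hypothetical helper introduced for illustration.

```shell
# extract_password: reads a base64-encoded "<user>/<password>" string on stdin
# and prints only the password part. Hypothetical helper for illustration.
extract_password() {
  auth=$(base64 -d) || return 1
  # Strip everything up to and including the first "/"
  printf '%s\n' "${auth#*/}"
}

# Local demonstration with a made-up credential instead of a real Secret:
printf '%s' 'neo4j/my-password' | base64 | extract_password

# Against a cluster, the input would come from the Secret, for example:
#   kubectl get secret <release-name>-auth -o jsonpath='{.data.NEO4J_AUTH}' | extract_password
```

Only the first slash is treated as the separator, so a password that itself contains a slash is returned intact.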

4. Configure SSL

The Neo4j SSL Framework can be used with the Neo4j Helm charts. The SSL public certificates and private keys used by a Neo4j Helm deployment must be stored in Kubernetes Secrets.

To enable Neo4j SSL policies, configure the ssl.<policy name> object in the Neo4j Helm deployment’s values.yaml file to reference the Kubernetes Secrets containing the SSL certificates and keys to use. This example shows how to configure the bolt ssl policy:

ssl:
 bolt:
   privateKey:
     secretName: bolt-cert
     subPath: private.key
   publicCertificate:
     secretName: bolt-cert
     subPath: public.crt

SSL policy objects can be specified for bolt, https, fabric, and backup.

When a private key is specified in the values.yaml file, the corresponding Neo4j SSL policy is enabled automatically. To disable a policy, add dbms.ssl.policy.<policy name>.enabled: "false" to the config object.

Unencrypted http is not disabled automatically when https is enabled. If https is enabled, add dbms.connector.http.enabled: "false" to the config object to disable http.
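
The two notes above can be combined in one values file. The following shell sketch configures the https policy from a Secret and switches off unencrypted http. The Secret name https-cert and the file name neo4j-ssl-values.yaml are placeholders, and the Secret is assumed to contain the keys private.key and public.crt.

```shell
# "https-cert" and "neo4j-ssl-values.yaml" are placeholder names.
# The Secret itself would be created beforehand, for example:
#   kubectl create secret generic https-cert \
#     --from-file=private.key --from-file=public.crt
cat > neo4j-ssl-values.yaml <<'EOF'
ssl:
  https:
    privateKey:
      secretName: https-cert
      subPath: private.key
    publicCertificate:
      secretName: https-cert
      subPath: public.crt

config:
  # Unencrypted http stays on unless it is disabled explicitly
  dbms.connector.http.enabled: "false"
EOF
cat neo4j-ssl-values.yaml
```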

5. Configure resource allocation

The resources (CPU, memory) for the Neo4j container are configured by setting the neo4j.resources object in the values.yaml file. The values set in the resources object are used for both the resource request and the resource limit of the Neo4j container. For more information, see the Kubernetes container resources documentation.

neo4j:
  resources:
    cpu: "2"
    memory: "5Gi"

Next, configure Neo4j to make use of the memory provided to the container. In particular, ensure that dbms.memory.heap.max_size and dbms.memory.pagecache.size combined do not exceed the memory configured for the Neo4j container.
In Kubernetes, if the processes running in the Neo4j container exceed the configured memory limit, they are killed by the underlying operating system. To avoid this, a good heuristic is to allow an additional 1GB of memory headroom, so that heap + pagecache + 1GB does not exceed the available memory.

For example, a 5GB container could be configured like this:

neo4j:
  resources:
    cpu: "2"
    memory: "5Gi"

config:
  dbms.memory.heap.initial_size: "3G"
  dbms.memory.heap.max_size: "3G"
  dbms.memory.pagecache.size: "1G"

dbms.memory.pagecache.size and dbms.memory.heap.initial_size are not the only settings available in Neo4j to manage memory usage. For full details of how to configure memory usage in Neo4j, see Performance - Memory Configuration.
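
The headroom heuristic from this section can be checked with simple shell arithmetic. The sketch below uses a hypothetical helper, fits_memory_budget, that takes whole gigabytes. It treats a sum equal to the container memory as acceptable, which matches the 5GB example above (3 + 1 + 1 = 5).

```shell
# fits_memory_budget CONTAINER_GB HEAP_GB PAGECACHE_GB [HEADROOM_GB]
# Hypothetical helper: succeeds when heap + pagecache + headroom fits within
# the container memory. All arguments are whole gigabytes; headroom defaults to 1.
fits_memory_budget() {
  container=$1 heap=$2 pagecache=$3 headroom=${4:-1}
  [ $(( heap + pagecache + headroom )) -le "$container" ]
}

# The 5GB example: 3 (heap) + 1 (pagecache) + 1 (headroom) = 5
if fits_memory_budget 5 3 1; then
  echo "configuration fits"
else
  echo "reduce heap or pagecache, or raise neo4j.resources.memory"
fi
```

Other memory consumers in the container are not modelled here, so the result is a first approximation rather than a guarantee.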

6. Configure a service account

In some deployments, it may be desirable to assign a Kubernetes Service Account to the Neo4j pod, for example, when processes in the pod need to connect to services that require Service Account authorization. To configure the Neo4j pod to use a Kubernetes service account, set podSpec.serviceAccountName to the name of the service account to use.

For example:

# neo4j-values.yaml
neo4j:
  password: "my-password"

podSpec:
  serviceAccountName: "neo4j-service-account"

The service account must already exist; the Neo4j Helm chart will not create or configure the Service Account.

7. Configure a custom container image

The Helm chart uses the official Neo4j Docker image that matches the version of the Helm chart. To configure the Helm chart to use a different container image, set the image.customImage property in the values.yaml file.

This can be necessary when public container repositories are not accessible for security reasons. For example, this values.yaml file configures Neo4j to use my-container-repository.io as the container repository:

# neo4j-values.yaml
neo4j:
  password: "my-password"

image:
  customImage: "my-container-repository.io/neo4j:4.3-enterprise"