Volume mounts and persistent volumes with the Neo4j Helm Charts
This section describes the volume mounts created by the Neo4j Helm Charts and the PersistentVolume types that can be used.
1. Volume mounts
A volume mount is part of a Kubernetes Pod spec that describes how and where a volume is mounted within a container.
The Neo4j Helm Chart creates the following volume mounts:
- backups mounted at /backups
- data mounted at /data
- import mounted at /import
- licenses mounted at /licenses
- logs mounted at /logs
- metrics mounted at /metrics (Neo4j Community Edition does not generate metrics)
It is also possible to specify a plugins volume mount (mounted at /plugins), but this is not created by the default Helm Charts.
For more information, see Add plugins using a plugins volume.
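As a sketch, a plugins volume could be declared in the chart values using the same volumes.<volume name> syntax described in section 3; the mode chosen below is an illustrative assumption, not a chart default:

```yaml
# Hypothetical sketch: back the optional plugins volume mount by
# sharing the data volume, using the volumes.<volume name> syntax
# described in section 3. This is not part of the chart's defaults.
volumes:
  plugins:
    mode: "share"
    share:
      name: "data"
```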
2. Persistent volumes
A PersistentVolume (PV) is a storage resource in the Kubernetes cluster that has a lifecycle independent of any individual pod that uses the PV.
A PersistentVolumeClaim (PVC) is a request for a storage resource by a user.
PVCs consume PV resources.
For more information about what PVs are and how they work, see the Kubernetes official documentation.
The type of PV used and its configuration can have a significant effect on the performance of Neo4j. Some PV types are not suitable for use with Neo4j at all.
The volume type used for the data volume mount is particularly important.
Neo4j supports the following PV types for the data volume mount:
- awsElasticBlockStore
- azureDisk
- gcePersistentDisk
- hostPath (when using Docker Desktop)
Neo4j data volume mounts do not support:
- azureFile
- nfs
For volume mounts other than the data volume mount, generally, all PV types are presumed to work.
Using an HDD or cloud object storage, such as AWS S3 mounted as a drive, is also not recommended.
3. Mapping volume mounts to persistent volumes
By default, the Neo4j Helm Chart uses a single PV, named data, to support all of the chart's volume mounts.
The volume used for each volume mount can be changed by modifying the volumes.<volume name> object in the Helm Chart values.
The Neo4j Helm Chart volumes object supports different modes:
3.1. mode: share
- Description: The volume mount shares the underlying volume from one of the other volume objects.
- Example: The logs volume mount uses the data volume (this is the default behavior):

volumes:
  logs:
    mode: "share"
    share:
      name: "data"
3.2. mode: defaultStorageClass
- Description: The volume mount is backed by a PV that Kubernetes dynamically provisions using the default StorageClass.
- Example: A dynamically provisioned data volume with a size of 10Gi:

volumes:
  data:
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 10Gi
3.3. mode: dynamic
- Description: The volume mount is backed by a PV that Kubernetes dynamically provisions using the specified StorageClass.
- Example: A dynamically provisioned import volume with a size of 1Ti using the neo4j storage class:

volumes:
  import:
    mode: dynamic
    dynamic:
      storageClassName: "neo4j"
      requests:
        storage: 1Ti
3.4. mode: volume
- Description: A complete Kubernetes volume object can be specified for the volume mount. Generally, volumes specified in this way have to be manually provisioned. The volume can be any valid Kubernetes volume type. This mode can be used in a variety of ways:
  - Attach an existing PersistentVolume by name.
  - Attach cloud disks/volumes, e.g., gcePersistentDisk, azureDisk, or awsElasticBlockStore, without creating Kubernetes PersistentVolumes.
  - Attach the contents of a ConfigMap or Secret (as a read-only volume).
  For details of how to specify volume objects, see the Kubernetes documentation.
- Example - mount an AWS EBS volume: The data volume mount, backed by the specified EBS volume. When this method is used, the EBS volume must already exist:

volumes:
  data:
    mode: volume
    volume:
      awsElasticBlockStore:
        volumeID: "vol-0795be227aff63b2a"
        fsType: ext4

- Set file permissions on mounted volumes: The Neo4j Helm chart supports an additional field not present in normal Kubernetes volume objects: setOwnerAndGroupWritableFilePermissions: true|false. If set to true, an initContainer is run to modify the file permissions of the mounted volume so that the contents can be written and read by the Neo4j process. This helps with certain volume implementations that are not aware of the SecurityContext set on the pods using them.
- Example - reference an existing PersistentVolume: The backups volume mount, backed by the specified PVC. When this method is used, the persistentVolumeClaim object must already exist:

volumes:
  backups:
    mode: volume
    volume:
      setOwnerAndGroupWritableFilePermissions: true
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
3.5. mode: selector
- Description: The volume to use is chosen from the existing PVs based on the provided selector object and a dynamically generated PVC. If no matching PVs exist, the Neo4j pod is unable to start. To match, a PV must have the specified StorageClass, match the selectorTemplate label selector, and have sufficient storage capacity to meet the requested storage amount.
- Example: The data volume, chosen from the available volumes with the neo4j storage class and the label developer: alice:

volumes:
  data:
    mode: selector
    selector:
      storageClassName: "neo4j"
      requests:
        storage: 128Gi
      selectorTemplate:
        matchLabels:
          developer: "alice"
3.6. mode: volumeClaimTemplate
- Description: A complete Kubernetes volumeClaimTemplate object is specified for the volume mount. Generally, volumes specified in this way are dynamically provisioned. For details of how to specify volumeClaimTemplate objects, see the Kubernetes documentation.
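Unlike the other modes, no example is given here, so the following is only a minimal sketch of what such a template might look like; the storage class name, access mode, and size are illustrative assumptions, not values from this document:

```yaml
# Hypothetical sketch of mode: volumeClaimTemplate.
# The storage class name and size below are assumptions for illustration;
# consult the Kubernetes PersistentVolumeClaim spec for the full field list.
volumes:
  data:
    mode: volumeClaimTemplate
    volumeClaimTemplate:
      accessModes:
        - ReadWriteOnce
      storageClassName: "neo4j"   # assumed storage class name
      resources:
        requests:
          storage: 100Gi
```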
4. Provision persistent volumes with Neo4j Helm Chart
With the Neo4j Helm Charts, you can provision a PV manually or dynamically, using the default or a custom StorageClass:
- Manual provisioning of persistent volumes (recommended default). The PV must be labeled with an app label that matches the release name of your Neo4j Helm deployment.
- Dynamic provisioning using the default StorageClass. Recommended only for small-scale development work.
- Dynamic provisioning using a dedicated StorageClass.
4.1. Provision persistent volumes manually
You provision a PV for Neo4j to use by explicitly creating it (for example, using kubectl create -f persistentVolume.yaml) before installing the Neo4j Helm release.
If no suitable PV exists, the Neo4j pod will not start.
- Why prefer manual provisioning?
  - Manual provisioning provides the strongest protection against the automatic removal of volumes containing critical data.
  - The performance of Neo4j is very dependent on the latency, IOPS capacity, and throughput of the storage it is using. Manual provisioning is the best way to ensure the underlying storage is configured for Neo4j performance.
  - Explicitly configuring the underlying storage before installing Neo4j is worthwhile because changing the underlying storage after installation, while preserving the data stored in Neo4j, is difficult and may cause significant Neo4j downtime.
4.1.1. Link a Neo4j Helm release to the manually provisioned volumes
A Neo4j Helm release uses only manually provisioned PVs that have:
- storageClassName set to manual.
- An app label, set in their metadata, that matches the name of the Neo4j Helm release.
- Sufficient storage capacity: the PV capacity must be greater than or equal to the value of volumes.data.selector.requests.storage set for the Neo4j Helm release (the default is 100Gi).
For example, if the release name is my-neo4j-release and the requested storage is 100Gi, then the PV object must have the storageClassName, app label, and capacity shown in this example:
apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    app: "my-neo4j-release"
spec:
  capacity:
    storage: 100Gi
  storageClassName: "manual"
Then, you install the Neo4j release using the same name:
helm install "my-neo4j-release" neo4j/neo4j-standalone
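To confirm that the label, storage class, and capacity line up before installing, standard kubectl queries can be used; the PV name placeholder below is, of course, yours to fill in:

```shell
# List manually provisioned PVs labeled for this release.
kubectl get pv -l app=my-neo4j-release

# Inspect the storage class and capacity of a specific PV.
kubectl get pv <pv-name> -o jsonpath='{.spec.storageClassName} {.spec.capacity.storage}'
```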
4.1.2. Configure the Neo4j Helm release for manual provisioning
The Neo4j Helm chart uses manual provisioning by default, so it is unnecessary to set any chart values explicitly. The following default values are used for manual provisioning:

volumes:
  data:
    mode: "selector"
    selector:
      storageClassName: "manual"
      requests:
        storage: 100Gi

With this method, a PVC is dynamically generated for the manually provisioned PV.
An alternative method for manual provisioning is to use a manually provisioned PVC.
This is supported by the Neo4j Helm Chart using the volume mode.
For example, to use a pre-existing PVC called my-neo4j-pvc, set these values:

volumes:
  data:
    mode: "volume"
    volume:
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
4.1.3. Configure manual provisioning of persistent volumes
The instructions for manually provisioning PVs vary according to the type of PV being used and the underlying infrastructure. In general, there are two steps:
1. Create the disk/volume to be used for storage in the underlying infrastructure. For example:
   - If using a gcePersistentDisk volume, create the Persistent Disk in Google Compute Engine.
   - If using a hostPath volume, create the path (directory) on the host node.
2. Create a PV in Kubernetes that references the underlying resource created in step 1.
   - Ensure that the created PV's app label matches the name of the Neo4j Helm release.
   - Ensure that the created PV's capacity.storage matches the storage available on the underlying infrastructure.
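As a sketch of the two steps for a gcePersistentDisk volume, where the disk name, size, and zone are illustrative assumptions:

```shell
# Step 1: create the disk in Google Compute Engine.
# The disk name, size, and zone below are hypothetical values.
gcloud compute disks create neo4j-data-disk --size=100GB --zone=us-central1-a

# Step 2: create a PV in Kubernetes that references the disk.
# The manifest must carry the app label and storageClassName "manual"
# described in section 4.1.1.
kubectl create -f persistentVolume.yaml
```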
The performance of Neo4j is very dependent on the latency, IOPS capacity, and throughput of the storage it is using. For the best performance of Neo4j, use the best available disks (e.g., SSD) and set IOPS throttling/quotas to high values. For some cloud providers, IOPS throttling is proportional to the size of the volume. In these cases, the best performance is achieved by setting the size of the volume based on the desired IOPS rather than the amount required for data storage.
4.1.4. Provision a persistent volume
Platform-specific instructions for provisioning PVs can be found in the Create a persistent volume section.
4.1.5. Reuse a persistent volume
After uninstalling the Neo4j Helm Chart, both the PVC and the PV remain and can be reused by a new install of the Helm chart.
If you delete the PVC, the PV moves into a Released status and is not reusable.
To reuse the PV in a new install of the Neo4j Helm Chart, remove its connection to the previous PVC:
1. Edit the PV by running the following command:
   kubectl edit pv <pv-name>
2. Remove the spec.claimRef section.
The PV returns to the Available status and can be reused by a new install of the Neo4j Helm Chart.
4.2. Provision persistent volumes dynamically
When using dynamic provisioning, the Neo4j release depends on Kubernetes to create a PV on demand when Neo4j is installed.
For more information on dynamic provisioning, see the Kubernetes official documentation.
- Why use dynamic provisioning?
  - Dynamic provisioning of PVs for Neo4j is a good choice for development and test environments, where ease of installation is more important than flexibility in managing the underlying storage and preserving the stored data in all situations. With dynamic provisioning, a Neo4j Helm release uses either a specific Kubernetes StorageClass or the default StorageClass of the running Kubernetes cluster.
  - Using the default StorageClass is the quickest way to spin up and run Neo4j for simple tests handling small amounts of data. However, it is not recommended for large amounts of data, as it may lead to performance issues.
  - It is recommended to create a dedicated StorageClass for Neo4j so that the underlying storage configuration can match Neo4j usage as closely as possible.
The volumes object in the Neo4j values.yaml file is used to configure dynamic provisioning.
4.2.1. Use the default StorageClass to dynamically provision persistent volumes
To use the default StorageClass and a storage size of 100Gi, set the following values:
volumes:
  data:
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 100Gi
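To see which StorageClass the cluster treats as the default (annotated "(default)" in the output), a standard kubectl query can be used:

```shell
# List storage classes; the default one is marked "(default)" next to its name.
kubectl get storageclass
```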
4.2.2. Use a dedicated StorageClass to dynamically provision persistent volumes
To use a dedicated StorageClass, you define it in a YAML file and create it using kubectl create.
The permitted specification values depend on the provisioner being used.
Full details of the StorageClass specification are covered in the Kubernetes official documentation.
For example, to use a StorageClass called neo4j-storage for the import volume with a storage size of 1Ti, set the following values:
volumes:
  import:
    mode: dynamic
    dynamic:
      storageClassName: "neo4j-storage"
      requests:
        storage: 1Ti
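For the values above to bind, the StorageClass itself must already exist. The following is a hedged sketch of such a definition for Google Cloud; the provisioner and parameters are illustrative assumptions and differ between platforms:

```yaml
# Hypothetical StorageClass definition for illustration only.
# The provisioner and parameters shown are for Google Cloud's
# Persistent Disk CSI driver; other platforms use different values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: neo4j-storage
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
```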