Importing data into Neo4j on Kubernetes
There are many ways to import data from files into Neo4j. This page describes the most common ways to import data into a Neo4j instance running on a Kubernetes cluster.
The Neo4j Helm chart configures a volume mount at /import as the Neo4j import directory, as described in Default file locations. You place all the files that you want to import in this volume.
To import data from CSV files into Neo4j, use the neo4j-admin database import command or the LOAD CSV Cypher statement:
The neo4j-admin database import command performs batch imports of large amounts of data into a previously unused database. It can only be run once per database.
The LOAD CSV Cypher statement imports small to medium-sized CSV files into an existing database. LOAD CSV can be run as many times as needed and does not require an empty database. For a simple example, see Getting Started Guide → Import data.
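As a rough illustration of the LOAD CSV route, the sketch below wraps a cypher-shell call in kubectl exec. The function name load_people_csv, the NEO4J_PASSWORD variable, the Person label, and the name column are assumptions made for this example and do not come from this page. The file:/// URL is resolved relative to the Neo4j import directory.

```shell
# Sketch only: run a LOAD CSV statement inside the Neo4j pod via cypher-shell.
# Assumptions: user "neo4j", password in $NEO4J_PASSWORD, and a CSV file
# with a header row that contains a "name" column.
load_people_csv() {
  pod="$1"        # pod name, e.g. my-graph-db-0
  csv_path="$2"   # path relative to /import, e.g. files-1/people.csv
  kubectl exec "$pod" -- cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
    "LOAD CSV WITH HEADERS FROM 'file:///$csv_path' AS row MERGE (p:Person {name: row.name})"
}

# Usage (hypothetical names):
#   NEO4J_PASSWORD=... load_people_csv my-graph-db-0 files-1/people.csv
```

Because the statement uses MERGE, re-running it against the same file should not create duplicate Person nodes, which fits the repeatable nature of LOAD CSV.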
Depending on your Neo4j configuration, some methods support fetching data to import from a remote location (e.g., using HTTP or fetching from cloud object storage). Therefore, it is not always necessary to place the source data files in the Neo4j import directory.
Configure the import volume mount
The default configuration of the /import volume mount is to share the /data volume mount.
Generally, this is sufficient, and it is unnecessary to explicitly configure an import volume in the Helm deployment’s values.yaml file.
For the full details of configuring volume mounts for a Neo4j Helm deployment, see Volume mounts and persistent volumes.
This example shows how to configure /import to use a dynamically provisioned Persistent Volume of the default StorageClass:
volumes:
  import:
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 100Gi
Copy files to the import volume using kubectl cp
Files can be copied to the import volume using kubectl cp.
This example shows how to copy a local directory my-files/ to /import/files-1 on a Neo4j instance with the release name my-graph-db in the namespace default:
kubectl cp my-files/ default/my-graph-db-0:/import/files-1
# Validate: list the contents of /import/files-1
kubectl exec my-graph-db-0 -- ls /import/files-1
Instead of using kubectl cp, data can also be loaded onto the /import directory by:
using an additional container or initContainer to load data.
using kubectl exec to run commands to load data.
mounting a volume that is already populated with data.
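One way to use kubectl exec for loading data is to stream a local file into the pod over standard input. The following is a minimal sketch, not a procedure from the Neo4j documentation. The function name stream_to_import is hypothetical, and the sketch assumes that sh and cat are available in the Neo4j container.

```shell
# Sketch only: stream one local file into the pod's /import directory.
# Assumes the container image provides sh and cat.
stream_to_import() {
  local_file="$1"            # local path, e.g. ./people.csv
  pod="$2"                   # pod name, e.g. my-graph-db-0
  namespace="${3:-default}"  # namespace, defaults to "default"
  target="/import/$(basename "$local_file")"
  # -i keeps stdin open so the file content reaches cat inside the pod.
  kubectl exec -i --namespace "$namespace" "$pod" -- \
    sh -c "cat > '$target'" < "$local_file"
}

# Usage (hypothetical names):
#   stream_to_import ./people.csv my-graph-db-0 default
```

This handles a single file at a time; for whole directories, kubectl cp is usually the more convenient choice.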
Data must be placed in the volume’s
neo4j-admin database import
The simplest way to run neo4j-admin database import is to use kubectl exec to run it in the Neo4j container.
Running neo4j-admin database import to perform a large import in the same container as the Neo4j process may cause resource contention, and either or both processes may be OOM-killed by the node operating system.
To avoid this, either use a separate container or initContainer, or place the Neo4j Helm deployment in offline maintenance mode to run neo4j-admin database import.
neo4j-admin database import cannot be used to replace an existing database while Neo4j is running.
To replace an existing database, either DROP the database or put the Neo4j Helm deployment into offline maintenance mode before running neo4j-admin database import.
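For a small import where resource contention is not a concern, running the tool through kubectl exec might look like the sketch below. It assumes the Neo4j 5 command form (neo4j-admin database import full with the database name as the last argument). The function name run_import and the file names nodes.csv and relationships.csv are hypothetical.

```shell
# Sketch only: run a full import inside the Neo4j pod.
# Assumptions: Neo4j 5 command syntax, and header-bearing CSV files named
# nodes.csv and relationships.csv already present in the given directory.
run_import() {
  pod="$1"        # pod name, e.g. my-graph-db-0
  database="$2"   # target database; must be previously unused
  dir="$3"        # directory inside the pod, e.g. /import/files-1
  kubectl exec "$pod" -- neo4j-admin database import full \
    --nodes="$dir/nodes.csv" \
    --relationships="$dir/relationships.csv" \
    "$database"
}

# Usage (hypothetical names):
#   run_import my-graph-db-0 neo4j /import/files-1
```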
An alternative approach to importing data into Neo4j is to run a separate Neo4j standalone instance outside Kubernetes, perform the import on that Neo4j instance, and then copy the resulting database into the Kubernetes-based Neo4j instance using the backup and restore or dump and load procedures.