Data import
Infinigraph | Not available on Aura | Introduced in 2025.12
There are several ways to import data into a sharded property database, depending on your use case and the size of your dataset.
- Initial import from delimited files into a sharded property database using neo4j-admin database import full.
- Loading data incrementally into an existing sharded property database using neo4j-admin database import incremental.
- Importing data using LOAD CSV for small to medium-sized datasets.
For creating a sharded property database from an existing database or backup, see Creating a sharded database from a URI (online) and Resharding databases.
Initial import from delimited files (offline)
You can use the neo4j-admin database import full command to import data from delimited files into a sharded property database, just as you would for a standard Neo4j database.
This is particularly useful for large datasets that you want to import in bulk before starting your application or for incremental imports later on.
You can specify the --property-shard-count option to define the number of property shards you want to create.
This will help distribute the data across multiple servers in a Neo4j cluster.
|
If you are creating the property shards on a self-managed server, the server that executes the |
Import using S3 (offline)
The following example shows how to import a set of CSV files, back them up to S3 using the --target-location and --target-format options, and then create a database using those seeds in S3.
- Using the neo4j-admin database import command, import data into the foo-sharded database, creating one graph shard and three property shards. If the neo4j-admin process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped. The --target-location and --target-format options take the outputs of the import, turn them into uncompressed backups, and upload them to a location ready to be seeded from.
    neo4j-admin database import full foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher --target-location=s3://bucket/folder/ --target-format=backup
- Create the database foo-sharded as a sharded property database by seeding it from your backups in the AWS S3 bucket:
    CREATE DATABASE `foo-sharded` DEFAULT LANGUAGE CYPHER 25 PROPERTY SHARDS { COUNT 3 } OPTIONS { seedUri: 's3://bucket/folder/' };
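The example uses the same shard count (3) and the same S3 location in both steps. The following is a minimal dry-run sketch, not an official tool, that derives both steps from one set of variables so the values cannot drift apart. The variable and function names are illustrative assumptions; the script only prints the command and the statement and does not run them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints both steps of the S3 example; nothing is executed.
set -euo pipefail

DB_NAME="foo-sharded"            # database name used in the example
SHARD_COUNT=3                    # single source for --property-shard-count and COUNT
SEED_URI="s3://bucket/folder/"   # placeholder bucket from the example

# Step 1: the offline import that writes backup artifacts to SEED_URI.
import_command() {
  local args=(
    neo4j-admin database import full "$DB_NAME"
    --nodes=nodes.csv --nodes=movies.csv
    --relationships=relationships.csv
    --input-type=csv
    "--property-shard-count=$SHARD_COUNT"
    --schema=schema.cypher
    "--target-location=$SEED_URI"
    --target-format=backup
  )
  # Join the arguments with single spaces.
  echo "${args[*]}"
}

# Step 2: the Cypher statement that seeds the sharded database from SEED_URI.
create_statement() {
  printf "CREATE DATABASE \`%s\` DEFAULT LANGUAGE CYPHER 25 PROPERTY SHARDS { COUNT %d } OPTIONS { seedUri: '%s' };\n" \
    "$DB_NAME" "$SHARD_COUNT" "$SEED_URI"
}

import_command
create_statement
```

Review the printed output before running the first line in a shell and the second in a Cypher client.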
Import using local file system (offline)
You can also import data into a Neo4j cluster that has no access to cloud storage by seeding from the local file system.
- Using the neo4j-admin database import command, import data into the foo-sharded database, creating one graph shard and three property shards. If the neo4j-admin process is running on the same server as a Neo4j DBMS process, the Neo4j DBMS process must be stopped.
    neo4j-admin database import full foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher --target-format=backup
- Using the allowed and denied databases settings, allocate a single shard to each server in the cluster. See Controlling locations with allowed/denied databases.
- Move the produced backups from the local file system of the machine used for the import to the servers hosting each of the shards. Each server should have one backup, and the backups must reside in the same path on each server.
- On each server, update neo4j.conf to include the correct settings for file seeding, as outlined in Create a database from a URI.
- Create the database foo-sharded as a sharded property database by seeding it from the backups in the servers' file systems:
    CREATE DATABASE `foo-sharded` DEFAULT LANGUAGE CYPHER 25 PROPERTY SHARDS { COUNT 3 } OPTIONS { seedUri: 'file:/backups/', seedOptions: 'NO_CHECK' };
In this context, NO_CHECK prevents the seeding process from verifying that all backups are present on all servers.
The cluster automatically distributes the data across its servers. For more information on seed providers, see Create a database from a URI.
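The step of moving one backup to each server, at the same path everywhere, can be planned with a small dry-run script. This is only a sketch under stated assumptions: the host names, the backup file names, and the /backups/ target directory are hypothetical, and which backup belongs on which server depends on how you allocated the shards. The script prints scp commands and does not copy anything.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one scp command per (backup, server) pair.
set -euo pipefail

TARGET_DIR="/backups/"   # hypothetical; must be the same path on every server

# Usage: plan_copies "<space-separated servers>" backup1 backup2 ...
# Pairs the first backup with the first server, the second with the second, and so on.
plan_copies() {
  local -a servers
  read -r -a servers <<< "$1"
  shift
  # Each server should hold exactly one backup, so the counts must match.
  if [ "$#" -ne "${#servers[@]}" ]; then
    echo "expected one backup per server: $# backups, ${#servers[@]} servers" >&2
    return 1
  fi
  local i=0 backup
  for backup in "$@"; do
    echo "scp $backup ${servers[$i]}:$TARGET_DIR"
    i=$((i + 1))
  done
}

# One graph shard plus three property shards gives four backups (file names are hypothetical).
plan_copies "server1 server2 server3 server4" \
  graph.backup property-1.backup property-2.backup property-3.backup
```

Checking the counts up front catches a missing backup before the CREATE DATABASE statement is run, which matters because NO_CHECK skips the verification that all backups are present on all servers.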
Incremental import / offline updates
You can use the neo4j-admin database import incremental command to import data into an existing database.
This is particularly useful for large batches of data that you want to add to an existing sharded property database.
It allows you to do faster updates than is possible transactionally.
- Stop the foo-sharded database if it is running. See Starting and stopping a sharded property database for instructions.
- Run the neo4j-admin database import incremental command, specifying the --property-shard-count option to define the number of property shards you want to create, the --target-location and --target-format options to upload the resulting stores to a location from which the databases can be re-created, and the CSV files that you want to update your existing data with. See Incremental import for more information and instructions.
    neo4j-admin database import incremental foo-sharded --nodes=nodes.csv --nodes=movies.csv --relationships=relationships.csv --input-type=csv --property-shard-count=3 --schema=schema.cypher --target-location=s3://bucket/folder/ --target-format=backup
- Re-create your database using the dbms.recreateDatabase() procedure, or by following step 2 of Splitting an existing database into shards and creating a new database with the resulting stores, the same way you would for a normal offline incremental import.
Importing data using LOAD CSV (online)
You can use LOAD CSV to import data into a sharded property database.
This is especially useful when you want to import small to medium-sized datasets (up to 10 million records) from local and remote files, including cloud URIs.
For more information, see Cypher Manual → LOAD CSV and Getting Started guide → Tutorial: Import CSV data using LOAD CSV.
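To judge whether a dataset falls within the "up to 10 million records" guidance, you can count the data rows in your CSV files before choosing an import method. The sketch below is an illustrative helper, not part of Neo4j. It assumes each file has exactly one header line, ends with a newline, and contains no line breaks inside quoted fields; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count CSV data rows and compare against the LOAD CSV guidance.
set -euo pipefail

LOAD_CSV_LIMIT=10000000   # "up to 10 million records", from the text above

# Print the total number of data rows across the given files,
# assuming one header line per file.
count_records() {
  local total=0 file lines
  for file in "$@"; do
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # Arithmetic expansion strips any padding that wc adds on some systems.
    lines=$(( $(wc -l < "$file") ))
    if [ "$lines" -gt 0 ]; then
      total=$((total + lines - 1))
    fi
  done
  echo "$total"
}

# Print the import method suggested by the size guidance.
suggest_method() {
  local records
  records=$(count_records "$@")
  if [ "$records" -le "$LOAD_CSV_LIMIT" ]; then
    echo "LOAD CSV"
  else
    echo "neo4j-admin database import"
  fi
}

# Run only when file names are passed on the command line.
if [ "$#" -gt 0 ]; then
  suggest_method "$@"
fi
```

For example, saving this as a script and passing nodes.csv movies.csv relationships.csv as arguments prints the suggested method for those three files together. The threshold is a rule of thumb from the text, not a hard limit.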
|
Transactional Cypher statements involving |