Import data

This tutorial will demonstrate how to import data from CSV files using LOAD CSV.

For a full description of LOAD CSV , see Cypher Manual → LOAD CSV.

With LOAD CSV we can conveniently import data into Neo4j and have access to Cypher to perform actions on the data as desired.

In this example, we are given the following CSV files:

  • persons.csv, a list of persons:

    1,Charlie Sheen
    2,Michael Douglas
    3,Martin Sheen
    4,Morgan Freeman
  • movies.csv, a list of movies:

    1,Wall Street,USA,1987
    2,The American President,USA,1995
    3,The Shawshank Redemption,USA,1994
  • roles.csv, a list of which role was played by some of these persons in each movie:

    1,1,Bud Fox
    4,1,Carl Fox
    3,1,Gordon Gekko
    4,2,A.J. MacInerney
    3,2,President Andrew Shepherd
    5,3,Ellis Boyd 'Red' Redding

By inspecting the files we can see that:

  • Each person has a unique id and a name.

  • Each movie has a unique id, a title, a country where it was made, and a year when it was released.

  • Using the roles file we can deduct which person has acted in which movie, and what role(s) they played.

We can come up with the following simple data model:


Before starting our imports, we will prepare our database by creating indexes and constraints. Since we expect the id property on Person and Movie to be unique in each set, we will create a unique constraint. This protects us from invalid data since constraint creation will fail if there are multiple nodes with the same id property.

Creating a unique constraint also implicitly creates a unique index. The id property is a temporary property used to look up the appropriate nodes for a relationship when importing the third file. By indexing the id property, node lookup (e.g. by MATCH) will be much faster.


Additionally, we create an index on the name property on Country nodes to ensure fast lookups:

CREATE INDEX ON :Country(name)
When using MERGE or MATCH with LOAD CSV we need to make sure we have an index or a unique constraint on the property that we are merging on. This will ensure that the query executes in a performant way.

In this example, the CSV files are stored in the default import directory on the database server, and we can access them using a file:/// URL. Other locations are configurable, and additionally, LOAD CSV supports accessing CSV files via HTTPS, HTTP, and FTP. For complete instructions, see Cypher Manual → LOAD CSV.

Using the following Cypher queries, we will create a node for each person, a node for each movie and a relationship between the two with a property denoting the role. We are also keeping track of the country in which each movie was made.

Let’s start with importing the persons.csv file. Here is the Cypher used to do the import:

LOAD CSV WITH HEADERS FROM "file:///persons.csv" AS csvLine
CREATE (p:Person {id: toInteger(, name:})

Now, let’s import the movies. This time, we are also creating a relationship to the country in which the movie was made. We are using MERGE to create nodes that represent countries. Using MERGE avoids creating duplicate country nodes in the case where multiple movies have been made in the same country.

LOAD CSV WITH HEADERS FROM "file:///movies.csv" AS csvLine
MERGE (country:Country {name:})
CREATE (movie:Movie {id: toInteger(, title: csvLine.title, year:toInteger(csvLine.year)})
CREATE (movie)-[:MADE_IN]->(country)

Finally, we will create relationships between the persons and the movies; one actor can participate in many movies, and one movie has many actors in it. Now importing the relationships is a matter of finding the nodes and then creating relationships between them.

LOAD CSV WITH HEADERS FROM "file:///roles.csv" AS csvLine
MATCH (person:Person {id: toInteger(csvLine.personId)}),(movie:Movie {id: toInteger(csvLine.movieId)})
CREATE (person)-[:PLAYED {role: csvLine.role}]->(movie)
For larger data files, it is useful to use the hint USING PERIODIC COMMIT clause of LOAD CSV. This hint tells Neo4j that the query might build up inordinate amounts of transaction state, and so needs to be periodically committed. For more information, see PERIODIC COMMIT query hint.

Finally, since the id property was only necessary to import the relationships, we can drop the constraints and the id property from all movie and person nodes.

WHERE n:Person OR n:Movie

This is the resulting graph: