NAME

neo4j-import — Neo4j Import Tool

SYNOPSIS

neo4j-import [options]

DESCRIPTION

neo4j-import creates a new Neo4j database from data in CSV files. See the chapter "Import Tool" in the Neo4j Manual for details on the CSV file format, which requires a special kind of header.

OPTIONS

--into <store-dir>
Database directory to import into. Must not contain an existing database.
--nodes [:Label1:Label2] "<file1>,<file2>,…"
Node CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.
--relationships [:RELATIONSHIP_TYPE] "<file1>,<file2>,…"
Relationship CSV header and data. Multiple files will be logically seen as one big file from the perspective of the importer. The first line must contain the header. Multiple data sources like these can be specified in one import, where each data source has its own header. Note that file groups must be enclosed in quotation marks.
--delimiter <delimiter-character>
Delimiter character, or TAB, between values in CSV data. The default option is ,.
--array-delimiter <array-delimiter-character>
Delimiter character, or TAB, between array elements within a value in CSV data. The default option is ;.
--quote <quotation-character>
Character to treat as quotation character for values in CSV data. The default option is ". Quotes inside quotes escaped like """Go away"", he said." and "\"Go away\", he said." are supported. If you have set "'" to be used as the quotation character, you could write the previous example like this instead: '"Go away", he said.'
--id-type <id-type>
One of [STRING, INTEGER, ACTUAL]. Specifies how ids in node/relationship input files are treated. STRING: arbitrary strings for identifying nodes. INTEGER: arbitrary integer values for identifying nodes. ACTUAL: (advanced) actual node ids. Default value: STRING
--processors <max processor count>
(advanced) Max number of processors used by the importer. Defaults to the number of available processors reported by the JVM. The importer needs a certain minimum number of threads, so there is no lower bound for this value. For optimal performance this value shouldn't be greater than the number of available processors.
--stacktrace
Enable printing of error stack traces.
--bad <file name>
Relationships that refer to nodes that cannot be found can, instead of making the import fail, be logged to a file specified by this option. Default value: not-imported.bad
--bad-tolerance <max number of bad entries>
Number of bad entries before the import is considered failed. This tolerance threshold is about relationships referring to missing nodes. Format errors in input data are still treated as errors. Default value: 1000
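
The two escaping styles described under --quote can be reproduced when generating input files. The following is a minimal sketch using Python's standard csv module, which is an assumption of this example and not something the import tool requires; the helper name to_csv_line is hypothetical. It writes a value containing quotation marks, first with the default quotation character and then with ' as the quotation character.

```python
import csv
import io


def to_csv_line(values, quotechar='"'):
    """Render one CSV record, doubling any quotation character inside a value."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar=quotechar,
        doublequote=True,       # escape an embedded quote by writing it twice
        lineterminator="\n",
    )
    writer.writerow(values)
    return buf.getvalue().rstrip("\n")


value = '"Go away", he said.'

# Default quotation character: embedded quotes are doubled.
default_line = to_csv_line([value])
# With ' as the quotation character the embedded " needs no escaping.
single_line = to_csv_line([value], quotechar="'")

print(default_line)
print(single_line)
```

A file written with the second form would presumably be imported with --quote set to ', as described above.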

USAGE - WINDOWS

The Neo4jImport.bat script is used in the same way.

EXAMPLES

Below is a basic example, where we import movies, actors and roles from three files.

movies.csv 

movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

actors.csv 

personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
carrieanne,"Carrie-Anne Moss",Actor

roles.csv 

:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTS_IN
keanu,"Neo",tt0234215,ACTS_IN
keanu,"Neo",tt0242653,ACTS_IN
laurence,"Morpheus",tt0133093,ACTS_IN
laurence,"Morpheus",tt0234215,ACTS_IN
laurence,"Morpheus",tt0242653,ACTS_IN
carrieanne,"Trinity",tt0133093,ACTS_IN
carrieanne,"Trinity",tt0234215,ACTS_IN
carrieanne,"Trinity",tt0242653,ACTS_IN
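
Relationships that refer to nodes which cannot be found are what the --bad and --bad-tolerance options deal with. As a rough illustration, the sketch below checks a relationship file against the node ids before an import is attempted. It is not part of neo4j-import: the function names are hypothetical, the data is a shortened copy of the files above, the row for "hugo" is invented to trigger a mismatch, and it assumes all node ids share one id space with --id-type STRING.

```python
import csv
import io

# Shortened copies of the sample files above.
MOVIES = '''movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
'''
ACTORS = '''personId:ID,name,:LABEL
keanu,"Keanu Reeves",Actor
laurence,"Laurence Fishburne",Actor
'''
# The last row is hypothetical: "hugo" is not defined in ACTORS.
ROLES = ''':START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTS_IN
laurence,"Morpheus",tt0133093,ACTS_IN
hugo,"Agent Smith",tt0133093,ACTS_IN
'''


def node_ids(csv_text):
    """Collect the values of the column whose header ends in :ID."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    id_col = next(i for i, name in enumerate(header) if name.endswith(":ID"))
    return {row[id_col] for row in reader if row}


def dangling_relationships(csv_text, known_ids):
    """Return relationship rows whose start or end id is not a known node."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    start, end = header.index(":START_ID"), header.index(":END_ID")
    return [
        row for row in reader
        if row and (row[start] not in known_ids or row[end] not in known_ids)
    ]


known = node_ids(MOVIES) | node_ids(ACTORS)
dangling = dangling_relationships(ROLES, known)
for row in dangling:
    print(",".join(row))
```

Rows reported this way correspond to the kind of entries that would count against the --bad-tolerance threshold.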

The command will look like this:

neo4j-import --into path_to_target_directory --nodes movies.csv --nodes actors.csv --relationships roles.csv
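
When the import is driven from a script, the same command can be assembled as an argument list. The following is a sketch only: build_import_command is a hypothetical helper, it uses just the options documented above, and it prints the command instead of running it.

```python
import shlex


def build_import_command(into, node_files, relationship_files, bad_tolerance=None):
    """Assemble a neo4j-import argument list from the documented options."""
    args = ["neo4j-import", "--into", into]
    for path in node_files:
        args += ["--nodes", path]           # one --nodes per data source
    for path in relationship_files:
        args += ["--relationships", path]   # one --relationships per data source
    if bad_tolerance is not None:
        args += ["--bad-tolerance", str(bad_tolerance)]
    return args


command = build_import_command(
    "path_to_target_directory",
    ["movies.csv", "actors.csv"],
    ["roles.csv"],
)
print(shlex.join(command))
```

Assuming neo4j-import is on the PATH, the list could be passed to subprocess.run(command, check=True) to start the import.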

See the Neo4j Manual for further examples.