Inserting data into Neo4j with Neo4j-Shell and Cypher



Alireza Rezaei Mahdiraji
I am Alireza Rezaei Mahdiraji, a PhD student. My field of research is database systems.
I am experimenting with several databases to support large-scale scientific and simulation data. Some of the datasets have an inherent graph structure, which makes graph databases a good choice for modeling and querying such data.
I picked Neo4j for my modeling tasks because it is an important open source graph database that has drawn a lot of attention from the database research community and from companies.

Running a Cypher CREATE command for a large graph in the Neo4j Shell fails with the following error: "argument list too long". So how do we execute such a large CREATE statement?

The solution is to use the Neo4j Shell transaction facility: break the original CREATE command down into several smaller CREATE commands, wrap each one in its own transaction, and write the result to a file. An excerpt of the output looks as follows:

begin
CREATE
(m{n:'m', d:'3'}),
(f0{n:'f0', d:'2'}),
m-[:so]->f0
    …;
commit
exit
begin
CREATE
(v20825{n:'v20825', d:'0'}),
(e102800{n:'e102800', d:'1'}),
e102800-[:so]->v20624,
    …;
commit
exit
begin
CREATE
(e198203{n:'e198203', d:'1'}),
e198203-[:so]->v40000,
e198203-[:so]->v39800,
f39600-[:so]->e198203
    …;
commit
exit

For a file of 64 MB, I commit after every 500 node/relationship commands, and it works just fine. When I tried 1000, I got the same error as above.
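The chunking step can be sketched in a few lines of Python. This is only an illustration, not the script I actually used: the function names `chunked` and `to_shell_script` are made up for this example, and the input is assumed to be a list of CREATE elements (node or relationship patterns) already held in memory. The sketch emits only begin, CREATE and commit for each chunk; it leaves out the exit lines shown in the excerpt, and it does not try to keep a relationship in the same chunk as the nodes it refers to.

```python
from typing import Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_shell_script(elements: List[str], chunk_size: int = 500) -> str:
    """Wrap every chunk of CREATE elements in its own begin/commit block."""
    lines: List[str] = []
    for chunk in chunked(elements, chunk_size):
        lines.append("begin")
        lines.append("CREATE")
        # Elements are comma-separated; the statement ends with a semicolon.
        lines.append(",\n".join(chunk) + ";")
        lines.append("commit")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Three elements taken from the excerpt above, written as one chunk.
    sample = ["(m{n:'m', d:'3'})", "(f0{n:'f0', d:'2'})", "m-[:so]->f0"]
    print(to_shell_script(sample, chunk_size=500), end="")
```

The chunk size is a parameter because the right value seems to depend on the data: 500 worked for my file, 1000 did not.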

After creating the file of CREATE commands as in the example above, stop the Neo4j server and import the file using the following command:

rm -rf data/graph.db && cat path/to/create_statment.cql | ./path/to/neo4j-shell -path data/graph.db  

In the command above, create_statment.cql is the name of the file with the CREATE commands, and graph.db is the folder in which Neo4j stores the database files (usually located at /path/to/neo4j-community-XX/data/). The first part, rm -rf data/graph.db, just removes the existing graph database files.
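If you prefer to drive the import from a script, the same two steps (delete the old store, then feed the file to neo4j-shell on standard input) might look like the sketch below. The helper names `build_import_command` and `run_import` are hypothetical; the only neo4j-shell option used is the -path option from the command above, and all paths are placeholders you have to fill in yourself.

```python
import shutil
import subprocess
from typing import List


def build_import_command(shell_path: str, db_path: str) -> List[str]:
    """Return the neo4j-shell invocation used above, as an argument list."""
    return [shell_path, "-path", db_path]


def run_import(cql_file: str, shell_path: str, db_path: str) -> None:
    """Remove the existing store, then pipe the CQL file into neo4j-shell."""
    # Equivalent of `rm -rf data/graph.db`: this deletes the database folder.
    shutil.rmtree(db_path, ignore_errors=True)
    # Equivalent of `cat file | neo4j-shell -path ...`: the file becomes stdin.
    with open(cql_file, "rb") as script:
        subprocess.run(build_import_command(shell_path, db_path),
                       stdin=script, check=True)
```

As with the shell command, the Neo4j server should be stopped before calling `run_import`.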

Now start the Neo4j server and query Neo4j to check that the data is there:

START r=node(*) RETURN count(r);
START r=rel(*) RETURN count(r);


To sum up: if you are running long Cypher CREATE commands, be sure to break the statements down into smaller chunks, each wrapped in a transaction, so that they can be executed from neo4j-shell.

/Alireza  


