Importing data to Neo4j the spreadsheet way in Neo4j 2.0!

Hi all graphistas out there, and happy new year! I hope you had an excellent start; let’s keep this year rocking with a spirit of graph-love! In March last year, our Rik Van Bruggen wrote a lovely blog post on how to import data to Neo4j using spreadsheets. It is simple and easy to understand, but it only covers Neo4j version 1.9.3. Now it’s a new year, and in December we launched a shiny new version of Neo4j, the 2.0.0 release! Baadadadaam! So I thought I had better provide an update to his blog post, in the spirit of his work. (Thank you, Rik!)
You can still use the Neo4j CSV batch-importer (now for 2.0.0) from Michael Hunger, or look at other Data Import Options.
If you simply want to use Cypher, Rik’s way is much easier. That’s why I have updated Rik’s old Cypher statements in a new spreadsheet that shows how to import to Neo4j 2.0.0.

How does it work?

  Open the spreadsheet.  
The sheet is composed of two parts:
  • columns A, B and C: these contain the data for the Nodes of our graph, using a custom “id”, a “name”, and a “gender” as properties.
  • columns F, G and H: these contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes by their ids in column A.
And then comes the secret sauce: how to create Cypher statements from this node and relationship information.
For this we use very simple statements that leverage the columns mentioned above, the Cypher syntax and string concatenation. Look at the columns D and I:


We just use this formula to create the Cypher statement:
="MERGE (meetup:Event {id:'"&A3&"', name:'"&B3&"'})"
(Instead of CREATE we use MERGE, which is a new feature in 2.0.0: it creates the node if it does not exist yet, and otherwise it does not create a new node. You can read more about it here in the Neo4j Manual.)

Output for row 3:

MERGE (meetup:Event {id:'153602002', name:'Meetup Malmö'})
The next row changes things: since we know that all the people listed attend our meetup, we can create that relationship right away. So we combine the creation of the “Person” node with connecting it to the meetup node we just created.
="MERGE (_"&A4&":Person {id:'"&A4&"', name:'"&B4&"', gender:'"&C4&"'})-[:ATTENDS]->(meetup)"

Output for row 4:

MERGE (_2:Person {id:'2', name:'Donald Duck', gender:'man'})-[:ATTENDS]->(meetup)
As you can see, it takes the id, name and gender properties from columns A, B and C, and puts these into a “MERGE” Cypher statement.
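If you would rather script this step than copy formulas down, the same string concatenation can be sketched in Python. This is only a minimal sketch and not part of the spreadsheet: the function names are made up for illustration, and the quote escaping is an addition that the sheet formulas do not perform.

```python
def cypher_string(value):
    """Render a value as a single-quoted Cypher string literal."""
    # Escaping backslashes and quotes is an addition; the sheet formulas skip it.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def event_statement(event_id, name):
    """Mirror the row 3 formula: MERGE the meetup node."""
    return "MERGE (meetup:Event {{id:{0}, name:{1}}})".format(
        cypher_string(event_id), cypher_string(name))


def person_statement(person_id, name, gender):
    """Mirror the row 4 formula: MERGE a person and link it to the meetup."""
    return (
        "MERGE (_{0}:Person {{id:{1}, name:{2}, gender:{3}}})"
        "-[:ATTENDS]->(meetup)"
    ).format(person_id, cypher_string(person_id),
             cypher_string(name), cypher_string(gender))


print(person_statement("2", "Donald Duck", "man"))
# MERGE (_2:Person {id:'2', name:'Donald Duck', gender:'man'})-[:ATTENDS]->(meetup)
```

The person statement only works when it runs in the same query as the event statement, because it relies on the meetup identifier bound there.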


Originally Rik used the (now legacy) Neo4j-AutoIndex to look up nodes to connect. We can use a schema index and do the same with MATCH.
In this particular dataset we don’t have to create an index on our labels and the nodes’ properties, but since I can do it I will show you.
create index on :Person(id)
When you create an index you give the label and the property of the node that you want to index.
Time to create some more relationships, let’s look at the Cypher statements to create them.
="WITH 1 as dummy MATCH (p1:Person {id:'"&E4&"'}), (p2:Person {id:'"&F4&"'})
MERGE (p1)-[:"&G4&"]->(p2);"
We prepend WITH 1 as dummy for the single statement used in the neo4j-browser, where all the MATCH and MERGE clauses follow each other with no separation in one big query. The WITH lets each new MATCH follow the MERGE before it.

Output for row 2:

MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);
This one is a little bit more complicated, as it uses Neo4j’s MATCH statement in order to create the relationship. We first have to look up the start node and the end node using the “id” property. Then the MERGE statement creates the relationship based on the relationship type in column G.
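The relationship formula can be sketched in Python in the same way. This is a hedged illustration, not the sheet itself: the function name and the `browser` flag are hypothetical, and it assumes the browser variant carries the WITH 1 as dummy prefix without a semicolon while the shell variant ends with a semicolon. The check on the relationship type is also an addition, since a type is written into the statement unquoted.

```python
import re


def cypher_string(value):
    """Render a value as a single-quoted Cypher string literal."""
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def relationship_statement(from_id, to_id, rel_type, browser=False):
    """Look up two Person nodes by id and MERGE a relationship between them."""
    # The type is inserted unquoted, so only allow plain identifiers (assumption).
    if not re.match(r"^[A-Za-z_][A-Za-z0-9_]*$", rel_type):
        raise ValueError("unexpected relationship type: %r" % (rel_type,))
    body = "MATCH (p1:Person {{id:{0}}}), (p2:Person {{id:{1}}}) MERGE (p1)-[:{2}]->(p2)".format(
        cypher_string(from_id), cypher_string(to_id), rel_type)
    if browser:
        # Part of one big query: WITH separates this MATCH from the previous MERGE.
        return "WITH 1 as dummy " + body
    # One statement per line for the shell, terminated with a semicolon.
    return body + ";"


print(relationship_statement("2", "5", "WORKS_WITH"))
# MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);
```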
Then we copy each of the formulas down across all the rows we want to cover.
Having done this, we end up with two columns, each containing a number of Cypher statements. So then what?

The Instructions Sheet

In the first sheet of the spreadsheet, you will find a bunch of instructions. Basically, you need to go through the following steps:
  • Download and unzip Neo4j server.
  • Copy/paste the Cypher statements from the top part of the Import Sheet into a text file or into the browser window directly. All these statements form a single large Cypher statement, as the browser can currently only execute single Cypher statements.
  • Drag the file into the browser input area and then execute it.
  • If you want to use the Neo4j-Shell for importing larger amounts of data, use the approach shown in the second tab titled “For the shell”. It uses one Cypher statement (terminated with a semicolon) per line and a begin / commit block around the statements to speed up the import with a single transaction.
  • Paste all statements into a file and use bin/neo4j-shell -file import.txt, or copy and paste them directly into the browser.
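The shell variant of these steps can be sketched as a small Python helper that builds the import file. This is a minimal sketch under assumptions: the function names are hypothetical, and it assumes the shell accepts begin and commit on lines of their own, as the “For the shell” tab describes.

```python
def shell_script(statements):
    """Wrap Cypher statements in a begin / commit block, one statement per line."""
    lines = ["begin"]
    for statement in statements:
        # Collapse line breaks so each statement sits on a single line.
        one_line = " ".join(statement.split())
        if not one_line:
            continue  # skip empty cells copied from the sheet
        if not one_line.endswith(";"):
            one_line += ";"
        lines.append(one_line)
    lines.append("commit")
    return "\n".join(lines) + "\n"


def write_import_file(path, statements):
    """Write the wrapped statements to a file such as import.txt."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(shell_script(statements))
```

Calling write_import_file("import.txt", statements) would produce the file that is then passed to bin/neo4j-shell -file import.txt.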
And there we go: the dataset gets created, and Neo4j is ready for use. I hope this little overview was useful for you – it sure was useful for me when getting my hands dirty for the first time 🙂 …
Note: Make sure you have the newest Java running on your device. You can download it here.
(I made that mistake.)
Time to DIY! Good luck!



Madhu says:

Hi, the part about executing the text file is not clear to me. Suppose I have a text file containing Cypher queries, how can I execute them? (I certainly don’t want to copy/paste the file to the web server.) Please answer for both Windows and Unix.

Julian Simpson says:

Madhu, something like neo4j-shell < foo.cypher or type foo.cyp | Neo4jShell will help.

Sharon says:

This method does not work for Windows. When you say “Copy/paste the Cypher statements from the top part the Import Sheet into a text file or the browser window directly. All these statements form a single large Cypher statement as the browser can currently only execute single cypher statements. ” it produces errors. The reality is that this tool is quite useless for Windows. If someone could actually create instructions specific to Windows, I might be inclined to give it a second chance.

Kenny Bastani says:

Sharon, thanks for your comment. The best way to import data on Windows is using Cypher’s LOAD CSV command. Please take a look at the reference documentation for Neo4j that outlines the steps for using LOAD CSV to populate your database from a CSV export of an Excel spreadsheet.

If you have any issues or need further support please submit a question on with the tag Neo4j or reply here and I will do my best to assist.

Thomas Johnson says:

I know you think using cypher text is da bomb, but wouldn’t it make Neo4j more marketable (and reach a bigger audience) if you created an excel macro that takes basic cell information from appropriately titled columns and does all the cypher construction behind the scenes? You would have an easy to fill table that any non-cypher speaking lunkhead (like me) could readily use to translate basic info into the Neo4j graphical representation. They could then use easier cypher queries to analyze the data.

Kenny Bastani says:

Hey Thomas,

That’s an interesting thought. We’re always looking at ways to expand the expressivity of the Cypher query language, but we also look for community contributions that provide new means to query data from Neo4j in ways that suit all kinds of use cases.



Jack Chang says:

well the previous one did not show up. I type it again.
I agree with Thomas Johnson about a simple tool (Excel–>Graph).
Node-link model is more useful than most think.
The universality…did you see?

JBG says:


Thanks a lot for this. I’m new to Cypher and have a few questions:

-What’s the point of “(_2:Person”? Why the underscore? What’s the number for? (You’re already giving ids in the statement)
-When you run this: “-[:ATTENDS]->(meetup)”” doesn’t it create a link between the node and *all* other nodes labelled with ‘meetup’?

In other words, I’m a bit confused about the purpose of using specific strings before the semi-colon, and unfortunately the Neo4j documentation doesn’t help in that regard. It uses “n” throughout before colons, so I always assume that’s a compulsory part of Cypher statements…

