Importing Data into Neo4j – the Spreadsheet Way

I’m sure that many of you are very technical people – very knowledgeable about all things Java, Dr. Who and the like – but if you’ve ever met me, you have probably noticed that I am not technical. And I don’t want to be.

I love technology, but I have never had the talent, inclination or education to program – so I don’t. But I still want to get data into Neo4j – so how do I do that?

There are many technical tools out there (definitely look here, here and here), but I needed something simpler.

So my friend and colleague Michael Hunger came to the rescue and offered me some help to create a spreadsheet to import data into Neo4j.

You will find the spreadsheet here. It has two components:
    1. An instruction sheet. I will get to that later.
    2. A data import sheet. Let’s look at that first.

The Data Import Sheet

This sheet is composed of two parts:
    • Columns A, B and C: These contain the data for the Nodes of our graph, using an “id”, a “name”, and a “type”.
    • Columns F, G and H: These contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes by their ids in column A.

And then comes the secret sauce: How to create Cypher statements from these nodes and relationships.

For this we use very simple statements that leverage the columns mentioned above, the Cypher syntax and string concatenation. Look at the columns D and I:

    • Cypher statements to create the nodes:
create n={id:'"&A2&"', name:'"&B2&"', type:'"&C2&"'};

output for row 2:

create n={id:'1', name:'Amada Emory', type:'Female'};

As you can see, it takes the id, name and type properties from columns A, B and C, and puts these into a “create” Cypher statement.
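For readers who do like to script, the same concatenation can be sketched outside the spreadsheet. This is only a minimal illustration, not part of the spreadsheet itself: the function name `node_statement` and the `node_rows` list are hypothetical, and the single sample row is the one shown above for row 2.

```python
# Hypothetical stand-in for columns A, B and C (id, name, type) of the import sheet.
node_rows = [
    ("1", "Amada Emory", "Female"),
]


def node_statement(node_id, name, node_type):
    """Mirror the column D string concatenation for one spreadsheet row."""
    return (
        "create n={id:'" + node_id
        + "', name:'" + name
        + "', type:'" + node_type + "'};"
    )


# One Cypher statement per row, just like dragging the formula down column D.
node_statements = [node_statement(*row) for row in node_rows]
print(node_statements[0])
# prints: create n={id:'1', name:'Amada Emory', type:'Female'};
```

Like the spreadsheet formula, this sketch does no escaping, so a value that itself contains a single quote would produce a broken statement.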

    • Cypher statements to create the relationships:
start  n1=node:node_auto_index(id='"&F2&"'),
         n2=node:node_auto_index(id='"&G2&"')  create n1-[:"&H2&"]->n2;

output for row 2:

start  n1=node:node_auto_index(id='1'),
       n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;

This one is a little bit more complicated, as it uses Neo4j’s auto-index: in order to create the relationship, we first have to look up a start node and an end node in the auto-index using the id property. The create statement then creates the relationship, using the relationship type in column H.
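The relationship formula can be sketched the same way. Again, this is an illustration under assumptions: `relationship_statement` is a hypothetical name, and the statement is emitted on a single line rather than the two lines shown above.

```python
def relationship_statement(from_id, to_id, rel_type):
    """Mirror the column I concatenation: look up both nodes, then create the relationship."""
    return (
        f"start n1=node:node_auto_index(id='{from_id}'), "
        f"n2=node:node_auto_index(id='{to_id}') "
        f"create n1-[:{rel_type}]->n2;"
    )


# Values taken from row 2 of columns F, G and H as shown above.
statement = relationship_statement("1", "11", "MOTHER_OF")
print(statement)
# prints: start n1=node:node_auto_index(id='1'), n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;
```

Both lookups depend on the nodes already being in the auto-index, which is why the node statements have to run before the relationship statements.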

So with this, we end up with two columns containing a bunch of Cypher statements. So then what?

The Instructions Sheet

In the first sheet of the spreadsheet, you will find a bunch of instructions. Basically, you need to go through the following steps:
    • Download Neo4j.
    • Copy/paste the Cypher statements from the Import Sheet into a text file.
    • Wrap these with a Neo4j transaction (begin, commit) – so that all of the statements get persisted to disk in the same transaction (or not in case of an error). (This is not important for smaller datasets, but is much more important for larger datasets.)
    • Enable auto-indexing on Neo4j (the sheet has instructions for this). This is important because, as you insert data into the database, it needs to be indexed so that the relationships can be set up properly (see above), and for future use.
    • Pipe the text file into the Neo4j shell, if necessary (the sheet has instructions for this as well). For small datasets (and therefore a limited number of Cypher statements) you can get by with copy/pasting the text file into the Web-UI console, but that might not always work.
    • Start the server and browse the Web-UI.
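The copy/paste and wrapping steps above can be sketched as follows. This is a rough illustration only: `build_import_script` is a hypothetical helper, and the bare `begin` and `commit` lines are an assumption based on the "(begin, commit)" hint above, so check the exact form against the instruction sheet.

```python
def build_import_script(node_statements, relationship_statements):
    """Wrap all statements in one transaction, nodes first so the lookups can find them."""
    lines = ["begin", *node_statements, *relationship_statements, "commit"]
    return "\n".join(lines) + "\n"


# The two row-2 statements shown earlier in this post.
script = build_import_script(
    ["create n={id:'1', name:'Amada Emory', type:'Female'};"],
    [
        "start n1=node:node_auto_index(id='1'), "
        "n2=node:node_auto_index(id='11') create n1-[:MOTHER_OF]->n2;"
    ],
)
print(script)
```

The resulting text is what you would save into the text file before piping it into the Neo4j shell or pasting it into the Web-UI console.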


And there we go: The dataset gets created, and Neo4j is ready for use.

I hope this little overview was useful for you – it sure was useful for me when getting my hands dirty for the first time 🙂 …
