Importing data to Neo4j the spreadsheet way in Neo4j 2.0!



Hi all graphistas out there, and happy new year! I hope you had an excellent start; let’s keep this year rocking with a spirit of graph-love! Our Rik Van Bruggen did a lovely blog post on how to import data to Neo4j using spreadsheets in March last year. It is simple and easy to understand, but it only covers Neo4j version 1.9.3. Now it’s a new year, and in December we launched a shiny new version of Neo4j, the 2.0.0 release! Baadadadaam! So I thought I had better provide an update to his blog post, in the spirit of his work. (Thank you, Rik!)
 
You can still use the Neo4j CSV batch-importer (now for 2.0.0) from Michael Hunger, or look at other Data Import Options.
   
If you simply want to use Cypher, Rik’s way is much easier. That’s why I have updated Rik’s old Cypher statements in a new spreadsheet that shows how to import to Neo4j 2.0.0.
 

How does it work?

  Open the spreadsheet.  
 
The sheet is composed of two parts:
 
  • columns A, B and C: these contain the data for the Nodes of our graph, using a custom “id”, a “name”, and a “gender” as properties.
 
  • columns F, G and H: these contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes by their ids in column A.
 
And then comes the secret sauce: how to create Cypher statements from this node and relationship information.
 
For this we use very simple statements that leverage the columns mentioned above, the Cypher syntax and string concatenation. Look at columns D and I:
   

Nodes

We just use this formula to create the Cypher statement:
="MERGE (meetup:Event {id:'"&A3&"', name:'"&B3&"'})"
(Instead of CREATE we use MERGE, which is a new feature in 2.0.0: it creates the node if it does not exist yet, and otherwise does not create a new one. You can read more about it here in the Neo4j Manual.)

Output for row 3:

MERGE (meetup:Event {id:'153602002', name:'Meetup Malmö'})
The next row looks different. Since we know that all attendees will attend our meetup, we can create that relationship right away. So we combine the creation of the “Person” node with connecting it to the meetup node we just created.
 
="MERGE (_"&A4&":Person {id:'"&A4&"', name:'"&B4&"', gender:'"&C4&"'})-[:ATTENDS]->(meetup)"
 

Output for row 4:

 
MERGE (_2:Person {id:'2', name:'Donald Duck', gender:'man'})-[:ATTENDS]->(meetup)
As you can see, it takes the id, name and gender properties from columns A, B and C, and puts them into a “MERGE” Cypher statement.
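
If you would rather script the concatenation than copy a formula down a column, the same idea can be sketched in a few lines of Python. This is only an illustration: the function names are my own invention and are not part of the spreadsheet. Only the shape of the generated statements is taken from the formulas above, written with plain straight quotes, which is what Cypher expects.

```python
def event_statement(event_id, name):
    """Mirror the row-3 formula: one MERGE for the meetup node."""
    return "MERGE (meetup:Event {{id:'{0}', name:'{1}'}})".format(event_id, name)


def person_statement(person_id, name, gender):
    """Mirror the row-4 formula: MERGE a Person and attach it to the meetup."""
    # Note: values that contain a single quote would break the statement;
    # escaping is deliberately not handled in this sketch.
    return (
        "MERGE (_{0}:Person {{id:'{0}', name:'{1}', gender:'{2}'}})"
        "-[:ATTENDS]->(meetup)"
    ).format(person_id, name, gender)


if __name__ == "__main__":
    # Sample values taken from the output lines shown in this post.
    print(event_statement("153602002", "Meetup Malmö"))
    print(person_statement("2", "Donald Duck", "man"))
```

The doubled braces are just how `str.format` writes a literal `{` or `}`; the spreadsheet formula does the same job with `&`.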

Relationships

Originally Rik used the (now legacy) Neo4j-AutoIndex to look up the nodes to connect. We can use a schema index and do the same with MATCH.
In this particular dataset we don’t have to create an index on our label and node property, but since I can, I will show you.
 
create index on :Person(id)
 
When you create an index, you specify the label and the node property that you want to index.
 
Time to create some more relationships, let’s look at the Cypher statements to create them.
 
 
="WITH 1 as dummy MATCH (p1:Person {id:'"&E4&"'}), (p2:Person {id:'"&F4&"'}) MERGE (p1)-[:"&G4&"]->(p2);"
We use WITH 1 as dummy because of the single statement for the neo4j-browser, where all the MATCH and MERGE clauses follow each other with no separation in one big query.
 

Output for row 2:

MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);
 
This one is a little bit more complicated, as it uses Neo4j’s MATCH statement in order to create the relationship. We first have to look up the start node and the end node using the “id” property. Then the MERGE statement creates the relationship based on the relationship type in column G.
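
For comparison, here is a small Python sketch of the same lookup-then-merge pattern. Again, the function name is hypothetical, and I have left out the WITH 1 as dummy prefix so that the result matches the output line shown above. The first sample row comes from that output; the second one is made up purely for illustration.

```python
def relationship_statement(from_id, to_id, rel_type):
    """MATCH both Person nodes by id, then MERGE the relationship between them."""
    return (
        "MATCH (p1:Person {{id:'{0}'}}), (p2:Person {{id:'{1}'}}) "
        "MERGE (p1)-[:{2}]->(p2);"
    ).format(from_id, to_id, rel_type)


if __name__ == "__main__":
    rows = [
        ("2", "5", "WORKS_WITH"),  # from the output line in this post
        ("5", "2", "KNOWS"),       # hypothetical row, for illustration only
    ]
    for from_id, to_id, rel_type in rows:
        print(relationship_statement(from_id, to_id, rel_type))
```

Note that the relationship type is pasted into the statement as-is, so it has to be a valid Cypher identifier such as WORKS_WITH.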
 
Then we copy each of the formulas down across all the rows we want to cover.
Having done this, we end up with two columns, each containing a number of Cypher statements. So then what?
   

The Instructions Sheet

In the first sheet of the spreadsheet, you will find a bunch of instructions. Basically, you need to go through the following steps:
  • Download and unzip Neo4j server.
  • Copy/paste the Cypher statements from the top part of the Import Sheet into a text file or directly into the browser window. All these statements form a single large Cypher statement, as the browser can currently only execute single Cypher statements.
  • Drag the file into the browser input area and then execute it.
  • If you want to use the Neo4j-Shell for importing larger amounts of data, use the approach shown in the second tab, titled “For the shell”. It uses one Cypher statement (terminated with a semicolon) per line and a begin / commit block around the statements to speed up the import with a single transaction.
  • Paste all statements into a file and use bin/neo4j-shell -file import.txt, or copy and paste them directly into the browser.
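
The shell variant from the steps above can also be sketched in Python. This is a rough sketch under a couple of assumptions: the helper names are mine, and I am assuming that the begin / commit block is written as bare begin and commit lines around the statements. The file name import.txt is the one used in the step above.

```python
def shell_script(statements):
    """Wrap Cypher statements in a begin / commit block, one statement per line."""
    lines = ["begin"]
    for statement in statements:
        statement = statement.strip()
        # Each statement must be terminated with a semicolon for the shell.
        if not statement.endswith(";"):
            statement += ";"
        lines.append(statement)
    lines.append("commit")
    return "\n".join(lines) + "\n"


def write_import_file(path, statements):
    """Write the wrapped statements to a file for bin/neo4j-shell -file <path>."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(shell_script(statements))


if __name__ == "__main__":
    statements = [
        "MERGE (meetup:Event {id:'153602002', name:'Meetup Malmö'})",
        "MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);",
    ]
    write_import_file("import.txt", statements)
```

Keeping everything in one begin / commit block follows the point made above: a single transaction around the statements speeds up the import.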
   
And there we go: the dataset gets created, and Neo4j is ready for use. I hope this little overview was useful for you – it sure was useful for me when getting my hands dirty for the first time 🙂 …
 
Note: Make sure you have the latest Java running on your machine. You can download it here.
 
(I made that mistake.)
 
Time to DIY! Good luck!
 
Cheers,
Pernilla