Importing data to Neo4j the spreadsheet way in Neo4j 2.0!
Hi all graphistas out there, and happy new year! I hope you had an excellent start; let’s keep this year rocking with a spirit of graph-love! Our Rik Van Bruggen did a lovely blog post on how to import data to Neo4j using spreadsheets in March last year. It is simple and easy to understand, but it was written for Neo4j version 1.9.3 only. Now it’s a new year, and in December we launched a shiny new version of Neo4j, the 2.0.0 release! Baadadadaam! So I thought I had better provide an update to his blog post, in the spirit of his work. (Thank you, Rik!)
You can still use the Neo4j CSV batch-importer (now for 2.0.0) from Michael Hunger, or look at other Data Import Options.
If you simply want to use Cypher, Rik’s way is much easier. That’s why I have updated Rik’s old Cypher statements in a new spreadsheet that shows how to import to Neo4j 2.0.0.
How does it work?
Open the spreadsheet.
The sheet is composed of two parts:
- columns A, B and C: these contain the data for the Nodes of our graph, using a custom “id”, a “name”, and a “gender” as properties.
- columns F, G and H: these contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes and their ids in column A.
And then comes the secret sauce: how to create Cypher statements from this node and relationship information. For this we use very simple statements that leverage the columns mentioned above, the Cypher syntax and string concatenation. Look at the columns D and I:
Nodes
We just use this formula to create the cypher statement.
="MERGE (meetup:Event {id:'"&A3&"', name:'"&B3&"'})"
(Instead of CREATE we use MERGE, which is a new feature in 2.0.0: it creates the node if it does not exist yet, and otherwise matches the existing node instead of creating a duplicate. You can read more about it here in the Neo4j Manual.)
Output for row 3:
MERGE (meetup:Event {id:'153602002', name:'Meetup Malmö'})
If we check the next row, we see a change. Since we know that every person listed in the sheet attends our meetup, we can create that relationship right away. So we combine the creation of the “Person” node with connecting it to the meetup node we just created.
="MERGE (_"&A4&":Person {id:'"&A4&"', name:'"&B4&"', gender:'"&C4&"'})-[:ATTENDS]->(meetup)"
Output for row 4:
MERGE (_2:Person {id:'2', name:'Donald Duck', gender:'man'})-[:ATTENDS]->(meetup)
As you can see, it takes the id, name and gender properties from columns A, B and C, and puts these into a “MERGE” Cypher statement.
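For readers who would rather script this than use a spreadsheet, the same string concatenation can be sketched in Python. This is only an illustration: the function names `event_statement` and `person_statement` are made up for this example, and just like the spreadsheet formulas, the sketch assumes the values contain no single quotes.

```python
def event_statement(event_id, name):
    """Mirror the row 3 formula: one MERGE for the meetup Event node."""
    return "MERGE (meetup:Event {id:'%s', name:'%s'})" % (event_id, name)


def person_statement(person_id, name, gender):
    """Mirror the row 4 formula: MERGE a Person and attach it to the meetup."""
    return (
        "MERGE (_%s:Person {id:'%s', name:'%s', gender:'%s'})"
        "-[:ATTENDS]->(meetup)" % (person_id, person_id, name, gender)
    )


# Values taken from the two example rows shown above.
print(event_statement("153602002", "Meetup Malmö"))
print(person_statement("2", "Donald Duck", "man"))
```

The two printed lines should match the outputs shown for row 3 and row 4.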
Relationships
Originally Rik used the (now legacy) Neo4j-AutoIndex to look up nodes to connect. We can use a schema index and do the same with MATCH.
In this particular dataset we don’t have to create an index on our label and node property, but since we can, I will show you how.
create index on :Person(id)
When you create an index, you specify the label and the node property that you want to index.
Time to create some more relationships. Let’s look at the Cypher statements that create them.
="WITH 1 as dummy MATCH (p1:Person {id:'"&E4&"'}), (p2:Person {id:'"&F4&"'}) MERGE (p1)-[:"&G4&"]->(p2);"
The reason we use WITH 1 as dummy is the single-statement variant for the Neo4j browser, where all the MATCH and MERGE clauses follow each other in one big query. Cypher does not allow a MATCH directly after a MERGE, so a WITH is needed in between to separate them.
Output for row 2:
MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);
This one is a little bit more complicated, as it uses Cypher’s MATCH clause in order to create the relationship. We first have to look up the start node and the end node using the “id” property. Then the MERGE clause creates the relationship based on the relationship type in column G.
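The relationship formula can be sketched in Python the same way. Again this is only an illustration under assumptions: `relationship_statement` and its `prefix` parameter are invented for this example. The relationship type has to be spliced into the text because Cypher does not accept it as a parameter, so the sketch adds a small check that it looks like a plain identifier.

```python
def relationship_statement(from_id, to_id, rel_type, prefix=""):
    """Build one MATCH ... MERGE statement for a relationship row.

    prefix is optional text such as "WITH 1 as dummy", placed in front of
    the MATCH when the statements are chained into one big query.
    """
    # The type becomes part of the query text, so only accept letters,
    # digits and underscores (an assumption made for this sketch).
    if not rel_type.replace("_", "").isalnum():
        raise ValueError("unexpected relationship type: %r" % rel_type)
    match = "MATCH (p1:Person {id:'%s'}), (p2:Person {id:'%s'})" % (from_id, to_id)
    merge = "MERGE (p1)-[:%s]->(p2);" % rel_type
    # Skip the prefix when it is empty so no leading space is produced.
    return " ".join(part for part in (prefix, match, merge) if part)


print(relationship_statement("2", "5", "WORKS_WITH"))
print(relationship_statement("2", "5", "WORKS_WITH", prefix="WITH 1 as dummy"))
```

The first call reproduces the plain statement shown as output above; the second adds the WITH 1 as dummy prefix that the formula uses.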
Then we copy each of the formulas down across all the rows we want to cover.
Having done this, we end up with two columns each containing a number of cypher statements. So then what?
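Copying a formula down the rows is the spreadsheet version of a loop. As a rough sketch, assuming the node columns were exported to CSV with a header line of id, name and gender (the export, the `NODES_CSV` sample and the second person in it are made up for illustration), the same thing in Python could look like this:

```python
import csv
import io

# Hypothetical CSV export of the node columns; "Daisy Duck" is invented
# sample data and is not taken from the original spreadsheet.
NODES_CSV = (
    "id,name,gender\n"
    "2,Donald Duck,man\n"
    "5,Daisy Duck,woman\n"
)

# Same text as the spreadsheet formula, with named placeholders per column.
PERSON_TEMPLATE = (
    "MERGE (_%(id)s:Person {id:'%(id)s', name:'%(name)s', "
    "gender:'%(gender)s'})-[:ATTENDS]->(meetup)"
)


def person_statements(csv_text):
    """Return one MERGE statement per data row, like a copied-down formula."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [PERSON_TEMPLATE % row for row in reader]


for statement in person_statements(NODES_CSV):
    print(statement)
```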
The Instructions Sheet
In the first sheet of the spreadsheet, you will find a bunch of instructions. Basically, you need to go through the following steps:
- Download and unzip Neo4j server.
- Copy/paste the Cypher statements from the top part of the Import Sheet into a text file or directly into the browser window. All these statements form a single large Cypher statement, as the browser can currently only execute single Cypher statements.
- Drag the file into the browser input area and then execute it.
- If you want to use the Neo4j-Shell for importing larger amounts of data, use the approach shown in the second tab titled “For the shell”. It uses one Cypher statement (terminated with a semicolon) per line and a begin / commit block around the statements to speed up the import with a single transaction.
- Paste all statements into a file and use bin/neo4j-shell -file import.txt, or copy and paste them directly into the browser.
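The shell variant from the steps above can also be assembled by a small script. The following is a minimal sketch, not the contents of the “For the shell” tab: the helper name `shell_script` is made up, and it assumes every statement is self-contained (like the MATCH ... MERGE relationship statements) so that each can run on its own line.

```python
def shell_script(statements):
    """Wrap Cypher statements in one begin / commit block, one per line."""
    lines = ["begin"]
    for statement in statements:
        statement = statement.strip()
        # Each statement must end with a semicolon for the shell.
        if not statement.endswith(";"):
            statement += ";"
        lines.append(statement)
    lines.append("commit")
    return "\n".join(lines) + "\n"


example = [
    "MATCH (p1:Person {id:'2'}), (p2:Person {id:'5'}) MERGE (p1)-[:WORKS_WITH]->(p2);",
]
print(shell_script(example), end="")
```

Saving the returned text as import.txt gives a file that can be passed to bin/neo4j-shell -file import.txt as described in the last step.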
And there we go: the dataset gets created, and Neo4j is ready for use. I hope this little overview was useful for you – it sure was useful for me when getting my hands dirty for the first time 🙂 …
Note: Make sure you have the newest Java running on your machine. You can download it here. (I made that mistake.)
Time to DIY! Good luck!
Cheers,