Easy import with ETL features directly in Cypher
Graphs are everywhere, but sometimes they’re buried in other systems and legacy databases. You need to extract the data, then bring it into Neo4j to experience its true graph form. To help you do this, we’ve brought bulk load functionality directly into Cypher. The new LOAD CSV clause makes that a pleasant and simple task, optimized for graphs at around the millions scale, the kind of size that folks typically encounter when getting started with Neo4j. To illustrate, consider this small set of fictional Twitter users and their followers:
user | follower |
Charlie Sheen | Morgan Freeman |
Charlie Sheen | Oliver Stone |
Oliver Stone | Charlie Sheen |
Michael Douglas | Oliver Stone |
Michael Douglas | Morgan Freeman |
Martin Sheen | Oliver Stone |
Martin Sheen | Morgan Freeman |
Martin Sheen | Charlie Sheen |
Morgan Freeman | Charlie Sheen |
Saved as a CSV file (Twitter.csv), the same data looks like this:
user,follower
Charlie Sheen,Morgan Freeman
Charlie Sheen,Oliver Stone
Oliver Stone,Charlie Sheen
Michael Douglas,Oliver Stone
Michael Douglas,Morgan Freeman
Martin Sheen,Oliver Stone
Martin Sheen,Morgan Freeman
Martin Sheen,Charlie Sheen
Morgan Freeman,Charlie Sheen
(Note that the LOAD CSV separator is strictly a comma, not a comma followed by whitespace!)
The CSV file can then in turn be loaded into the Neo4j graph like so:
LOAD CSV WITH HEADERS FROM "file:./Twitter.csv" AS csvLine
MERGE (u:Person { name: csvLine.user })
MERGE (f:Person { name: csvLine.follower })
MERGE (u)<-[:FOLLOWS]-(f);
That is, you simply point LOAD CSV to a file, then pair it with an updating clause (like CREATE or MERGE). Each row of the file is applied to the statement in turn and is available as a map (csvLine above). This in turn creates a graph of Person nodes connected by FOLLOWS relationships.
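To make the row-by-row behaviour concrete, here is a minimal Python sketch that mimics what the statement above does with the example file. It is only an illustration of the semantics, not how Neo4j implements LOAD CSV or MERGE: the function name load_follow_graph and the use of in-memory sets are assumptions made for this example, and the CSV rows are inlined so the sketch runs without a file.

```python
import csv
import io

# The same rows as the Twitter.csv example, inlined so the sketch is self-contained.
CSV_TEXT = """user,follower
Charlie Sheen,Morgan Freeman
Charlie Sheen,Oliver Stone
Oliver Stone,Charlie Sheen
Michael Douglas,Oliver Stone
Michael Douglas,Morgan Freeman
Martin Sheen,Oliver Stone
Martin Sheen,Morgan Freeman
Martin Sheen,Charlie Sheen
Morgan Freeman,Charlie Sheen
"""


def load_follow_graph(csv_text):
    """Hypothetical helper: apply each CSV row in turn, creating nothing twice."""
    people = set()   # stands in for MERGE on (:Person { name })
    follows = set()  # stands in for MERGE on the FOLLOWS relationship
    # DictReader exposes each row as a map keyed by the header line,
    # much like csvLine.user and csvLine.follower in the Cypher statement.
    for csv_line in csv.DictReader(io.StringIO(csv_text)):
        user, follower = csv_line["user"], csv_line["follower"]
        people.add(user)
        people.add(follower)
        # (u)<-[:FOLLOWS]-(f) means the follower points at the user.
        follows.add((follower, user))
    return people, follows


people, follows = load_follow_graph(CSV_TEXT)
print(len(people), "people,", len(follows), "FOLLOWS relationships")
for follower, user in sorted(follows):
    print(f"({follower})-[:FOLLOWS]->({user})")
```

Because the relationship is directed, two users who follow each other end up with two distinct FOLLOWS relationships, one in each direction, and repeating a row would add nothing new.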

Dense nodes support
Neo4j 2.1 brings together lots of great improvements into one package, and of particular interest are optimizations we’ve made around dense nodes. A dense node can occur in any domain, but it’s most easily reasoned about when you think about social graphs. For example, Britney Spears may have many millions of incoming FAN relationships, but relatively few FRIEND relationships, and fewer still familial relationships like MOTHER or COUSIN. This release marks the start of our dense node management features. It provides a transparent performance boost when accessing those relatively few relationships amongst the general mass of relationships, by storing them separately (by relationship type and direction) in the database. Now when you want to surgically pick out Britney’s friends and family, you can do so without having to sift through her fans too.
New Cypher functionality and experimental query planner
Cypher has become the primary interface for much of the work that we do in the graph. The Cypher team has been tremendously productive during this release period, adding both new user-facing features (like LOAD CSV, which we saw above) and internal improvements. New to Cypher in Neo4j 2.1 is the UNWIND clause, which converts collections into row data, as exemplified by Mark Needham in this posting.
Under the covers, things are even more interesting. There’s a new experimental Cypher optimizer, invoked by specifying “CYPHER 2.1.EXPERIMENTAL” at the start of your Cypher query. For some queries this can provide a substantial boost in performance, while others may run more slowly, so use it with care and measure your performance. Please give us your feedback if you try it out!
Other goodies
Other notable improvements included in Neo4j 2.1:
- A new lock manager in Neo4j Enterprise Edition that improves performance on many-core computers
- Official support for OpenJDK 7, adding to the ongoing support for Oracle Java 7
Available Now!
Let the fun begin:
- Download Neo4j 2.1
- But first, check out the upgrade guide if moving from Neo4j 1.9
- Join the conversation to let us know your experiences
About the Author
Andreas Kollegger, Senior Product Designer

Andreas Kollegger is a technological humanist. Starting at NASA, Andreas designed systems from scratch to support science missions. Then in Zambia, he built medical informatics systems that complement technology with social good. Now with Neo4j, he is democratizing graph databases which elevate understanding by emphasizing relationships.
Andreas joined Neo4j as an early member of core engineering. He has now taken on the role of Senior Product Designer, crafting a developer experience that balances simplicity with power.
4 Comments
[…] For a detailed introduction and descriptions of more new features, see the official release notes. […]
[…] Neo4j 2.1 Released – Introduces .csv file importing, an experimental query planner and support for OpenJDK 7. […]
Can you explain:
why we have only one “link” between Charlie Sheen and Oliver Stone?
the direction of the arrows to Martin Sheen?
In this example, a social network like Twitter allows you to follow and be followed. In Neo4j, you can have more than one relationship between two nodes. If two users follow each other, there are two different relationships, one directed each way between the two nodes. In a social network like Facebook, where you have friend relationships, a single relationship in either direction is sufficient, since in Cypher you can query bi-directionally using (user1:User)-[:FOLLOWS]-(user2:User).
[…] building up the Neo4j World Cup Graph I’ve been making use of the LOAD CSV function and I frequently found myself needing to do different things depending on the value in one […]