Recap: Intro to Graph Databases | Webinar Series #1

July 15, 2011

Thanks to everyone who attended our Intro to Graph Databases webinar on Wednesday, July 13. We had a great turnout and LOADS of fantastic questions!

Below are answers to all questions posed during the webinar. For any other questions, be sure to refer to our user list. We have an incredible community that answers your questions ridiculously fast, no matter what time zone.

Can keys/values be first class objects? -@PatrickDurusau

No. Keys are just strings, and values can be primitives, strings, or arrays of either. At your domain level you can easily add methods that convert arbitrary objects into these primitives or into graph structures. Support for JSON types is planned.
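As a rough illustration of such a domain-level conversion, here is a minimal sketch in plain Java. The `Person` class and its `toProperties` and `fromProperties` methods are hypothetical names invented for this example and are not part of Neo4j; the sketch only shows an object being flattened into string keys with primitive, string, and array values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain class, invented for illustration; not part of any Neo4j API.
final class Person {
    final String name;
    final int age;
    final String[] nicknames;

    Person(String name, int age, String... nicknames) {
        this.name = name;
        this.age = age;
        this.nicknames = nicknames;
    }

    // Flatten the object into string keys whose values are a primitive (boxed),
    // a String, or an array of Strings.
    Map<String, Object> toProperties() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("name", name);
        props.put("age", age);
        props.put("nicknames", nicknames.clone());
        return props;
    }

    // Rebuild the domain object from the flat property map.
    static Person fromProperties(Map<String, Object> props) {
        return new Person(
                (String) props.get("name"),
                (Integer) props.get("age"),
                (String[]) props.get("nicknames"));
    }

    public static void main(String[] args) {
        Map<String, Object> props = new Person("Thomas Anderson", 29, "Neo").toProperties();
        System.out.println(props.keySet());
    }
}
```

Each entry of the resulting map could then be stored as a property on a node, and nested objects that do not fit into primitives could instead become nodes of their own connected by relationships.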

What is the best Clojure library for Neo4J? Some of them seem quite old, is that because this is a “solved problem” or nobody cares…. -DJ

See this Stack Overflow discussion: https://stackoverflow.com/questions/5680976/is-neo4j-a-good-fit-for-clojure

How does CAP theorem apply to neo4j? -MK

Neo4j is not partition tolerant. Automatic, domain-agnostic graph sharding is still a problem that we have to solve in order to scale out. (See also the thesis of Alex Averbuch: https://alexaverbuch.blogspot.com/2010/04/me-my-names-alex-im-currently.html)

Is Facebook using neo4j for their social graph? -Anonymous

No, but they should 🙂

Can there be multiple relationships between nodes? -Anonymous

Yes, as many as you’d like. For most domains one relationship is enough, as it can be traversed in both directions.

So, can Neo both know and love Trinity? -Anonymous

Yes.

What does the 4j in neo4j stand for? -Anonymous

“For Java”: Neo4j is written in Java and provides a native API for the JVM. Other languages can access Neo4j databases via a RESTful server protocol. There are lots of language bindings for the Neo4j graph database (see: https://wiki.neo4j.org/content/Main_Page#Language_and_framework_bindings).

Shouldn’t it be “At depth %d => %s\n” in the example? ‘np’ just a typo in the slides, I guess. -DM

That would only have worked on Unix, since “\n” is the character code for linefeed, whereas Windows uses carriage return plus linefeed for line endings. The format string “%n” expands to the appropriate line ending for the current platform. Check out https://download.oracle.com/javase/6/docs/api/java/util/Formatter.html#syntax in the table called “Conversions” for more details.
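To make the difference concrete, here is a small sketch that formats the same message both ways. The class and method names are made up for this example; only `String.format` and the `%n` conversion come from the Java standard library.

```java
final class LineEndingDemo {
    // "%n" is replaced by the platform's line separator when the string is formatted.
    static String withPlatformEnding(int depth, String name) {
        return String.format("At depth %d => %s%n", depth, name);
    }

    // "\n" is always a single linefeed character, whatever the platform.
    static String withLinefeed(int depth, String name) {
        return String.format("At depth %d => %s\n", depth, name);
    }

    public static void main(String[] args) {
        System.out.print(withPlatformEnding(1, "Trinity"));
        System.out.print(withLinefeed(2, "Morpheus"));
    }
}
```

On Unix-like systems both methods end the line identically; on Windows only the first one produces the carriage return plus linefeed pair.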

Is there a performance penalty if there is a large number of nodes linked to a single node or does this not matter? -DM

Yes, the number of connections of a node matters. Nodes with very many connections (more than 100,000) are sometimes called supernodes. The cost shows up mostly when they are first loaded into the cache (cold caches). The Neo4j team is currently working on improving that aspect.

What if Trinity was in love with Morpheus? Will Trinity’s node be returned because she only has the outgoing “LOVES” relationship? There was no check if the node the relationship points to is the start node. -BP

Yes. Any node in the traversed graph (reachable through KNOWS relationships from the start node) that has an outgoing LOVES relationship would be returned.
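The behavior described in this answer can be mimicked with a toy in-memory graph. This is a sketch of the semantics only, not the Neo4j traversal API: the class and its methods are invented for illustration, and it assumes that KNOWS relationships are followed in the outgoing direction only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy graph that only imitates the traversal semantics; it is not the Neo4j API.
final class LovesTraversalSketch {
    private final Map<String, List<String>> knows = new HashMap<>();
    private final Map<String, List<String>> loves = new HashMap<>();

    void addKnows(String from, String to) {
        knows.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    void addLoves(String from, String to) {
        loves.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first walk along outgoing KNOWS edges (an assumption of this sketch).
    // A visited node is returned if it has any outgoing LOVES edge; the target
    // of that edge is never inspected.
    List<String> inLove(String start) {
        List<String> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (loves.containsKey(node)) {
                result.add(node);
            }
            for (String friend : knows.getOrDefault(node, Collections.<String>emptyList())) {
                if (visited.add(friend)) {
                    queue.add(friend);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LovesTraversalSketch graph = new LovesTraversalSketch();
        graph.addKnows("Neo", "Trinity");
        graph.addKnows("Neo", "Morpheus");
        graph.addLoves("Trinity", "Morpheus"); // the hypothetical from the question
        System.out.println(graph.inLove("Neo"));
    }
}
```

Because the check looks only at whether a LOVES relationship exists, Trinity is returned even though her relationship points at Morpheus rather than at the start node. Restricting the result to nodes that love the start node would require an extra comparison against the relationship's end node.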

If possible, can you please take one simple example and explain the difference in representation between connected database and Graph database. -SR??

What is a connected database?

Is the Cypher query language implemented in the programming language like LINQ or is it String-based? -US

Cypher is implemented in Scala using Scala’s parser-combinator library. The parser is string-based, but it renders to an object-based expression tree, which is then evaluated using graph matching and several filtering and aggregation steps.

Are there other Traversers that return a set of Relations rather than a set of Nodes? -US

Yes, by using our more advanced traversal framework. See our documentation about this traversal framework and its JavaDoc API documentation.

Seems you have reinvented the hierarchical data model that was used in the late 60’s, 70’s and early 80’s and was then replaced by the relational model. –AG

Good point. Graph databases such as Neo4j have a lot in common with the navigational databases of old. Even back then the navigational databases often outperformed relational databases. There are thus two questions that you ought to ask: why did the relational databases take over the market? and what has changed since then?

The navigational databases back then were a pain to work with. They lacked good abstractions for working with the data and required trained specialists to operate. Relational databases took over the market because of their structured query language, which meant that any developer could use them.

A lot has changed since that time. The biggest game changer has been the advent of object-oriented programming. Objects give us the abstraction we need to make graph databases accessible to anyone as a pattern to work with. We are of course working on making Neo4j even more user friendly, with efforts such as Cypher. Having the foundation of a clear object model for the graph makes such efforts possible, and makes navigational databases interesting again.

Is the traversal asynchronous? -NV

Right now traversals are synchronous, and so far they have been fast enough. But we’ve been thinking about providing parallel traversals.

How do I get the node that I want to work with from the Graph DB ? I mean how do I get i.e. Mr Andersson? -RE

You can look up the start nodes for your traversal using the integrated indexing framework (e.g. by name). Alternatively, you can create “category” nodes that aggregate nodes of a certain type, tag, or category, link the category nodes to the root node, and start your traversal from there.
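The two lookup strategies can be pictured with a small in-memory sketch. Everything here is hypothetical and deliberately simplified: plain maps stand in for the index and for the category nodes, and none of the names belong to the Neo4j indexing API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the two ways of finding a start node; not the Neo4j API.
final class StartNodeLookupSketch {
    // Strategy 1: an index from a property value (here the name) to node ids.
    private final Map<String, List<Long>> nameIndex = new HashMap<>();
    // Strategy 2: "category" nodes, modeled as a category name mapped to its member ids.
    private final Map<String, List<Long>> categoryMembers = new HashMap<>();

    void addNode(long id, String name, String category) {
        nameIndex.computeIfAbsent(name, k -> new ArrayList<>()).add(id);
        categoryMembers.computeIfAbsent(category, k -> new ArrayList<>()).add(id);
    }

    // Index lookup: jump straight to the nodes with the given name.
    List<Long> findByName(String name) {
        return nameIndex.getOrDefault(name, Collections.<Long>emptyList());
    }

    // Category lookup: start at the category and enumerate its members.
    List<Long> membersOf(String category) {
        return categoryMembers.getOrDefault(category, Collections.<Long>emptyList());
    }

    public static void main(String[] args) {
        StartNodeLookupSketch db = new StartNodeLookupSketch();
        db.addNode(1, "Mr Andersson", "Person");
        db.addNode(2, "Trinity", "Person");
        System.out.println(db.findByName("Mr Andersson"));
        System.out.println(db.membersOf("Person"));
    }
}
```

The index suits lookups by a known property value, while category nodes keep the grouping inside the graph itself so that it can be reached by an ordinary traversal from the root node.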

How can you deal with versions of graphs? Or, just versions of key-value properties, for example, if you wanted to keep the date a relationship or a specific property on a node changed while keep the old value (the history, or last version)?

There are several ways of dealing with that, for example by cloning parts of the graph or by putting the historical values into array-based properties. One of our engineers created a proof of concept for that (https://github.com/dmontag/neo4j-versioning). You could also look into the theory behind Clojure’s persistent data structures to build a versioning approach that covers not just properties but also relationships or whole subgraphs.
