The current database debate and graph databases

Over the past month or so there has been a lot of new energy in the database debate. People are questioning the RDBMS to an extent that we’re not used to. Tony Bain asks: Is the Relational Database Doomed? He talks about the simplicity, robustness, flexibility, performance, scalability, and compatibility of databases, areas in which the RDBMS has been “good enough” in most cases. He goes on to say:

Today, we are in a slightly different situation. For an increasing number of applications, one of these benefits is becoming more and more critical; and while still considered a niche, it is rapidly becoming mainstream, so much so that for an increasing number of database users this requirement is beginning to eclipse others in importance. That benefit is scalability.

He goes on to describe alternatives to the traditional RDBMS, foremost giving an overview of key/value databases.

Robin Bloor commented on the article and, among other things, identified three flaws in the relational data model:

  1. it has problems representing common data structures like ordered lists, hierarchies, trees or web page content
  2. it isn’t helpful when the data model evolves over time
  3. its access semantics are limited to those that follow from storing items as rows in tables
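The first flaw is easy to see in practice: a relational table typically stores a tree as rows with a parent reference, so walking the hierarchy takes one self-join (or one extra query) per level. Here is a minimal Python sketch of that row-based approach; the table layout and column names are purely illustrative, not from any of the articles discussed above.

```python
# A "categories" table flattened to rows: (id, parent_id, name).
# Finding all descendants of a node requires repeated passes over
# the rows -- one per level -- mirroring repeated SQL self-joins.
rows = [
    (1, None, "root"),
    (2, 1, "books"),
    (3, 1, "music"),
    (4, 2, "databases"),
    (5, 4, "graph databases"),
]

def descendants(rows, node_id):
    """Collect all descendant ids of node_id, level by level."""
    found, frontier = [], {node_id}
    while frontier:
        # Each iteration scans the whole table again,
        # just like an extra self-join per tree level.
        level = {rid for rid, parent, _ in rows if parent in frontier}
        found.extend(sorted(level))
        frontier = level
    return found

print(descendants(rows, 1))  # [2, 3, 4, 5]
```

In a graph database the parent/child links are stored as direct references, so the traversal follows pointers instead of re-scanning rows at every level.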

What about the alternatives to the RDBMS? Most of the buzz is around key/value stores and schema-less databases. Richard Jones wrote the article Anti-RDBMS: A list of distributed key-value stores, listing the most important projects. Schema-less designs can look very different, and the solution FriendFeed uses may not be the most common way to implement one.

What, then, is the most important difference between key/value stores and a graph database? Key/value stores lack built-in support for representing relationships between entities, and capturing such relationships is fundamental to data modeling. Since Neo4j supports arbitrary key/value pairs on nodes and relationships, it can just as well be viewed as a key/value store with full built-in support for relationships.
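To make the distinction concrete, here is a toy in-memory sketch (deliberately not the Neo4j API) of the two models: in a plain key/value store, relationships must be encoded by hand inside the values, while in a property graph both nodes and relationships are first-class records that carry key/value pairs. All names and data below are made up for illustration.

```python
# Plain key/value store: relationships are encoded ad hoc inside
# the values, e.g. as lists of foreign keys the application must
# know how to interpret.
kv = {
    "user:alice": {"name": "Alice", "knows": ["user:bob"]},
    "user:bob": {"name": "Bob"},
}

# Property-graph model: nodes AND relationships are first-class,
# and both can carry arbitrary key/value properties. This mirrors
# the model Neo4j exposes, not its actual API.
nodes = {
    "alice": {"name": "Alice"},
    "bob": {"name": "Bob"},
}
relationships = [
    # (start, type, end, properties)
    ("alice", "KNOWS", "bob", {"since": 2008}),
]

def outgoing(node, rel_type):
    """Follow typed relationships from a node directly."""
    return [(end, props) for start, t, end, props in relationships
            if start == node and t == rel_type]

print(outgoing("alice", "KNOWS"))  # [('bob', {'since': 2008})]
```

The point of the sketch is that the graph model gives relationships their own identity, direction, type, and properties, rather than leaving them as a convention buried in opaque values.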

Because of the advantages of using a graph database and the lack of products available in the market today, companies tend to build their own graph databases as the underpinning of highly scalable applications. Recently Scott Wheeler of Directed Edge wrote about their in-house database solution: On Building a Stupidly Fast Graph Database. In the associated Hacker News discussion he also mentioned that they tried Neo4j but were disappointed with the performance. This puzzles us, since Neo4j is used in production with far larger amounts of data than what’s mentioned in the discussion.

In summary, there’s an increasing awareness that the 30+ year old relational model may not be optimal for a lot of use cases that we as developers encounter today. It’s great to see that the community is seriously starting to tackle the challenges of efficiently handling information in the 21st century!




Anonymous says:

There is one novel and rather unusual approach where partially ordered sets are used instead of graphs. What is attractive is that it is claimed to be joinless and integrated with programming (like LINQ?). Yet, it is not clear if it is something really new or simply a number of old

leandro says:

This is stupid. People condemn the relational model for SQL limitations, but SQL is not relational at all. Much better than reinventing the wheel would be to implement truly relational database management systems (http://thethirdmanifesto/).

Ben says:

"People are questioning the RDBMS to an extent that we’re not used to."

I commented on that article on my blog, so I won’t repeat my arguments here. After debating with a commenter, I’m inclined to think that this "new energy" is another round of inventors trying to

Richard says:

One of the other big issues is that a lot of database work ends up as follows…

1) Using JPA (or whatever), serialize and de-serialize a lot of object graphs. This happens a lot interactively.

2) Occasionally (less often than (1) but still fairly often), create broad-based sweeping queries against the data represented by those object graphs.

RDBMS systems can handle (

Phil Goetz says:

I was excited to hear about Neo4j. I was disappointed to see how it’s implemented.

It’s a vast improvement over relational DBs. You should write examples for the SQL people who can’t see that. For example, here’s an SQL query in a program I work on:

SELECT r.role_id FROM omnium..ident_names i, omnium..role_link r, omnium..asm_feature f, omnium..db_data d

@Phil Goetz

Thanks for your input!

Neo4j was designed to provide a simple, flexible and performant way to persist directed labeled graphs. Other representations can then easily be built on top of Neo4j, and as different use cases have different needs, it wouldn’t be a good design choice to try to put everything into the core Neo4j engine.

At the moment Neo4j is

Puneet says:

While you are correct in your assessment that key/value stores don’t have any relationships between the key/value pairs, you should also mention that the main draw of key/value stores is their distributability. I just started reading about Neo4j, so I don’t know how ready it is to run on data spread over hundreds of different machines or to be run in an online environment.

At least some of the things that Phil Goetz misses in Neo4j are, as far as I can see, to be found in Topic Maps. Check out e.g. the implementation by Ontopia. The query language that is being developed for Topic Maps is also based on Prolog.

Anonymous says:

I’m interested to hear whether the problems Scott Wheeler (of Directed Edge) had with Neo4j performance have been solved?

@Anonymous: At the time Scott Wheeler tried out Neo4j, it didn’t have a batch inserter, so inserting large datasets wasn’t nearly as fast as it is now. I guess this solves at least part of their problem. As we never got any detailed information regarding the problem, it’s hard to be more specific.
