Excerpt from “Graph databases are hot, but can they break relational’s grip?” published in Silicon Angle by Paul Gillin

Comcast Corp. is working on ways to better understand its customers’ families. The company plans to roll out features that enable parents to manage the devices their children use at a fine level of granularity. For example, account holders will be able to pause internet access at dinner time or know precisely when their kids are online, said Mark Hashimoto, director of engineering for the internet of things in Comcast’s Silicon Valley Innovation Center.

But creating profiles that combine that level of detailed information with the flexibility to accommodate constant change isn’t simple. “If you want to put parental controls on what your children can do with their iPads, we first have to know that you have children, and then which iPad belongs to which child,” Hashimoto said. If multiple account holders are involved, things get even trickier. “Maybe the wife wants notifications in Spanish and the husband wants them in English. We also can’t predict what the customer will want in two years.”

Comcast evaluated a variety of relational, NoSQL and graph databases, looking for one that could closely mimic the relationships that people have and how they view the world. "When we saw what we could do with graph, there was no looking back," he said. Comcast chose Neo4j Inc.’s namesake graph database as its profile engine. "We found it to be an intuitive way to model relationships among people," Hashimoto said.

Momentum building

Graph databases are suddenly hot. Amazon Web Services Inc.’s announcement this week of Neptune, a graph database in the cloud, is the latest in a series of recent indications that this once-niche technology is edging toward the mainstream of enterprise information technology.

In September, startup TigerGraph Inc. released a high-speed native parallel graph database platform after raising $31 million in a series A funding round. At about the same time, enterprise software vendor Callidus Software Inc. acquired OrientDB Ltd., creator of an open-source NoSQL database that supports graph and other models. In October, Neo4j overhauled its flagship product with features aimed at making graphs more accessible to business users. And early this year, Microsoft released the fruits of a four-year-long graph database development project to open source.

Graph databases are finding favor for their ability to represent complex relationships and to navigate rapidly between elements in the database to discover correlations. Forrester Research Inc. analyst Noel Yuhanna said they’re perfect for answering questions like "How many school friends who are not yet connected to me live in Europe and are already connected to five of my closest friends?" or "Is there a dentist in my area whom at least one of my friends visits?"

Answering those questions with relational tables requires performing multiple joins, each of which consumes more memory as intermediate result sets are built. “As you get four, five, six hops into the query, the sets become far too large” and performance tanks, said Jim Webber, chief scientist at Neo4j Inc., whose seven-year-old product is considered the current market leader.
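To make the contrast concrete, here is a minimal sketch of the dentist question answered by walking relationships hop by hop rather than joining tables. It is an illustration only, not how Neo4j or any other product works internally: the toy data, the relationship names (FRIEND, VISITS_DENTIST, LOCATED_IN, LIVES_IN) and the helper functions are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical toy data: (source, relationship, target) edges.
EDGES = [
    ("me", "FRIEND", "ana"),
    ("me", "FRIEND", "ben"),
    ("me", "LIVES_IN", "springfield"),
    ("ana", "VISITS_DENTIST", "dr_lee"),
    ("ben", "VISITS_DENTIST", "dr_kim"),
    ("dr_lee", "LOCATED_IN", "springfield"),
    ("dr_kim", "LOCATED_IN", "shelbyville"),
]


def build_graph(edges):
    """Index edges as node -> relationship -> set of neighboring nodes."""
    graph = defaultdict(lambda: defaultdict(set))
    for source, relationship, target in edges:
        graph[source][relationship].add(target)
    return graph


def neighbors(graph, node, relationship):
    """Return the nodes reachable from `node` over one relationship type."""
    return graph.get(node, {}).get(relationship, set())


def dentists_friends_visit_nearby(graph, person):
    """Find dentists in the person's area that at least one friend visits."""
    my_areas = neighbors(graph, person, "LIVES_IN")
    found = set()
    # Hop 1: person -> friends. Hop 2: friend -> dentists. Hop 3: dentist -> area.
    for friend in neighbors(graph, person, "FRIEND"):
        for dentist in neighbors(graph, friend, "VISITS_DENTIST"):
            if neighbors(graph, dentist, "LOCATED_IN") & my_areas:
                found.add(dentist)
    return found


GRAPH = build_graph(EDGES)
print(sorted(dentists_friends_visit_nearby(GRAPH, "me")))  # prints ['dr_lee']
```

Each hop touches only the neighbors of the nodes already reached, so the work tracks the size of the neighborhood being explored rather than the size of whole tables. A relational version of the same question would typically join a friendships table, a visits table and a locations table.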

For applications with large amounts of uniform data and densely populated tables, relational databases perform well and are thoroughly understood, he said. However, “Most of the applications I’ve come across could have been better done in a graph.”
