Graphs and Search: a bit of history
Web search and graphs have a long history. Throughout most of the 1990s, the technology behind web search was based on “atomic data”: it indexed each page and ranked it in isolation, based solely on its contents, and without any reference to other pages. But in 1999, a small startup called Google adopted a new graph-centric approach, invented by co-founder Larry Page, called PageRank. PageRank changed the fundamentals of web search, and catapulted Google ahead of its competitors, who to this day have not caught up. What was novel about this new algorithm is that instead of ranking pages in isolation, it achieved markedly better results by taking into account how the pages are connected.
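To make the contrast with “atomic” ranking concrete, here is a minimal sketch of the textbook PageRank power iteration. It is not Google’s production algorithm; the damping factor of 0.85, the iteration count, and the tiny `links` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages from link structure alone, using simple power iteration.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: nothing links to "orphan".
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Note that the page contents never enter the calculation: a page’s score depends only on which pages link to it, and on how well ranked those pages are.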
Connected Data as a New Source of Insight
In his keynote at last year’s GraphConnect Conference in San Francisco, social researcher James Fowler (author of the book “Connected”) shared his latest research findings, which indicate that one can learn more about someone by knowing how they interact with the people and things around them than by just learning discrete facts about that person. The difference between the insights gained from atomic data and the intelligence that can be discovered from connected data is vast, and it calls for specialized technologies that are designed to exploit connectedness.
How Does Graph Search Work?
Graphs are inherently visual. It’s not so difficult to understand how the technology works, even if you’re not that technical. Let’s take one of Facebook’s example Graph Search queries, which is to find all of the Sushi restaurants in New York that my friends like. The underlying graph consists of people, restaurants, locations, and cuisines, connected by relationships such as IS_FRIEND_OF, LIKES, LOCATED_IN, and SERVES. The data stored inside the graph database looks exactly like a drawing of that graph.
Getting the answer is a very simple matter for a graph database. You just need to formulate the question in a way that the database understands. Those who are more technically inclined can see the query that answers the question here:
    START me=node:person(name = 'Philip'),
          location=node:location(location='New York'),
          cuisine=node:cuisine(cuisine='Sushi')
    MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)
          -[:LOCATED_IN]->(location),
          (restaurant)-[:SERVES]->(cuisine)
    RETURN restaurant
Cypher Query Language Example: Sushi restaurants in New York that my friends like
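For readers who prefer to see the traversal spelled out, the following is a rough in-memory sketch of what the pattern in the query asks for. It is not how Neo4j executes Cypher. The relationship names come from the query; the people, restaurants, and helper functions are hypothetical sample data invented for this illustration.

```python
# Each edge is a (source, relationship, target) triple. Sample data is made up.
EDGES = [
    ("Philip", "IS_FRIEND_OF", "Emil"),
    ("Philip", "IS_FRIEND_OF", "Andreas"),
    ("Emil", "LIKES", "Sakura"),
    ("Emil", "LIKES", "Bay Sushi"),
    ("Andreas", "LIKES", "Umi"),
    ("Andreas", "LIKES", "Trattoria"),
    ("Sakura", "LOCATED_IN", "New York"),
    ("Umi", "LOCATED_IN", "New York"),
    ("Trattoria", "LOCATED_IN", "New York"),
    ("Bay Sushi", "LOCATED_IN", "San Francisco"),
    ("Sakura", "SERVES", "Sushi"),
    ("Umi", "SERVES", "Sushi"),
    ("Bay Sushi", "SERVES", "Sushi"),
    ("Trattoria", "SERVES", "Italian"),
]


def build_index(edges):
    """Index edges by (source, relationship) so each hop is a dictionary lookup."""
    index = {}
    for source, relationship, target in edges:
        index.setdefault((source, relationship), set()).add(target)
    return index


def restaurants_friends_like(index, person, city, cuisine):
    """Follow person -> friend -> restaurant, keeping matches on city and cuisine."""
    results = set()
    for friend in index.get((person, "IS_FRIEND_OF"), set()):
        for restaurant in index.get((friend, "LIKES"), set()):
            in_city = city in index.get((restaurant, "LOCATED_IN"), set())
            serves = cuisine in index.get((restaurant, "SERVES"), set())
            if in_city and serves:
                results.add(restaurant)
    return sorted(results)


index = build_index(EDGES)
print(restaurants_friends_like(index, "Philip", "New York", "Sushi"))
# prints ['Sakura', 'Umi']
```

The point of the sketch is that the answer is found by walking outward from one starting node along its relationships, rather than by scanning and joining whole tables.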
Other Applications for Graphs
Thinking in graphs is natural, and contagious. The more you think in terms of connections, the more you realize that graphs are the way that we implicitly think. What is a decision tree, for example, but a graph of possibilities? The more you look, the more you start to notice that graphs are, in fact, everywhere.
Graph database users regularly use queries like the one above to answer questions, and the more you ask, the more you think of new questions that never occurred to you to ask previously. Graph queries can get quite elaborate, and it’s entirely possible to run queries that scan within a social network that is two, three, or more levels of friends apart.
Opportunities for leveraging connected data extend far beyond social and search. The pattern that applies to Graph Search is also applicable to bioinformatics, fraud detection, network management, logistics, and a variety of other use cases. Neo Technology has customers in all these areas (and more!) using the Neo4j graph database to achieve new and higher levels of insight.
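The multi-level friend queries mentioned above (friends, friends of friends, and so on) can be sketched as a depth-limited breadth-first search. This is a minimal illustration rather than anything taken from Neo4j; the `friends_within` function and the sample `friendships` data are assumptions made for the example.

```python
from collections import deque


def friends_within(friendships, start, max_depth):
    """Return {person: hops} for everyone reachable within max_depth hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_depth:
            # Do not expand beyond the requested number of levels.
            continue
        for friend in friendships.get(person, ()):
            if friend not in distance:
                # First visit is the shortest path, because BFS goes level by level.
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # The starting person is not their own friend.
    return distance


# Hypothetical friendship graph, stored as person -> list of friends.
friendships = {
    "Philip": ["Emil", "Andreas"],
    "Emil": ["Peter"],
    "Andreas": ["Peter", "Jim"],
    "Peter": ["Ian"],
}

print(friends_within(friendships, "Philip", 2))
```

Raising `max_depth` from 2 to 3 widens the search by one more level of friends, which is the same idea as lengthening the relationship pattern in a graph query.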
I’m not Facebook… How can I get this?
Technology giants such as Facebook, Google, and Twitter have all built graph technologies from the ground up to differentiate and grow their business. Building and maintaining one’s own database management system, however, is not a practical solution if you’re not Facebook. The good news is that companies wanting functionality like Graph Search are a click away from getting the tools they need to build it.
At its core, Graph Search is a database. Unlike a decade ago, one can now find commercial off-the-shelf graph databases that are proven and robust, and built from the ground up to support connected data. Neo4j is the most widely used graph database today. Companies have adopted it because it’s 1000 times faster than relational databases for working with connected data, and much easier to work with than shoehorning graphs into tables.
Neo4j is freely available as open source software, with a Community Edition available under the same open source license as MySQL, and an Enterprise Edition. Commercial subscriptions are available from Neo4j creator and sponsor Neo Technology. Commercial users include Cisco, Adobe, Deutsche Telekom, Accenture, and many more, as well as lots of startups, including FiftyThree (makers of Paper, winner of Apple’s 2012 iPad App of the Year), Seth Godin’s Squidoo, and Justdial (one of India’s most talked-about startups).
As we move into an era where more and more companies are benefiting from understanding connected data, having the right tools available to anyone means that no one needs to get left behind. Neo4j is available for download today. Give it a try, or check out the interactive Cypher web console to try out the Cypher graph query language immediately from your web browser.
Emil Eifrem and Philip Rathle, co-authors