Tutorial: Build a Cypher Recommendation Engine
This guide shows how to use the relationships in your data to gather insights and to recommend new connections between entities that are not yet directly related, based on the other relationships and the surrounding network in the graph.
You should have a basic understanding of the property graph model. Having Neo4j Desktop downloaded and installed will allow you to code along with the examples.
This tutorial explains how to use a basic dataset of Actors acting in Movies in a graph database to show recommendations for other actors to work with or similar movies to watch.
By following the meaningful relationships between the people and movies, you can determine occurrences of actors working together, the frequency of actors working with one another, and the movies they have in common in the graph. This structure forms the basis for many recommendation engines.
Once you have a running instance of Neo4j, populate the movie dataset by typing :play movie graph into the Neo4j Browser command line and clicking the play button.
Go to the second slide (using the right arrow) of the pane that appears below the command line, click the query, and run it.
You should see a result pane in Neo4j like the one below.
This data has been created and stored in the database so we can query it. The next section will show you how to write some queries to explore the data you just created.
Before we start recommending things, we need to find out what is interesting in our data to see what kinds of things we can and want to recommend. To start, let us run a query like this to find a single actor like Tom Hanks.
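A minimal sketch of such a lookup follows. It assumes the Movie Graph dataset loaded above labels people as Person and stores their names in a name property; adjust these if your data differs.

```cypher
// Find the single Person node whose name property is 'Tom Hanks'.
// Assumption: the Person label and name property come from the Movie Graph dataset.
MATCH (tom:Person {name: 'Tom Hanks'})
RETURN tom
```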
Now that we have found an actor we are interested in, we can retrieve all his movies by starting from the Tom Hanks node and following the relationships that connect him to the movies he acted in.
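One way to write this traversal is sketched below. It assumes the acting relationship in the Movie Graph dataset is typed ACTED_IN and that movies carry the Movie label.

```cypher
// Start at Tom Hanks and follow outgoing ACTED_IN relationships to his movies.
// Assumption: ACTED_IN and Movie are the names used by the Movie Graph dataset.
MATCH (tom:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(movie:Movie)
RETURN tom, r, movie
```

Returning the relationship variable r along with both nodes lets Neo4j Browser draw the result as a graph rather than a table.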
Your results should look like a graph.
Of course, Tom has colleagues who acted with him in his movies. A statement to find Tom’s co-actors looks like this:
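A sketch of the co-actor query, under the same assumptions about the Person and Movie labels and the ACTED_IN relationship type:

```cypher
// Go from Tom to each of his movies, then back out along ACTED_IN
// to every other person who acted in the same movie.
MATCH (tom:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
RETURN coActor.name
```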
We can now turn the co-actor query above into a recommendation query by following those relationships another step out to find the “co-co-actors”, i.e. the second-degree actors in Tom’s network. This will show us all the actors Tom may not have worked with yet, and we can specify a criterion to be sure he hasn’t directly acted with that person.
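The second-degree query might look like the sketch below. The variable names (movie1, movie2, coCoActor) are illustrative, and the labels and relationship type are again assumed from the Movie Graph dataset.

```cypher
// Two hops through shared movies: Tom -> movie1 <- coActor -> movie2 <- coCoActor.
MATCH (tom:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(movie1:Movie)<-[:ACTED_IN]-(coActor:Person)-[:ACTED_IN]->(movie2:Movie)<-[:ACTED_IN]-(coCoActor:Person)
// Exclude Tom himself and anyone he has already shared a movie with.
WHERE tom <> coCoActor
  AND NOT (tom)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor)
RETURN coCoActor.name
```

The pattern inside the NOT clause acts as the filter that keeps only people with no direct co-acting link to Tom.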
You probably noticed that a few names appear multiple times. This is because there are multiple paths to follow from Tom Hanks to these actors.
To see which co-co-actors appear most often in Tom’s network, we can take frequency of occurrences into account by counting the number of paths between Tom Hanks and each coCoActor and ordering them by highest to lowest value.
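A sketch of the frequency-ranked version is shown here. The alias frequency and the LIMIT of 5 are arbitrary choices for illustration.

```cypher
MATCH (tom:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom <> coCoActor
  AND NOT (tom)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coCoActor)
// count(*) counts the matched paths per coCoActor, because the name is the grouping key.
RETURN coCoActor.name, count(*) AS frequency
ORDER BY frequency DESC
LIMIT 5
```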
One of those “co-co-actors” is Tom Cruise. Now let’s see which movies and actors are between the two Toms so we can find out who can introduce them.
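A sketch of a query that returns the connecting movies and actors, with both end points fixed by name (same dataset assumptions as before):

```cypher
// Anchor the pattern at both ends and return everything in between.
MATCH (tom:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(movie1:Movie)<-[:ACTED_IN]-(coActor:Person)-[:ACTED_IN]->(movie2:Movie)<-[:ACTED_IN]-(cruise:Person {name: 'Tom Cruise'})
// Keep the result only if the two have not acted together directly.
WHERE NOT (tom)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(cruise)
RETURN tom, movie1, coActor, movie2, cruise
```

Each returned coActor is a person who has worked with both Toms and could make the introduction.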
As you can see, this returns multiple paths. If you have ever played the six degrees of Kevin Bacon game, this concept of seeing how many hops exist between people is exactly what graphs depict. You will notice that our results even return a path with Kevin Bacon himself.
With these two simple Cypher statements, we have already created two recommendation algorithms: whom to meet or work with next, and how to meet them.
You could apply the same ideas you learned here to many other uses, such as recommending products and services, finding restaurants or activities you might like, or connecting with colleagues who share similar interests or skills. We will mention a few specifically here with resources you can use to find more information.
We have a graph of a few friends with their favorite restaurants, cuisines, and locations.
A practical question to answer here, formulated as a graph search, is:
How could we translate that into the appropriate Cypher statement?
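As a rough sketch only: suppose the question is which restaurants serving a given cuisine in a given location are liked by a person's friends. Every label, relationship type, and property below (IS_FRIEND_OF, LIKES, LOCATED_IN, SERVES, Restaurant, Location, Cuisine, and the three query parameters) is hypothetical and would need to match your own data model.

```cypher
// Hypothetical schema: names here are illustrative assumptions, not a fixed model.
// $personName, $city and $cuisine are query parameters supplied by the caller.
MATCH (person:Person {name: $personName})-[:IS_FRIEND_OF]->(friend:Person),
      (friend)-[:LIKES]->(restaurant:Restaurant),
      (restaurant)-[:LOCATED_IN]->(:Location {name: $city}),
      (restaurant)-[:SERVES]->(:Cuisine {name: $cuisine})
// Rank restaurants by how many friends like them.
RETURN restaurant.name, count(*) AS occurrence
ORDER BY occurrence DESC
LIMIT 5
```

In Neo4j Browser, parameters can be set beforehand with the :param command.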
Other factors that can be easily integrated in this query are favorites, allergies, ratings, and distance from my current position.