Building a Recommendation Engine with Cypher

This guide explains how to use the connections in your data to gain insights and to recommend relevant items that are not yet directly related to the nodes you are looking at.
You should have a basic understanding of the property graph model. To code along with the examples, download and install the Neo4j server.

This demo explains how to use a basic dataset of Actors acting in Movies to create recommendations with a graph database.

By following the relationships between people and movies in a meaningful way, you can determine co-occurrences, frequencies, and relevant nodes in the graph. This is the basis for many recommendation engines.

Recommendation for Collaboration of Actors

You can follow along by installing and starting the Neo4j server, opening it at http://localhost:7474, and then inserting the movie dataset via the :play movie graph command. Go to the second slide, click the query, and run it.

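In this guide we will load the dataset using LOAD CSV:

// Replace "…" with the location of the movies CSV file; the name column is assumed from the queries below.
LOAD CSV WITH HEADERS FROM "…" AS line
WITH line
WHERE line.job = "ACTED_IN"
MERGE (m:Movie {title:line.title}) ON CREATE SET m.released = toInteger(line.released), m.tagline = line.tagline
MERGE (p:Person {name:line.name}) ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:ACTED_IN {roles:split(line.roles,";")}]->(m)
RETURN count(*);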

Basic Queries

You should be able to run a query like this to find a single actor like Tom Hanks.

MATCH (tom:Person {name:"Tom Hanks"})
RETURN tom

Similarly, you should be able to retrieve all his movies with a single query. Your results should already look like a graph.

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(movie:Movie)
RETURN tom, movie

Of course Tom has colleagues who acted with him in his movies. This “co-actor” statement looks like this:

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
RETURN coActor.name

Collaborative Filtering

We can now turn this “co-actor” query into a recommendation query by following those relationships another step out to find the “co-co-actors”, i.e. the second-degree actors in Tom’s network with whom he has not acted.

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
AND   NOT    (tom)-[:ACTED_IN]->(movie2)
RETURN coCoActor.name

For collaborative filtering, you often take the frequency of occurrences into account to find the people or things that appear most often in your network. You probably also want to require that Tom Hanks has never worked with the recommended people at all, rather than only ruling out the specific movies reached through his co-actors.

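MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
AND   tom <> coCoActor
AND   NOT    (tom)-[:ACTED_IN]->()<-[:ACTED_IN]-(coCoActor)
// frequency = number of distinct co-actors who connect Tom to this person
RETURN coCoActor.name, count(distinct coActor) as frequency
ORDER BY frequency DESC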
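
To make the ranking logic concrete outside the database, here is a minimal Python sketch of the same idea. It is not how Neo4j evaluates the query. The cast lists, the names in them, and the helper functions co_actors and recommend are all hypothetical, and frequency is taken here to mean the number of distinct co-actors who connect the person to a candidate.

```python
from collections import Counter

# Hypothetical toy data (not the movie dataset used in this guide): movie -> cast.
CASTS = {
    "M1": {"Ann", "Bob"},
    "M2": {"Bob", "Cid"},
    "M3": {"Ann", "Dee"},
    "M4": {"Dee", "Cid"},
    "M5": {"Dee", "Eve"},
}


def co_actors(person, casts):
    """Everyone who shares at least one movie with `person`."""
    shared = set()
    for cast in casts.values():
        if person in cast:
            shared |= cast
    return shared - {person}


def recommend(person, casts):
    """Rank second-degree actors by how many distinct co-actors lead to them."""
    direct = co_actors(person, casts)
    frequency = Counter()
    for co in direct:
        for candidate in co_actors(co, casts):
            # Skip the person themselves and anyone they have already worked with.
            if candidate != person and candidate not in direct:
                frequency[candidate] += 1
    return frequency.most_common()


print(recommend("Ann", CASTS))  # [('Cid', 2), ('Eve', 1)]
```

Cid ranks first because two of Ann's co-actors (Bob and Dee) have worked with Cid, while only Dee has worked with Eve.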

One of those “co-co-actors” is Tom Cruise. Now let’s see which movies and actors are between the two Toms.

Connection Paths

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(cruise:Person)
WHERE tom.name = "Tom Hanks" and cruise.name = "Tom Cruise"
AND   NOT    (tom)-[:ACTED_IN]->(movie2)
RETURN tom, movie1, coActor, movie2, cruise

This returns multiple “Bacon paths” between the two Toms, including one through Kevin Bacon himself.
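
The shape of such a path (person, movie, go-between, movie, person) can also be sketched in plain Python. This is only an illustration under assumptions: the cast data and names are hypothetical placeholders, and connection_paths is a made-up helper, not part of Neo4j or Cypher.

```python
# Hypothetical toy data (not the movie dataset used in this guide): movie -> cast.
CASTS = {
    "M1": {"Ann", "Bob"},
    "M2": {"Bob", "Cid"},
    "M3": {"Ann", "Dee"},
    "M4": {"Dee", "Cid"},
    "M5": {"Dee", "Eve"},
}


def connection_paths(source, target, casts):
    """List (source, movie1, go_between, movie2, target) paths between two people."""
    paths = []
    for movie1 in sorted(casts):
        if source not in casts[movie1]:
            continue
        # Every other cast member of movie1 is a potential go-between.
        for go_between in sorted(casts[movie1] - {source, target}):
            for movie2 in sorted(casts):
                cast2 = casts[movie2]
                # movie2 must feature the go-between and the target, but not the source.
                if go_between in cast2 and target in cast2 and source not in cast2:
                    paths.append((source, movie1, go_between, movie2, target))
    return paths


for path in connection_paths("Ann", "Cid", CASTS):
    print(" -> ".join(path))
```

With this toy data, Ann reaches Cid through Bob (M1, then M2) and through Dee (M3, then M4), which corresponds to the two rows the Cypher pattern would return on an equivalent graph.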

So with two simple Cypher statements we have already created two recommendation algorithms: who you should know, and how you can get to know them.

You can also watch the video of Andreas running these queries live on our main example movie dataset.

Restaurant Recommendation

Imagine a graph like this: a few friends with their favorite restaurants, their cuisines and locations.

[Figure: restaurant recommendation graph]

A practical question to answer here, formulated in Graph Search slang:

Sushi restaurants in New York that my friends like

How could we translate that into the appropriate Cypher statement?

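// The LIKES, LOCATED_IN and SERVES relationship types and restaurant.name are assumed names
MATCH (person:Person)-[:IS_FRIEND_OF]->(friend),
      (friend)-[:LIKES]->(restaurant),
      (restaurant)-[:LOCATED_IN]->(loc),
      (restaurant)-[:SERVES]->(type)

WHERE person.name = 'Philip'
  AND loc.location = 'New York'
  AND type.cuisine = 'Sushi'

RETURN restaurant.name, count(*) AS occurrence
ORDER BY occurrence DESC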

Other factors that can be easily integrated in this query are favorites, allergies, ratings and closeness to my current position.
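
As a rough illustration of what this query computes, the following Python sketch counts how many of a person's friends like each restaurant that matches a cuisine and a location. Everything except the name Philip and the values 'Sushi' and 'New York' is hypothetical: the friends, the restaurant names, the dictionary layout, and the recommend_restaurants helper are assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy data; only 'Philip', 'Sushi' and 'New York' come from the guide.
FRIENDS = {"Philip": ["Alex", "Sam", "Kim"]}
LIKES = {
    "Alex": ["Sushi A", "Sushi B"],
    "Sam": ["Sushi A", "Pasta C"],
    "Kim": ["Sushi A", "Sushi B", "Sushi D"],
}
RESTAURANTS = {
    "Sushi A": {"cuisine": "Sushi", "location": "New York"},
    "Sushi B": {"cuisine": "Sushi", "location": "New York"},
    "Pasta C": {"cuisine": "Italian", "location": "New York"},
    "Sushi D": {"cuisine": "Sushi", "location": "San Francisco"},
}


def recommend_restaurants(person, cuisine, location):
    """Count, per matching restaurant, how many of the person's friends like it."""
    occurrence = Counter()
    for friend in FRIENDS.get(person, []):
        for name in LIKES.get(friend, []):
            info = RESTAURANTS[name]
            # Keep only restaurants with the requested cuisine and location.
            if info["cuisine"] == cuisine and info["location"] == location:
                occurrence[name] += 1
    return occurrence.most_common()


print(recommend_restaurants("Philip", "Sushi", "New York"))
# [('Sushi A', 3), ('Sushi B', 2)]
```

Further factors such as ratings or allergies would become extra conditions inside the loop, or extra weights added to the count.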