Goals
This guide explains how to use the connections in your data to gather insights and to recommend relevant facts that are not yet connected to the nodes you are focusing on.
Prerequisites
You should have a basic understanding of the property graph model. Having downloaded and installed the Neo4j server helps you code along with the examples.
Beginner


This demo explains how to use a basic dataset of Actors acting in Movies to create recommendations with a graph database.

By following the relationships between the people and movies in a meaningful manner, you can determine co-occurrences, frequencies, and relevant nodes in the graph. This is the basis for many recommendation engines.

Recommendation for Collaboration of Actors

You can follow along by installing and starting Neo4j server, opening it at http://localhost:7474, and then inserting the movie dataset via the :play movie graph command. Go to the second slide, click the query, and run it.

In this guide, we load the dataset with LOAD CSV instead:

LOAD CSV WITH HEADERS FROM "https://neo4j-contrib.github.io/developer-resources/cypher/movies_actors.csv" AS line
WITH line
WHERE line.job = "ACTED_IN"
MERGE (m:Movie {title:line.title}) ON CREATE SET m.released = toInteger(line.released), m.tagline = line.tagline
MERGE (p:Person {name:line.name}) ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:ACTED_IN {roles:split(line.roles,";")}]->(m)
RETURN count(*);
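
To confirm that the import worked, a quick sanity check along these lines may help. It is only a sketch that counts what the LOAD CSV statement above should have created; the aliases actors, movies, and roles are illustrative names, not part of the dataset.

```cypher
// Count the distinct actors, distinct movies, and ACTED_IN relationships
// that the import above created.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN count(DISTINCT p) AS actors,
       count(DISTINCT m) AS movies,
       count(*) AS roles;
```

If all three numbers are zero, the CSV was probably not reachable or no rows matched the job filter.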

Basic Queries

You should be able to run a query like this to find a single actor like Tom Hanks.

MATCH (tom:Person {name:"Tom Hanks"})
RETURN tom

Similarly, you should be able to retrieve all his movies with a single query. Your results should already look like a graph.

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(movie:Movie)
RETURN tom, movie

Of course Tom has colleagues who acted with him in his movies. This “co-actor” statement looks like this:

MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
RETURN coActor.name
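
A small variation, sketched here for illustration, counts how many movies each co-actor shares with Tom. Frequencies like this are what the recommendation queries build on. The aliases sharedMovies and titles are made up for this example.

```cypher
// For each co-actor, count the movies shared with Tom Hanks
// and list their titles, most frequent collaborators first.
MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(coActor:Person)
RETURN coActor.name AS coActor,
       count(movie) AS sharedMovies,
       collect(movie.title) AS titles
ORDER BY sharedMovies DESC
LIMIT 10
```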

Collaborative Filtering

We can now turn this “co-actor” query into a recommendation query by following those relationships another step out to find the “co-co-actors”, i.e. the second-degree actors in Tom’s network with whom he has not acted.

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
         (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
AND   NOT    (tom)-[:ACTED_IN]->(movie2)
RETURN coCoActor.name

For collaborative filtering, you often take the frequency of occurrences into account to find the people or things that appear most often in your network. You probably also want to require that Tom Hanks has never worked with the recommended people at all, rather than only excluding the movies his co-actors appeared in.

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
         (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom.name = "Tom Hanks"
AND   tom <> coCoActor
AND   NOT    (tom)-[:ACTED_IN]->()<-[:ACTED_IN]-(coCoActor)
RETURN coCoActor.name, count(*) AS frequency
ORDER BY frequency DESC
LIMIT 5
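
The frequency can be measured in more than one way. As one possible alternative, the sketch below ranks candidates by the number of distinct co-actors they share with Tom and also lists those co-actors, since they are the people who could make an introduction. The aliases recommended, sharedCoActors, and via are illustrative.

```cypher
// Rank second-degree actors by how many distinct co-actors connect them to Tom.
MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
      (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(coCoActor:Person)
WHERE tom <> coCoActor                                   // never recommend Tom to himself
AND   NOT (tom)-[:ACTED_IN]->()<-[:ACTED_IN]-(coCoActor) // Tom has not worked with them yet
RETURN coCoActor.name AS recommended,
       count(DISTINCT coActor) AS sharedCoActors,
       collect(DISTINCT coActor.name) AS via
ORDER BY sharedCoActors DESC
LIMIT 5
```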

One of those “co-co-actors” is Tom Cruise. Now let’s see which movies and actors are between the two Toms.

Connection Paths

MATCH (tom:Person)-[:ACTED_IN]->(movie1)<-[:ACTED_IN]-(coActor:Person),
         (coActor)-[:ACTED_IN]->(movie2)<-[:ACTED_IN]-(cruise:Person)
WHERE tom.name = "Tom Hanks" AND cruise.name = "Tom Cruise"
AND   NOT    (tom)-[:ACTED_IN]->(movie2)
RETURN tom, movie1, coActor, movie2, cruise

This returns multiple “Bacon paths”, some of which even run through Kevin Bacon himself.
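
If you only need one shortest connection rather than every path of this shape, Cypher's shortestPath function is an alternative. The sketch below assumes the movie dataset loaded above; the upper bound of 6 hops is an arbitrary choice for illustration.

```cypher
// Find a single shortest chain of ACTED_IN relationships between the two Toms.
// The relationships are traversed in either direction: actor -> movie <- actor.
MATCH (tom:Person {name:"Tom Hanks"}), (cruise:Person {name:"Tom Cruise"})
MATCH path = shortestPath((tom)-[:ACTED_IN*..6]-(cruise))
RETURN path
```

Unlike the pattern-based query, this returns at most one path per pair of start and end nodes, which is often enough to answer “how do I get to know them?”.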

So with two simple Cypher statements, we have already created two recommendation algorithms: who you should know, and how you get to know them.

You can also watch the video of Andreas running these queries live on our main example movie dataset.

Restaurant Recommendation

Imagine a graph like this: a few friends with their favorite restaurants, their cuisines and locations.

[Figure: restaurant recommendation graph]

A practical question to answer here, formulated in Graph Search slang:

Sushi restaurants in New York that my friends like

How could we translate that into the appropriate Cypher statement?

MATCH (person:Person)-[:IS_FRIEND_OF]->(friend),
      (friend)-[:LIKES]->(restaurant:Restaurant),
      (restaurant)-[:LOCATED_IN]->(loc:Location),
      (restaurant)-[:SERVES]->(type:Cuisine)

WHERE person.name = 'Philip'
  AND loc.location = 'New York'
  AND type.cuisine = 'Sushi'

RETURN restaurant.name, count(*) AS occurrence
ORDER BY occurrence DESC
LIMIT 5
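
To try the query above without real data, you could first create a tiny graph such as the following. All names of people and restaurants here are made up for illustration. Only the labels, relationship types, and property keys are taken from the query.

```cypher
// Hypothetical sample data: two friends of Philip, two sushi places and
// one pizza place, all located in New York.
CREATE (philip:Person {name: 'Philip'}),
       (emil:Person {name: 'Emil'}),
       (michael:Person {name: 'Michael'}),
       (ny:Location {location: 'New York'}),
       (sushi:Cuisine {cuisine: 'Sushi'}),
       (pizza:Cuisine {cuisine: 'Pizza'}),
       (zushi:Restaurant {name: 'Zushi Zam'}),
       (isushi:Restaurant {name: 'iSushi'}),
       (slice:Restaurant {name: 'Slice House'}),
       (philip)-[:IS_FRIEND_OF]->(emil),
       (philip)-[:IS_FRIEND_OF]->(michael),
       (emil)-[:LIKES]->(zushi),
       (emil)-[:LIKES]->(slice),
       (michael)-[:LIKES]->(zushi),
       (michael)-[:LIKES]->(isushi),
       (zushi)-[:LOCATED_IN]->(ny),
       (isushi)-[:LOCATED_IN]->(ny),
       (slice)-[:LOCATED_IN]->(ny),
       (zushi)-[:SERVES]->(sushi),
       (isushi)-[:SERVES]->(sushi),
       (slice)-[:SERVES]->(pizza)
```

With only this sample data in the database, the recommendation query should rank Zushi Zam first with an occurrence of 2, followed by iSushi with an occurrence of 1. The pizza place is filtered out by the cuisine condition.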

Other factors that can easily be integrated into this query include favorites, allergies, ratings, and proximity to your current location.
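
As a rough sketch of how ratings might be integrated, the query below assumes that each LIKES relationship carries a numeric rating property. That property, the threshold of 3, and the alias avgRating are assumptions made for this example and are not part of the model described above.

```cypher
// Same recommendation as before, but restricted to restaurants that friends
// rated at least 3 and ranked by their average rating.
// Assumption: LIKES relationships have a hypothetical numeric `rating` property.
MATCH (person:Person)-[:IS_FRIEND_OF]->(friend),
      (friend)-[like:LIKES]->(restaurant:Restaurant),
      (restaurant)-[:LOCATED_IN]->(loc:Location),
      (restaurant)-[:SERVES]->(type:Cuisine)
WHERE person.name = 'Philip'
  AND loc.location = 'New York'
  AND type.cuisine = 'Sushi'
  AND like.rating >= 3
RETURN restaurant.name,
       count(*) AS occurrence,
       avg(like.rating) AS avgRating
ORDER BY avgRating DESC, occurrence DESC
LIMIT 5
```

Note that LIKES relationships without a rating are dropped by the comparison, because comparing a missing property yields null rather than true.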