About this module

Now that you have a basic understanding of the data, you will build a recommendation engine for authors.

At the end of this module, you should be able to write Cypher queries to:

  • Find potential collaborators for an author.

  • Find relevant papers about a topic for an author.

Collaborative Filtering

Collaborative filtering is based on the idea that people like things similar to other things they like, and things that are liked by other people with similar taste.

Collaborative Filtering

In the Citations dataset there are a couple of different recommendations that we can generate.

Suggested Articles

Authors may be interested in reading other papers written by their coauthors. We can also recommend future collaborators by finding the people that their coauthors have collaborated with.

Exercise 1: Coauthor Collaborative Filtering with Cypher

You will write recommendation queries to suggest potential collaborators.

Open the 03_Recommendations_Part1.ipynb notebook and perform the steps of the exercise.

Once you have attempted the exercises, you can see the answers by launching the 03_Recommendations_Part1_Solutions.ipynb notebook.

PageRank and Personalized PageRank

PageRank is an algorithm that measures the transitive influence or connectivity of nodes. It can be computed by either iteratively distributing one node’s rank (originally based on degree) over its neighbors or by randomly traversing the graph and counting the frequency of hitting each node during these walks.

Personalized PageRank (PPR) is a variant of this algorithm that is biased towards a set of source nodes. It is often used as part of building recommendation systems.

Exercise 2: Article recommendations with Personalized PageRank

In this exercise, you will gain experience using the PageRank algorithm, understand the difference between PageRank and PPR, and use PPR to suggest relevant articles to an author.

Launch the 03_Recommendations_Part2.ipynb notebook and perform the steps of the exercise.

Check your understanding

Question 1

How many of Brian Fitzgerald’s potential collaborators have collaborated with Brian’s collaborators more than 3 times?

Select the correct answer.

  • 12

  • 8

  • 0

  • 7

Question 2

If we wanted to create a full text search on the 'name' property of nodes with the label 'Author', what are the correct procedures to do this?

Select the correct answers.

  • CALL db.index.fulltext.createNodeIndex('authors', ['Author'], ['name'])

  • CALL db.index.fulltext.createNodeIndex('authors', ['name'], ['Author'])

  • CALL db.index.fulltext.createNodeIndex('authorName', ['Author'], ['name'])

  • CALL db.index.createFullTextSearch('authors', ['Author'], ['name'])

Question 3

Which statement describes the Personalized PageRank algorithm?

Select the correct answer.

  • Personalized PageRank measures the number of incoming and outgoing relationships from a node.

  • Personalized PageRank is a variant of PageRank that allows us to find influential nodes based on a set of source nodes.

  • Personalized PageRank counts the number of neighbors within 2 hops of a node

  • Personalized PageRank can only be used in combination with Full Text Search


You should now be able to:

  • Find potential collaborators for an author.

  • Find relevant papers about a topic for an author.