Online Course Introduction to Neo4j 4.0 Neo4j is a Graph Database The Neo4j Graph Platform Introduction to Cypher Using WHERE to Filter Queries Working with Patterns in Queries Working with Cypher Data Controlling the Query Chain Controlling Results Returned Creating… Read more →

Controlling the Query Chain

About this module

You have learned some of the more common ways to write Cypher code that queries the graph and aggregates results. You have learned how to work with different data types in Cypher. Next, you will learn how you can use intermediate results in a query to perform additional querying so that you can have a chain of queries before a result is returned.

At the end of this module, you should be able to write Cypher statements to:

  • Perform intermediate processing with WITH.
  • Using WITH and UNWIND for query processing.
  • Perform subqueries with WITH.
  • Perform subqueries with CALL.

Intermediate processing using WITH

Here is an example where we return each actor, the number of movies they acted in, and the titles of the movies:

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a.name, count(a) AS numMovies,
       collect(m.title) as movies

Here is the result returned:

WithoutWITH

What if we wanted to further qualify the query so that we only return the actors who have made 2 or 3 movies? What we really want to do is test the count of each movie, but the count is an aggregating function so we cannot test it until all nodes have been retrieved. One way to to this is with the WITH clause in Cypher.

Example: Using WITH

During the execution of a MATCH clause, you can specify that you want some intermediate calculations or values that will be used for further processing of the query. You use the WITH clause to perform intermediate processing that is not possible in a RETURN clause.

Here is an example where we start the query processing by retrieving all actors and their movies. During the query processing, want to only return actors that have 2 or 3 movies. All other actors and the aggregated results are filtered out. This type of query is a replacement for SQL’s “HAVING” clause. The WITH clause does the counting and collecting, and the intermediate result is used in the subsequent WHERE clause to test.

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH  a, count(a) AS numMovies, collect(m.title) as movies
WHERE 1 < numMovies < 4
RETURN a.name, numMovies, movies

Here is the result returned:

ActorsWith2or3Movies

When you use the WITH clause, you specify the variables from the previous part of the query you want to pass on to the next part of the query. In this example, the variable a is specified to be passed on in the query, but m is not. Since m is not specified to be passed on, m will not be available later in the query. Notice that for the RETURN clause, a, numMovies, and movies are available for use.

Note
You have to name all expressions with an alias in a WITH that are existing variables.

Using WITH and UNWIND

You have learned to create lists of nodes during a query using the collect() function. If you have collected a subset of nodes in a query, you can use UNWIND to return the rows for a collection.

Here is an example:

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WITH collect(p) AS actors,
count(p) AS actorCount, m
UNWIND actors AS actor
RETURN m.title, actorCount, actor.name

The query retrieves all people who acted in movies. During the query, for each movie, the list of actors are collected, as well as the count of actors. The WITH clause makes the variables actors, actorCount, and m available for the rest of the query processing. The UNWIND clause turns the list of actors into rows of actors. Then the query returns the title of the movie, the actor count, and the actor name for each row of the actors collection.

Here is the result returned:

WITHAndUNWIND

You find that you use WITH and UNWIND frequently when you are importing data into the graph.

Example: Subqueries with WITH

Here is an example where we retrieve all movies reviewed by a person. For a particular movie found, we want the list of directors of the movie so we do a second query, a subquery as follows:

MATCH (m:Movie)<-[rv:REVIEWED]-(r:Person)
WITH m, rv, r
MATCH (m)<-[:DIRECTED]-(d:Person)
RETURN m.title, rv.rating, r.name, collect(d.name)

For the second MATCH clause, we use the found movie nodes, m. The RETURN clause has access to the movie, rating by that reviewer, the name of the reviewer, and the collection of director names for that movie.

Here is the result returned:

WITH2MATCHES

Example: Another subquery

Here is another example where we want to find all actors who have acted in at least five movies, and find (optionally) the movies they directed and return the person and those movies.

MATCH (p:Person)
WITH p, size((p)-[:ACTED_IN]->()) AS movies
WHERE movies >= 5
OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
RETURN p.name, m.title

Here is the result returned:

PopularActorsWithAtLeast5Movies

In this example, we first retrieve all people, but then specify a pattern in the WITH clause where we calculate the number of :ACTED_IN relationships retrieved using the size() function. If this value is greater than five, we then also retrieve the :DIRECTED paths to return the name of the person and the title of the movie they directed. In the result, we see that these actors acted in more than five movies, but Tom Hanks is the only actor who directed a movie and thus the only person to have a value for the movie. Notice here that m only refers to movies that were directed by p.

Performing subqueries with CALL

Another way that you can perform a subquery is to use the CALL clause. In a CALL clause, you specify a query that returns, typically a set of nodes. The set of nodes returned in the CALL clause can be used for a subsequent query.

Here is an example:

CALL
{MATCH (p:Person)-[:REVIEWED]->(m:Movie)
RETURN  m}
MATCH (m) WHERE m.released=2000
RETURN m.title, m.released

Here is the result returned:

CALLsubquery

The variable m used in the subquery is used in the next query.

Exercise 7: Controlling query processing

In the query edit pane of Neo4j Browser, execute the browser command:

:play 4.0-intro-neo4j-exercises

and follow the instructions for Exercise 7.

Note
This exercise has 5 steps. Estimated time to complete: 15 minutes.

Check your understanding

Question 1

Given this code snippet, what variables can you use in the RETURN clause?

MATCH (a:Person)-[r:ACTED_IN]->(m:Movie)
WITH  a, count(a) AS numMovies
WHERE 1 < numMovies < 4
RETURN ??

Select the correct answers.

  • a
  • r
  • m
  • numMovies

Question 2

What clauses enable you to perform subqueries?

Select the correct answers.

  • SUBMATCH
  • WITH
  • QUERY
  • CALL

Question 3

Given this Cypher query, what Cypher clause do you use here to turn the list of movies into rows?

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WITH collect(m) AS movies,count(m) AS movieCount, p
?? movies AS movie
RETURN p.name, movieCount, movie.title

Select the correct answer.

  • ELEMENTS
  • UNWIND
  • ROWS
  • SELECT

Summary

You should now be able to write Cypher statements to:

  • Perform intermediate processing with WITH.
  • Using WITH and UNWIND for query processing.
  • Perform subqueries with WITH.
  • Perform subqueries with CALL.

Stay Connected

Sign up to find out more about Neo4j's upcoming events & meetups.