Refining results

This guide describes how to refine the results from Cypher® queries by filtering, ordering, comparing, aliasing, and more.

Example dataset

After you create a free Aura instance, use the "Connect" button and select "Query". In the Cypher editor, copy and paste the following Cypher and execute the query:

CREATE (matrix:Movie {title: 'The Matrix', released: 1997})
CREATE (cloudAtlas:Movie {title: 'Cloud Atlas', released: 2012})
CREATE (forrestGump:Movie {title: 'Forrest Gump', released: 1994})
CREATE (larryCrowne:Movie {title: 'Larry Crowne', released: 2011})
CREATE (keanu:Person {name: 'Keanu Reeves', born: 1964})
CREATE (robert:Person {name: 'Robert Zemeckis', born: 1951})
CREATE (tom:Person {name: 'Tom Hanks', born: 1956})
CREATE (tom)-[:ACTED_IN {roles: ['Forrest']}]->(forrestGump)
CREATE (tom)-[:ACTED_IN {roles: ['Zachry']}]->(cloudAtlas)
CREATE (tom)-[:ACTED_IN {roles: ['Larry Crowne']}]->(larryCrowne)
CREATE (tom)-[:DIRECTED]->(larryCrowne)
CREATE (robert)-[:DIRECTED]->(forrestGump)

CREATE (diana:Person {name: "Diana"})
CREATE (melissa:Person {name: "Melissa", twitter: "@melissa"})
CREATE (dan:Person {name: "Dan", twitter: "@dan", yearsExperience: 6})
CREATE (sally:Person {name: "Sally", yearsExperience: 4})
CREATE (john:Person {name: "John", yearsExperience: 5})
CREATE (jennifer:Person {name: "Jennifer", twitter: "@jennifer", yearsExperience: 5})
CREATE (joe:Person {name: "Joe"})
CREATE (mark:Person {name: "Mark", twitter: "@mark"})
CREATE (ann:Person {name: "Ann"})
CREATE (xyz:Company {name: "XYZ"})
CREATE (x:Company {name: "Company X"})
CREATE (a:Company {name: "Company A"})
CREATE (Neo4j:Company {name: "Neo4j"})
CREATE (abc:Company {name: "ABC"})
CREATE (query:Technology {type: "Query Languages"})
CREATE (etl:Technology {type: "Data ETL"})
CREATE (integrations:Technology {type: "Integrations"})
CREATE (graphs:Technology {type: "Graphs"})
CREATE (dev:Technology {type: "Application Development"})
CREATE (java:Technology {type: "Java"})
CREATE (diana)-[:LIKES]->(query)
CREATE (melissa)-[:LIKES]->(query)
CREATE (dan)-[:LIKES]->(etl)<-[:LIKES]-(melissa)
CREATE (xyz)<-[:WORKS_FOR]-(sally)-[:LIKES]->(integrations)<-[:LIKES]-(dan)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(john)-[:LIKES]->(java)
CREATE (john)<-[:IS_FRIENDS_WITH]-(jennifer)-[:LIKES]->(java)
CREATE (john)-[:WORKS_FOR]->(xyz)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(melissa)
CREATE (joe)-[:LIKES]->(query)
CREATE (x)<-[:WORKS_FOR]-(diana)<-[:IS_FRIENDS_WITH]-(joe)-[:IS_FRIENDS_WITH]->(mark)-[:LIKES]->(graphs)<-[:LIKES]-(jennifer)-[:WORKS_FOR {startYear: 2017}]->(Neo4j)
CREATE (ann)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(mark)
CREATE (john)-[:LIKES]->(dev)<-[:LIKES]-(ann)-[:IS_FRIENDS_WITH]->(dan)-[:WORKS_FOR]->(abc)
CREATE (ann)-[:WORKS_FOR]->(abc)
CREATE (a)<-[:WORKS_FOR]-(melissa)-[:LIKES]->(graphs)<-[:LIKES]-(diana)

Filter

You can filter results and return only a subset of data by using the WHERE subclause or the FILTER (available only in Cypher 25) clause.

Property

You can filter results by property. For example, to find which people in the example dataset have Twitter handles, you can use the following query:

MATCH (p:Person)
RETURN p.name, p.twitter

Since you MATCH all Person nodes, they are all returned, regardless of whether they have a twitter property or not. Out of the 12 Person nodes in the graph, only four (Melissa, Dan, Jennifer, Mark) contain a value for the twitter property. Since the query matches and returns all Person nodes, you get another eight null results.

In Neo4j, properties only exist (are stored) if they have a value. A null property value is not stored. This ensures that only valuable, necessary information is retained for your nodes and relationships.

To return only results that contain a value, you can use the WHERE subclause and specify that it returns only results that are not null:

MATCH (p:Person)
WHERE p.twitter IS NOT NULL
RETURN p.name, p.twitter

Or, alternatively, use FILTER with Cypher 25:

CYPHER 25
MATCH (p:Person)
FILTER p.twitter IS NOT NULL
RETURN p.name, p.twitter

If you are using any other Cypher version, you need to add CYPHER 25 to the first line of the query. This does not change the default language version, it only ensures that this query runs in Cypher 25. Otherwise, the clause FILTER, which is exclusive to this Cypher version, won’t work.

To see a comparison between FILTER and WHERE, refer to Cypher → FILTER.

The result is:

p.name p.twitter

p.name	p.twitter
`"Melissa"`	`"@melissa"`
`"Dan"`	`"@dan"`
`"Jennifer"`	`"@jennifer"`
`"Mark"`	`"@mark"`
Rows: 4

"Melissa"

"@melissa"

"Dan"

"@dan"

"Jennifer"

"@jennifer"

"Mark"

"@mark"

Rows: 4

Strings and partial values

You can filter results based on parts of the value stored in a node or relationship using specific predicates together with WHERE:

`STARTS WITH`

The STARTS WITH operator searches for property values that begin with a string you specify, for example, the letter "M":

MATCH (p:Person)
WHERE p.name STARTS WITH 'M'
RETURN p.name;

The result is:

p.name

p.name
`"Melissa"`
`"Mark"`
Rows: 2

"Melissa"

"Mark"

Rows: 2

`CONTAINS`

The CONTAINS operator checks if a specified string is part of a property value, for example, if it contains the letter "a":

MATCH (p:Person)
WHERE p.name CONTAINS 'a'
RETURN p.name;

The result is:

p.name

p.name
`"Keanu Reeves"`
`"Tom Hanks"`
`"Diana"`
`"Melissa"`
`"Dan"`
`"Sally"`
`"Mark"`
Rows: 7

"Keanu Reeves"

"Tom Hanks"

"Diana"

"Melissa"

"Dan"

"Sally"

"Mark"

Rows: 7

`ENDS_WITH`

The ENDS_WITH operator searches for a specified string at the end of a property value, for example, a name that ends with "n":

MATCH (p:Person)
WHERE p.name ENDS WITH 'n'
RETURN p.name;

The result is:

p.name

p.name
`"Dan"`
`"John"`
`"Ann"`
Rows: 3

"Dan"

"John"

"Ann"

Rows: 3

Alternatively, you can use regular expressions, without string operators, to find a property value. For example, if you want to find any names that include "Jo" while respecting case-sensitivity (queries are case sensitive by default):

MATCH (p:Person)
WHERE p.name =~ 'Jo.*'
RETURN p.name

p.name

p.name
`"John"`
`"Joe"`
Rows: 2

"John"

"Joe"

Rows: 2

If you don’t want your query to be case sensitive, you can use toLower() with CONTAINS:

MATCH (p:Person)
WHERE toLower(p.name) CONTAINS "jo"
RETURN p.name

Or, alternatively, use toUpper():

MATCH (p:Person)
WHERE toUpper(p.name) CONTAINS "JO"
RETURN p.name

Both options return all name property values that contain "jo" regardless of the letter case.

`IN`

You can use the IN operator to test whether a property exists in a given list of values. For example, if any Person node in the graph has "1", "5" or "6" values for yearsExperience:

MATCH (p:Person)
WHERE p.yearsExperience IN [1, 5, 6]
RETURN p.name, p.yearsExperience

The result shows that no one has 1 year of experience, but three people have 5 or 6:

p.name p.yearsExperience

p.name	p.yearsExperience
`"Dan"`	`5`
`"John"`	`6`
`"Jennifer"`	`5`
Rows: 3

"Dan"

5

"John"

6

"Jennifer"

5

Rows: 3

Patterns

For example, to find people who are friends with someone who works for Neo4j, you can filter on whether there exists a [:WORKS_FOR]→(:Company {name:'Neo4j'}) relationship from the p:Person in the friendship:

MATCH (p:Person)-[r:IS_FRIENDS_WITH]->(friend:Person)
WHERE exists((p)-[:WORKS_FOR]->(:Company {name: 'Neo4j'}))
RETURN p, r, friend;

Which gives you the result:

Alternatively, you can also filter the result by excluding certain patterns from the results: For example, if you want to find which of Jennifer’s friends do not work for any company, use the following query:

MATCH (p:Person)-[r:IS_FRIENDS_WITH]->(friend:Person)
WHERE p.name = 'Jennifer'
AND NOT exists((friend)-[:WORKS_FOR]->(:Company))
RETURN friend.name;

You get only "Mark" as the result.

`OPTIONAL` patterns

Sometimes you want to return results from queries even if they don’t meet all the criteria. For that, you can use an OPTIONAL MATCH which returns null values where there is no match for the optional clause.

For example, if you search for people whose name starts with a "J" and who optionally work for a company:

MATCH (p:Person)
WHERE p.name STARTS WITH 'J'
OPTIONAL MATCH (p)-[:WORKS_FOR]-(other:Company)
RETURN p.name, other.name;

You get the following result:

p.name other.name

p.name	other.name
`"Jennifer"`	`"Neo4j"`
`"John"`	`"XYZ"`
`"Joe"`	`null`
Rows: 3

"Jennifer"

"Neo4j"

"John"

"XYZ"

"Joe"

null

Rows: 3

Notice that Joe is returned because his name starts with the letter 'J', but his company’s name is null. This is because he does not have a WORKS_FOR relationship to a COMPANY node. Since you used OPTIONAL MATCH, his Person node is still returned from the first match, but the optional match is not found, so a null result is returned for other.name.

To see the difference, try running the query without the OPTIONAL in front of the second match. Joe’s row is no longer returned in the result. This is because Cypher reads the statement with an AND match, so the person must match the first criterium (name starts with "J") and the second criterium (person works for a company).

More complex patterns

You can query for patterns that have more than one relationship. For example, if you want to know who works for the same company and are also friends:

MATCH (p1:Person)-[:WORKS_FOR]->(company:Company)<-[:WORKS_FOR]-(p2:Person),
(p1)-[:IS_FRIENDS_WITH]->(p2)
RETURN p1.name AS people1, p2.name AS people2, company.name AS company

The result is:

people1 people2 company

people1	people2	company
`"John"`	`"Sally"`	`"XYZ"`
`"Ann"`	`"Dan"`	`"ABC"`
Rows: 2

"John"

"Sally"

"XYZ"

"Ann"

"Dan"

"ABC"

Rows: 2

Note that the query has a comma at the end of the first line, and another pattern is added to MATCH on the next line. This allows you to chain different patterns together, similar to when you used WHERE exists(<pattern>) previously.

With this structure, you can add multiple different patterns and link them together, allowing you to traverse various pieces of the graph with certain patterns.

For more information on how to query for patterns, see Cypher → Patterns.

Compare values

Equality

You can use comparisons to filter values, starting with equality:

MATCH (m:Movie)
WHERE m.title = 'The Matrix'
RETURN m

The result is:

m
`(:Movie {title: "The Matrix", released: 1997})`
Rows: 1

(:Movie {title: "The Matrix", released: 1997})

Rows: 1

Numerical

Another option is the numerical comparison to confirm the existence of values within a list.

The WHERE clause in the following example includes a greater-than comparison to find which movies were released after the year "2000":

MATCH (m:Movie)
WHERE m.released > 2000
RETURN m.title, m.released

The result is:

m m.release

m	m.release
`"Cloud Atlas"`	`2012`
`"Larry Crowne"`	`2011`
Rows: 2

"Cloud Atlas"

2012

"Larry Crowne"

2011

Rows: 2

Cypher has other mathematical operators that allow for other types of numerical comparison. See Cypher → Mathematical operators for the complete list of operators and examples.

Boolean

Cypher’s boolean operators allow you to combine or evaluate logical conditions.

For example, to find who in the graph has directed a movie but didn’t act in it:

MATCH (p:Person)-[:DIRECTED]->(m)
WHERE NOT (p)-[:ACTED_IN]->(m)
RETURN p, m

The result is:

This is because the query looks for Person nodes that are connected to Movie nodes through the DIRECTED relationship only and not ACTED_IN. If the ACTED_IN relationship wasn’t a restriction anymore:

MATCH (p:Person)-[:DIRECTED]->(m)
RETURN p, m

Then the result includes Tom Hanks and the movie "Larry Crowne" because Tom Hanks both directed and acted in the movie:

See Cypher → Boolean operators for the complete list of operators and examples.

Range of values

You can query data within a certain range in number or date, for example, if you want to find events within a certain timeline, age values, etc.

If you want to know who has experience within the range of three to seven years:

MATCH (p:Person)
WHERE 3 <= p.yearsExperience <= 7
RETURN p

The result is:

Aliasing results

By default, Cypher returns a label as the name of the column in the tabular results. However, you can use an alias to make the results more understandable by using AS.

For example, if you want to have the tabular result with a column titled name instead of the label p.name:

MATCH (p:Person)
RETURN p.name AS name

The result is:

name

name
`"Keanu Reeves"`
`"Robert Zemeckis"`
`"Tom Hanks"`
`"Diana"`
`"Melissa"`
`"Dan"`
`"Sally"`
`"John"`
`"Jennifer"`
`"Joe"`
`"Mark"`
`"Ann"`
Rows: 12

"Keanu Reeves"

"Robert Zemeckis"

"Tom Hanks"

"Diana"

"Melissa"

"Dan"

"Sally"

"John"

"Jennifer"

"Joe"

"Mark"

"Ann"

Rows: 12

Avoiding duplication

For example, to find which people are connected through a IS_FRIENDS_WITH relationship:

MATCH (p:Person)-[:IS_FRIENDS_WITH]-(friend:Person)
RETURN friend.name;

The result is:

friend.name

friend.name
`"Joe"`
`"Jennifer"`
`"Ann"`
`"John"`
`"Jennifer"`
`"Sally"`
`"Jennifer"`
`"Melissa"`
`"Sally"`
`"John"`
`"Mark"`
`"Ann"`
`"Diana"`
`"Mark"`
`"Jennifer"`
`"Joe"`
`"Dan"`
`"Jennifer"`
Rows: 18

"Joe"

"Jennifer"

"Ann"

"John"

"Jennifer"

"Sally"

"Jennifer"

"Melissa"

"Sally"

"John"

"Mark"

"Ann"

"Diana"

"Mark"

"Jennifer"

"Joe"

"Dan"

"Jennifer"

Rows: 18

Note that some names appear more than once. For example, "Sally" appears twice because she is connected to two people through two IS_FRIENDS_WITH relationships. To avoid this, add DISTINCT after RETURN:

MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN DISTINCT friend.name;

The result is now:

friend.name

friend.name
`"Sally"`
`"Melissa"`
`"John"`
`"Mark"`
`"Ann"`
`"Diana"`
`"Dan"`
Rows: 7

"Sally"

"Melissa"

"John"

"Mark"

"Ann"

"Diana"

"Dan"

Rows: 7

Order

If you want to have the results ordered, you can use the clause ORDER BY after the RETURN clause. For example, to list how many years of experience each person has in descending order:

MATCH (p:Person)
RETURN p.yearsExperience AS yearsExperience
ORDER BY yearsExperience DESC

This is the result:

yearsExperience

yearsExperience
`null`
`null`
`null`
`null`
`null`
`null`
`null`
`null`
`6`
`5`
`5`
`4`
Rows: 12

null

6

5

4

Rows: 12

Note that the first 8 rows contain null as result. This means that not all returned Person nodes have the property yearsExperience.

Since null is considered the highest possible value when sorting in descending order, they are returned before the numerical values. If the order was ascending, null values would appear last.

Optionally, you can avoid returning null values by using WHERE and IS NOT NULL:

MATCH (p:Person)
WHERE p.yearsExperience IS NOT NULL
RETURN p.yearsExperience AS yearsExperience
ORDER BY yearsExperience DESC

Limit

You can use LIMIT when you want to restrict the number of results returned. For example, the current example graph has twelve Person nodes and if you want to retrieve only three:

MATCH (p:Person)
RETURN p
LIMIT 3

The result is:

p
`(:Person {born: 1964, name: "Keanu Reeves"})`
`(:Person {born: 1951, name: "Robert Zemeckis"})`
`(:Person {born: 1956, name: "Tom Hanks"})`
Rows: 3

(:Person {born: 1964, name: "Keanu Reeves"})

(:Person {born: 1951, name: "Robert Zemeckis"})

(:Person {born: 1956, name: "Tom Hanks"})

Rows: 3

Since you didn’t use ORDER BY to specify an order to the results, the query retrieves the first three Person nodes that were added to the graph. To get random results, you can add ORDER BY rand() to the query:

MATCH (p:Person)
RETURN p
ORDER BY rand()
LIMIT 3

Now every time you run this query, you get three random Person nodes.

Aggregate

Cypher provides several aggregating functions that let you calculate a single value from a set of values.

You can use them with the RETURN or the WITH clauses to aggregate results.

Function Description

Function	Description
`avg()`	Returns the average of a set of `INTEGER`, `FLOAT` or `DURATION` values.
`collect()`	Returns a list containing the values returned by an expression.
`count()`	Returns the number of values or rows.
`max()`	Returns the maximum value in a set of values.
`min()`	Returns the minimum value in a set of values.
`percentileCont()`	Returns the percentile of a value over a group using linear interpolation.
`percentileDisc()`	Returns the nearest `INTEGER` or `FLOAT` value to the given percentile over a group using a rounding method.
`stDev()`	Returns the standard deviation for the given value over a group for a sample of a population.
`stDevP()`	Returns the standard deviation for the given value over a group for an entire population.
`sum()`	Returns the sum of a set of INTEGER, FLOAT or DURATION values.

avg()

Returns the average of a set of INTEGER, FLOAT or DURATION values.

collect()

Returns a list containing the values returned by an expression.

count()

Returns the number of values or rows.

max()

Returns the maximum value in a set of values.

min()

Returns the minimum value in a set of values.

percentileCont()

Returns the percentile of a value over a group using linear interpolation.

percentileDisc()

Returns the nearest INTEGER or FLOAT value to the given percentile over a group using a rounding method.

stDev()

Returns the standard deviation for the given value over a group for a sample of a population.

stDevP()

Returns the standard deviation for the given value over a group for an entire population.

sum()

Returns the sum of a set of INTEGER, FLOAT or DURATION values.

Refer to Cypher → Aggregating for the complete list of all aggregating functions and examples.

Unwind

If you have a list that you want to inspect or separate the values, Cypher offers the UNWIND function for that.

For example, if you want to see which people in the example graph like "Graphs" and/or "Query Languages":

WITH ['Graphs','Query Languages'] AS likedTech
UNWIND likedTech AS technology
MATCH (p:Person)-[r:LIKES]-(t:Technology {type: technology})
RETURN t.type, p.name AS people;

The result is:

t.type people

t.type	people
`"Graphs"`	`"Diana"`
`"Graphs"`	`"Melissa"`
`"Graphs"`	`"Jennifer"`
`"Graphs"`	`"Mark"`
`"Query Languages"`	`"Diana"`
`"Query Languages"`	`"Melissa"`
`"Query Languages"`	`"Joe"`
Rows: 7

"Graphs"

"Diana"

"Graphs"

"Melissa"

"Graphs"

"Jennifer"

"Graphs"

"Mark"

"Query Languages"

"Diana"

"Query Languages"

"Melissa"

"Query Languages"

"Joe"

Rows: 7

You can use the collect() function to group results together. If you rewrite the query like this:

WITH ['Graphs','Query Languages'] AS likedTech
UNWIND likedTech AS technology
MATCH (p:Person)-[r:LIKES]-(t:Technology {type: technology})
RETURN t.type, collect(p.name) AS people;

The result looks tidier:

t.type people

t.type	people
`"Graphs"`	`["Diana", "Melissa", "Jennifer", "Mark"]`
`"Query Languages"`	`["Diana", "Melissa", "Joe"]`
Rows: 2

"Graphs"

["Diana", "Melissa", "Jennifer", "Mark"]

"Query Languages"

["Diana", "Melissa", "Joe"]

Rows: 2

Number of items

If you have a list of elements, you can also return the number of elements in that list using the size() function.

For example, if you want to know how many friends each person in the graph has:

MATCH (p:Person)-[:IS_FRIENDS_WITH]-(friend:Person)
RETURN p.name, size(collect(friend.name)) AS numberOfFriends;

Note that the collect() function is used inside the size() function. If you didn’t have it, that is, only size(friend.name), what the function would do is to count how many characters the string values of the friend.name properties have. Since you want to know how many friends each person has, then collect() works as an aggregating function that aggregates all friend.name values and size() returns the amount in number.

The result is:

p.name numberOfFriends

p.name	numberOfFriends
`"Diana"`	`1`
`"Melissa"`	`1`
`"Dan"`	`1`
`"Sally"`	`2`
`"John"`	`2`
`"Jennifer"`	`5`
`"Joe"`	`2`
`"Mark"`	`2`
`"Ann"`	`1`
Rows: 9

"Diana"

1

"Melissa"

1

"Dan"

1

"Sally"

2

"John"

2

"Jennifer"

5

"Joe"

2

"Mark"

2

"Ann"

1

Rows: 9

Another point to note is that if you add a direction to the relationship:

MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name, size(collect(friend.name)) AS numberOfFriends;

You get a different result:

p.name numberOfFriends

p.name	numberOfFriends
`"John"`	`1`
`"Jennifer"`	`5`
`"Joe"`	`2`
`"Ann"`	`1`
Rows:4

"John"

1

"Jennifer"

5

"Joe"

2

"Ann"

1

Rows:4

This is because the IS_FRIENDS_WITH relationship is directed.

For example, Jennifer is connected to Mark through an outgoing IS_FRIENDS_WITH relationship from Jennifer to Mark (i.e. (jennifer)-[:IS_FRIENDS_WITH]→(mark)), not (mark)-[:IS_FRIENDS_WITH]→(jennifer).

In practice, the fact that Jennifer and Mark are friends is what matters, so the relationship’s direction is irrelevant. However, when adding this information to the graph, the relationship must have a direction. On the other hand, querying for it doesn’t require direction.

Creating a new pattern with the opposite direction (i.e. (mark)-[:IS_FRIENDS_WITH]→(jennifer)) wouldn’t be incorrect from a syntax perspective, but it’s duplicate information when the direction is irrelevant to answer the question "who is friends with whom?". This is why a more accurate result is achieved by using a query with an undirected relationship.

Keep learning

Cypher has many other resources for refining query results. If you want to keep learning, see the following pages:

Composed queries → See how to use UNION, WHEN, and NEXT to combine queries, create linear compositions and different query branches.
SKIP → Defines from which row to start including the rows in the output.
Expressions → Operators and expressions that can be used to evaluate values.
Functions → Full list of existing Cypher functions.
Indexes → Strategies for speeding up data retrieval.

Refining results

Example dataset

Filter

Property

Strings and partial values

STARTS WITH

CONTAINS

ENDS_WITH

IN

Patterns

OPTIONAL patterns

More complex patterns

Compare values

Equality

Numerical

Boolean

Range of values

Aliasing results

Avoiding duplication

Order

Limit

Aggregate

Unwind

Number of items

Keep learning

`STARTS WITH`

`CONTAINS`

`ENDS_WITH`

`IN`

`OPTIONAL` patterns