MATCH

Introduction

The MATCH clause allows you to specify the patterns Neo4j will search for in the database. This is the primary way of getting data into the current set of bindings. It is worth reading up more on the specification of the patterns themselves in Patterns.

MATCH is often coupled to a WHERE part which adds restrictions, or predicates, to the MATCH patterns, making them more specific. The predicates are part of the pattern description, and should not be considered a filter applied only after the matching is done. This means that WHERE should always be put together with the MATCH clause it belongs to.
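As a small illustration of keeping WHERE together with its MATCH, the sketch below uses the Person and Movie labels, the ACTED_IN relationship type, and the title property that appear in the examples on this page. The variable names are arbitrary.

```cypher
// The WHERE part directly follows, and belongs to, the MATCH above it.
// Its predicate narrows the pattern rather than post-filtering the result.
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
WHERE movie.title = 'Wall Street'
RETURN actor.name
```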

MATCH can occur at the beginning of the query or later, possibly after a WITH. If it is the first clause, nothing will have been bound yet, and Neo4j will design a search to find the results matching the clause and any associated predicates specified in any WHERE part. This could involve a scan of the database, a search for nodes having a certain label, or a search of an index to find starting points for the pattern matching. Nodes and relationships found by this search are available as bound pattern elements, and can be used for pattern matching of paths. They can also be used in any further MATCH clauses, where Neo4j will use the known elements, and from there find further unknown elements.
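A later MATCH can start from elements that are already bound. The following is a minimal sketch of that, assuming the Person and Movie labels and the DIRECTED relationship type used in the examples on this page.

```cypher
// First MATCH: nothing is bound yet, so Neo4j searches for the start node.
MATCH (director:Person {name: 'Oliver Stone'})
// WITH carries the bound node forward.
WITH director
// Second MATCH: `director` is already known, so only `movie` is searched for.
MATCH (director)-[:DIRECTED]->(movie:Movie)
RETURN movie.title
```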

Cypher® is declarative, and so usually the query itself does not specify the algorithm to use to perform the search. Neo4j will automatically work out the best approach to finding start nodes and matching patterns. Predicates in WHERE parts can be evaluated before pattern matching, during pattern matching, or after finding matches. However, there are cases where you can influence the decisions taken by the query compiler. Read more about indexes in Indexes for search performance, and more about specifying hints to force Neo4j to solve a query in a specific way in Planner hints and the USING keyword.
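One way to see which approach the planner chose, and to nudge it, might look like the sketch below. It prefixes a query with EXPLAIN and adds a USING SCAN hint. The choice of hint is only for illustration and is not a recommendation for this query.

```cypher
// EXPLAIN shows the execution plan without running the query.
// USING SCAN is a planner hint that requests a label scan for `p`.
EXPLAIN
MATCH (p:Person {name: 'Oliver Stone'})
USING SCAN p:Person
RETURN p
```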

To understand more about the patterns used in the MATCH clause, read Patterns.

The following graph is used for the examples below:

[Figure: graph match clause]

Basic node finding

Get all nodes

By just specifying a pattern with a single node and no labels, all nodes in the graph will be returned.

Query
MATCH (n)
RETURN n

Returns all the nodes in the database.

Table 1. Result
n

Node[0]{name:"Charlie Sheen"}

Node[1]{name:"Martin Sheen"}

Node[2]{name:"Michael Douglas"}

Node[3]{name:"Oliver Stone"}

Node[4]{name:"Rob Reiner"}

Node[5]{title:"Wall Street"}

Node[6]{title:"The American President"}

Rows: 7

Get all nodes with a label

To get all nodes with a given label, use a single node pattern that specifies the label.

Query
MATCH (movie:Movie)
RETURN movie.title

Returns all the movies in the database.

Table 2. Result
movie.title

"Wall Street"

"The American President"

Rows: 2

Related nodes

The symbol -- means related to, without regard to type or direction of the relationship.

Query
MATCH (director {name: 'Oliver Stone'})--(movie)
RETURN movie.title

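Returns the title of every node related to 'Oliver Stone', regardless of relationship type or direction. In this graph, that is the movie he directed.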

Table 3. Result
movie.title

"Wall Street"

Rows: 1

Match with labels

To constrain your pattern with labels on nodes, add the labels to the nodes in the pattern, using the label syntax.

Query
MATCH (:Person {name: 'Oliver Stone'})--(movie:Movie)
RETURN movie.title

Returns any nodes connected with the Person 'Oliver Stone' that are labeled Movie.

Table 4. Result
movie.title

"Wall Street"

Rows: 1

Relationship basics

Outgoing relationships

When the direction of a relationship is of interest, it is shown by using --> or <--, like this:

Query
MATCH (:Person {name: 'Oliver Stone'})-->(movie)
RETURN movie.title

Returns any nodes connected with the Person 'Oliver Stone' by an outgoing relationship.

Table 5. Result
movie.title

"Wall Street"

Rows: 1

Directed relationships and variable

If a variable is required, either for filtering on properties of the relationship or for returning the relationship, introduce it like this:

Query
MATCH (:Person {name: 'Oliver Stone'})-[r]->(movie)
RETURN type(r)

Returns the type of each outgoing relationship from 'Oliver Stone'.

Table 6. Result
type(r)

"DIRECTED"

Rows: 1
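The other use mentioned above, filtering on a property of the relationship, could be sketched as follows. It assumes the role property that the ACTED_IN relationships carry in the example graph.

```cypher
// `r` is bound to each outgoing relationship so that WHERE can inspect it.
MATCH (actor:Person)-[r]->(movie)
WHERE r.role = 'Gordon Gekko'
RETURN actor.name, movie.title
```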

Match on relationship type

When you know the relationship type you want to match on, you can specify it by using a colon together with the relationship type.

Query
MATCH (wallstreet:Movie {title: 'Wall Street'})<-[:ACTED_IN]-(actor)
RETURN actor.name

Returns all actors that ACTED_IN 'Wall Street'.

Table 7. Result
actor.name

"Michael Douglas"

"Martin Sheen"

"Charlie Sheen"

Rows: 3

Match on multiple relationship types

To match on one of multiple types, you can specify this by chaining them together with the pipe symbol |.

Query
MATCH (wallstreet {title: 'Wall Street'})<-[:ACTED_IN|DIRECTED]-(person)
RETURN person.name

Returns nodes with an ACTED_IN or DIRECTED relationship to 'Wall Street'.

Table 8. Result
person.name

"Oliver Stone"

"Michael Douglas"

"Martin Sheen"

"Charlie Sheen"

Rows: 4

Match on relationship type and use a variable

If you want both to introduce a variable to hold the relationship and to specify the relationship type, just add them both, like this:

Query
MATCH (wallstreet {title: 'Wall Street'})<-[r:ACTED_IN]-(actor)
RETURN r.role

Returns ACTED_IN roles for 'Wall Street'.

Table 9. Result
r.role

"Gordon Gekko"

"Carl Fox"

"Bud Fox"

Rows: 3

Relationships in depth

Inside a single pattern, relationships will only be matched once.
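A sketch of what this rule implies, using the example graph: because r1 and r2 cannot be bound to the same relationship, the pattern below does not lead back to 'Charlie Sheen' through the relationship it started with. The variable names are arbitrary.

```cypher
// r1 and r2 must be two different relationships, so `coActor`
// is reached through a different ACTED_IN relationship than r1.
MATCH (charlie:Person {name: 'Charlie Sheen'})-[r1:ACTED_IN]->(movie)<-[r2:ACTED_IN]-(coActor:Person)
RETURN coActor.name
```

In the example graph, 'Charlie Sheen' has a single ACTED_IN relationship to 'Wall Street', so he does not appear among his own co-actors.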

Relationship types with uncommon characters

Sometimes your database will have types with non-letter characters, or with spaces in them. Use ` (backtick) to quote these. To demonstrate this we can add an additional relationship between 'Charlie Sheen' and 'Rob Reiner':

Query
MATCH
  (charlie:Person {name: 'Charlie Sheen'}),
  (rob:Person {name: 'Rob Reiner'})
CREATE (rob)-[:`TYPE INCLUDING A SPACE`]->(charlie)

Which leads to the following graph:

[Figure: graph match clause backtick]

Query
MATCH (n {name: 'Rob Reiner'})-[r:`TYPE INCLUDING A SPACE`]->()
RETURN type(r)

Returns a relationship type with spaces in it.

Table 10. Result
type(r)

"TYPE INCLUDING A SPACE"

Rows: 1

Multiple relationships

Relationships can be expressed by using multiple statements in the form of ()--(), or they can be strung together, like this:

Query
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie)<-[:DIRECTED]-(director)
RETURN movie.title, director.name

Returns the movie 'Charlie Sheen' acted in and its director.

Table 11. Result
movie.title | director.name

"Wall Street" | "Oliver Stone"

Rows: 1
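The other form mentioned above, several shorter patterns in one MATCH, might look like the following sketch. The two comma-separated patterns share the movie variable and together describe the same shape as the chained pattern.

```cypher
// Two patterns joined on the shared variable `movie`.
MATCH
  (charlie {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie),
  (movie)<-[:DIRECTED]-(director)
RETURN movie.title, director.name
```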

Variable length relationships

Nodes that are a variable number of relationship->node hops away can be found using the following syntax: -[:TYPE*minHops..maxHops]->. minHops and maxHops are optional and default to 1 and infinity respectively. When no bounds are given, the dots may be omitted. The dots may also be omitted when only one bound is set, which implies a fixed length pattern.

Query
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN*1..3]-(movie:Movie)
RETURN movie.title

Returns all movies related to 'Charlie Sheen' by 1 to 3 hops.

Table 12. Result
movie.title

"Wall Street"

"The American President"

"The American President"

Rows: 3
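The bound variants described above can be summarized in a short sketch. The comments list the forms, and the query itself uses the fixed length form on the example graph. The variable name coActor is arbitrary.

```cypher
// Bound variants for a variable length relationship:
//   [:ACTED_IN*2]     exactly 2 hops (one bound, no dots)
//   [:ACTED_IN*2..]   2 or more hops
//   [:ACTED_IN*..3]   1 to 3 hops
//   [:ACTED_IN*]      1 or more hops (no bounds, no dots)
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN*2]-(coActor)
RETURN coActor.name
```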

Variable length relationships with multiple relationship types

Variable length relationships can be combined with multiple relationship types. In this case the *minHops..maxHops applies to all relationship types as well as any combination of them.

Query
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN|DIRECTED*2]-(person:Person)
RETURN person.name

Returns all people related to 'Charlie Sheen' by 2 hops with any combination of the relationship types ACTED_IN and DIRECTED.

Table 13. Result
person.name

"Oliver Stone"

"Michael Douglas"

"Martin Sheen"

Rows: 3

Relationship variable in variable length relationships

When the connection between two nodes is of variable length, the list of relationships comprising the connection can be returned using the following syntax:

Query
MATCH p = (actor {name: 'Charlie Sheen'})-[:ACTED_IN*2]-(co_actor)
RETURN relationships(p)

Returns a list of relationships.

Table 14. Result
relationships(p)

[:ACTED_IN[0]{role:"Bud Fox"},:ACTED_IN[2]{role:"Gordon Gekko"}]

[:ACTED_IN[0]{role:"Bud Fox"},:ACTED_IN[1]{role:"Carl Fox"}]

Rows: 2

Match with properties on a variable length path

A variable length relationship with properties defined on it means that all relationships in the path must have the property set to the given value. For this example, we first alter the original graph with the following query, which adds two nodes labeled UNBLOCKED and BLOCKED and connects each of them to both 'Charlie Sheen' and his father 'Martin Sheen' with X relationships carrying a blocked property. This creates two new paths between the two people: one includes a relationship with blocked set to true and the other does not.

Query
MATCH
  (charlie:Person {name: 'Charlie Sheen'}),
  (martin:Person {name: 'Martin Sheen'})
CREATE (charlie)-[:X {blocked: false}]->(:UNBLOCKED)<-[:X {blocked: false}]-(martin)
CREATE (charlie)-[:X {blocked: true}]->(:BLOCKED)<-[:X {blocked: false}]-(martin)

This means that we are starting out with the following graph:

[Figure: graph match clause variable length]

Query
MATCH p = (charlie:Person)-[* {blocked:false}]-(martin:Person)
WHERE charlie.name = 'Charlie Sheen' AND martin.name = 'Martin Sheen'
RETURN p

Returns the paths between 'Charlie Sheen' and 'Martin Sheen' where all relationships have the blocked property set to false.

Table 15. Result
p

(0)-[X,7]->(7)<-[X,8]-(1)

Rows: 1

Zero length paths

Using variable length paths that have the lower bound zero means that two variables can point to the same node. If the path length between two nodes is zero, they are by definition the same node. Note that when matching zero length paths the result may contain a match even when matching on a relationship type not in use.

Query
MATCH (wallstreet:Movie {title: 'Wall Street'})-[*0..1]-(x)
RETURN x

Returns the movie itself as well as actors and directors one relationship away.

Table 16. Result
x

Node[5]{title:"Wall Street"}

Node[3]{name:"Oliver Stone"}

Node[2]{name:"Michael Douglas"}

Node[1]{name:"Martin Sheen"}

Node[0]{name:"Charlie Sheen"}

Rows: 5
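The note about relationship types not in use can be illustrated with a sketch like the one below. NO_SUCH_TYPE is a hypothetical type name chosen for illustration and is assumed not to exist in the graph.

```cypher
// No relationship of type NO_SUCH_TYPE exists, so no one-hop path matches.
// The zero length path still matches, binding `x` to the movie node itself.
MATCH (wallstreet:Movie {title: 'Wall Street'})-[:NO_SUCH_TYPE*0..1]-(x)
RETURN x
```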

Named paths

If you want to return or filter on a path in your pattern graph, you can introduce a named path.

Query
MATCH p = (michael {name: 'Michael Douglas'})-->()
RETURN p

Returns the two paths starting from 'Michael Douglas'.

Table 17. Result
p

(2)-[ACTED_IN,2]->(5)

(2)-[ACTED_IN,5]->(6)

Rows: 2
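Filtering on a named path, the other use mentioned above, could be sketched as follows. The predicate on the role property is an illustrative assumption based on the ACTED_IN relationships in the example graph.

```cypher
// Keep only paths that contain a relationship with the given role,
// then return the path length and the nodes along the path.
MATCH p = (michael {name: 'Michael Douglas'})-->()
WHERE any(r IN relationships(p) WHERE r.role = 'Gordon Gekko')
RETURN length(p), nodes(p)
```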

Matching on a bound relationship

When your pattern contains a bound relationship, and that relationship pattern does not specify direction, Cypher will try to match the relationship in both directions.

Query
MATCH (a)-[r]-(b)
WHERE id(r) = 0
RETURN a, b

This returns the two connected nodes, once as the start node, and once as the end node.

Table 18. Result
a | b

Node[0]{name:"Charlie Sheen"} | Node[5]{title:"Wall Street"}

Node[5]{title:"Wall Street"} | Node[0]{name:"Charlie Sheen"}

Rows: 2

Shortest path

Single shortest path

Finding a single shortest path between two nodes is as easy as using the shortestPath function. It is done like this:

Query
MATCH
  (martin:Person {name: 'Martin Sheen'}),
  (oliver:Person {name: 'Oliver Stone'}),
  p = shortestPath((martin)-[*..15]-(oliver))
RETURN p

This means: find a single shortest path between two nodes, as long as the path is at most 15 relationships long. Within the parentheses you define a single link of a path: the starting node, the connecting relationship, and the end node. Characteristics describing the relationship, such as relationship type, max hops, and direction, are all used when finding the shortest path. If there is a WHERE clause following the match of a shortestPath, relevant predicates will be included in the shortestPath. If the predicate is a none() or all() on the relationship elements of the path, it will be used during the search to improve performance (see Shortest path planning).

Table 19. Result
p

(1)-[ACTED_IN,1]->(5)<-[DIRECTED,3]-(3)

Rows: 1
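As a sketch of how relationship type and max hops constrain the search, the query below restricts the shortest path to ACTED_IN relationships and at most 4 hops. The bound of 4 is an arbitrary value chosen for illustration.

```cypher
// Only ACTED_IN relationships are considered, in either direction,
// and paths longer than 4 relationships are ignored.
MATCH
  (martin:Person {name: 'Martin Sheen'}),
  (michael:Person {name: 'Michael Douglas'}),
  p = shortestPath((martin)-[:ACTED_IN*..4]-(michael))
RETURN length(p)
```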

Single shortest path with predicates

Predicates used in the WHERE clause that apply to the shortest path pattern are evaluated before deciding what the shortest matching path is.

Query
MATCH
  (charlie:Person {name: 'Charlie Sheen'}),
  (martin:Person {name: 'Martin Sheen'}),
  p = shortestPath((charlie)-[*]-(martin))
WHERE none(r IN relationships(p) WHERE type(r) = 'FATHER')
RETURN p

This query will find the shortest path between 'Charlie Sheen' and 'Martin Sheen', and the WHERE predicate will ensure that we do not consider the father/son relationship between the two.

Table 20. Result
p

(0)-[ACTED_IN,0]->(5)<-[ACTED_IN,1]-(1)

Rows: 1

All shortest paths

Finds all the shortest paths between two nodes.

Query
MATCH
  (martin:Person {name: 'Martin Sheen'}),
  (michael:Person {name: 'Michael Douglas'}),
  p = allShortestPaths((martin)-[*]-(michael))
RETURN p

Finds the two shortest paths between 'Martin Sheen' and 'Michael Douglas'.

Table 21. Result
p

(1)-[ACTED_IN,1]->(5)<-[ACTED_IN,2]-(2)

(1)-[ACTED_IN,4]->(6)<-[ACTED_IN,5]-(2)

Rows: 2

Get node or relationship by ID

Node by ID

Searching for nodes by ID can be done with the id() function in a predicate.

Neo4j reuses its internal IDs when nodes and relationships are deleted. Applications that rely on internal Neo4j IDs are therefore brittle and at risk of making mistakes. It is recommended to use application-generated IDs instead.

Query
MATCH (n)
WHERE id(n) = 0
RETURN n

The corresponding node is returned.

Table 22. Result
n

Node[0]{name:"Charlie Sheen"}

Rows: 1
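A lookup by an application-generated ID, as recommended above, might look like the following sketch. The uuid property and the $uuid parameter are hypothetical; neither is part of the example graph.

```cypher
// `uuid` is a hypothetical application-generated property stored on the node.
// `$uuid` is a query parameter supplied by the application.
MATCH (n:Person {uuid: $uuid})
RETURN n
```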

Relationship by ID

Searching for relationships by ID can be done with the id() function in a predicate.

This is not the recommended practice. See Node by ID for more information on the use of Neo4j IDs.

Query
MATCH ()-[r]->()
WHERE id(r) = 0
RETURN r

The relationship with ID 0 is returned.

Table 23. Result
r

:ACTED_IN[0]{role:"Bud Fox"}

Rows: 1

Multiple nodes by ID

Multiple nodes are selected by listing their IDs in an IN-expression.

Query
MATCH (n)
WHERE id(n) IN [0, 3, 5]
RETURN n

This returns the nodes listed in the IN-expression.

Table 24. Result
n

Node[0]{name:"Charlie Sheen"}

Node[3]{name:"Oliver Stone"}

Node[5]{title:"Wall Street"}

Rows: 3