MATCH
The MATCH clause is used to search for the pattern described in it.
Introduction
The MATCH clause allows you to specify the patterns Neo4j will search for in the database.
This is the primary way of getting data into the current set of bindings.
It is worth reading up more on the specification of the patterns themselves in Patterns.
MATCH is often coupled to a WHERE part which adds restrictions, or predicates, to the MATCH patterns, making them more specific.
The predicates are part of the pattern description, and should not be considered a filter applied only after the matching is done.
This means that WHERE should always be put together with the MATCH clause it belongs to.
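As a minimal sketch of a predicate attached to the MATCH clause it restricts, the query below reuses the Person and Movie labels, the ACTED_IN relationship type, and the title and name properties that appear in the examples on this page:

```cypher
// The WHERE part belongs to the MATCH directly above it and
// restricts which (actor, movie) pairs the pattern can bind.
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
WHERE movie.title = 'Wall Street'
RETURN actor.name
```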
MATCH can occur at the beginning of the query or later, possibly after a WITH.
If it is the first clause, nothing will have been bound yet, and Neo4j will design a search to find the results matching the clause and any associated predicates specified in any WHERE part.
This could involve a scan of the database, a search for nodes having a certain label, or a search of an index to find starting points for the pattern matching.
Nodes and relationships found by this search are available as bound pattern elements, and can be used for pattern matching of paths.
They can also be used in any further MATCH clauses, where Neo4j will use the known elements, and from there find further unknown elements.
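The following sketch illustrates a second MATCH that starts from an element bound earlier in the query. It assumes the example graph contains a DIRECTED relationship from 'Oliver Stone' to at least one movie:

```cypher
// First MATCH: nothing is bound yet, so Neo4j has to find the start nodes.
MATCH (oliver:Person {name: 'Oliver Stone'})-[:DIRECTED]->(movie:Movie)
// WITH passes the bound movie on to the next part of the query.
WITH movie
// Second MATCH: movie is already bound, so matching continues from it.
MATCH (movie)<-[:ACTED_IN]-(actor:Person)
RETURN movie.title, actor.name
```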
Cypher® is declarative, and so usually the query itself does not specify the algorithm to use to perform the search.
Neo4j will automatically work out the best approach to finding start nodes and matching patterns.
Predicates in WHERE parts can be evaluated before pattern matching, during pattern matching, or after finding matches.
However, there are cases where you can influence the decisions taken by the query compiler.
Read more about indexes in Indexes for search performance, and more about specifying hints to force Neo4j to solve a query in a specific way in Planner hints and the USING keyword.
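As a hedged illustration of a planner hint, the sketch below asks for a label scan on Person by using USING SCAN. Whether a hint helps depends on the data and on the indexes available, so treat this as an example of the syntax only, not as a recommendation:

```cypher
// USING SCAN requests a label scan for the variable p,
// instead of leaving the choice of starting point to the planner.
MATCH (p:Person)
USING SCAN p:Person
WHERE p.name = 'Oliver Stone'
RETURN p.name
```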
To understand more about the patterns used in the MATCH clause, read Patterns.
The following graph is used for the examples below:
Basic node finding
Get all nodes
By just specifying a pattern with a single node and no labels, all nodes in the graph will be returned.
MATCH (n)
RETURN n
Returns all the nodes in the database.
n |
---|
Rows: 7 |
Get all nodes with a label
To get all nodes with a given label, use a single node pattern that specifies that label.
MATCH (movie:Movie)
RETURN movie.title
Returns all the movies in the database.
movie.title |
---|
Rows: 2 |
Related nodes
The symbol -- means related to, without regard to type or direction of the relationship.
MATCH (director {name: 'Oliver Stone'})--(movie)
RETURN movie.title
Returns all the movies directed by 'Oliver Stone'.
movie.title |
---|
Rows: 1 |
Match with labels
To constrain a pattern with labels on nodes, add the labels to the nodes in the pattern, using the label syntax.
MATCH (:Person {name: 'Oliver Stone'})--(movie:Movie)
RETURN movie.title
Returns any nodes connected with the Person 'Oliver Stone' that are labeled Movie.
movie.title |
---|
Rows: 1 |
Relationship basics
Outgoing relationships
When the direction of a relationship is of interest, it is shown by using --> or <--, like this:
MATCH (:Person {name: 'Oliver Stone'})-->(movie)
RETURN movie.title
Returns any nodes connected with the Person 'Oliver Stone' by an outgoing relationship.
movie.title |
---|
Rows: 1 |
Directed relationships and variable
If a variable is required, either for filtering on properties of the relationship, or to return the relationship, this is how you introduce the variable.
MATCH (:Person {name: 'Oliver Stone'})-[r]->(movie)
RETURN type(r)
Returns the type of each outgoing relationship from 'Oliver Stone'.
type(r) |
---|
Rows: 1 |
Match on relationship type
When you know the relationship type you want to match on, you can specify it by using a colon together with the relationship type.
MATCH (wallstreet:Movie {title: 'Wall Street'})<-[:ACTED_IN]-(actor)
RETURN actor.name
Returns all actors that ACTED_IN 'Wall Street'.
actor.name |
---|
Rows: 3 |
Match on multiple relationship types
To match on one of multiple types, you can specify this by chaining them together with the pipe symbol |.
MATCH (wallstreet {title: 'Wall Street'})<-[:ACTED_IN|DIRECTED]-(person)
RETURN person.name
Returns nodes with an ACTED_IN or DIRECTED relationship to 'Wall Street'.
person.name |
---|
Rows: 4 |
Match on relationship type and use a variable
If you want both to introduce a variable to hold the relationship and to specify the relationship type, add them both, like this:
MATCH (wallstreet {title: 'Wall Street'})<-[r:ACTED_IN]-(actor)
RETURN r.role
Returns ACTED_IN roles for 'Wall Street'.
r.role |
---|
Rows: 3 |
Relationships in depth
Inside a single pattern, relationships will only be matched once.
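A small sketch of what this rule means in practice, using the names from the example graph and assuming there is a single ACTED_IN relationship between an actor and each movie: because the two relationship patterns below cannot bind the same relationship, coActor should never be 'Charlie Sheen' himself.

```cypher
// The first and second ACTED_IN must be different relationships,
// so the pattern cannot walk back along the one it arrived on.
MATCH (charlie:Person {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie)<-[:ACTED_IN]-(coActor)
RETURN movie.title, coActor.name
```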
Relationship types with uncommon characters
Sometimes your database will have types with non-letter characters, or with spaces in them.
Use ` (backtick) to quote these.
To demonstrate this we can add an additional relationship between 'Charlie Sheen' and 'Rob Reiner':
MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(rob:Person {name: 'Rob Reiner'})
CREATE (rob)-[:`TYPE INCLUDING A SPACE`]->(charlie)
Which leads to the following graph:
MATCH (n {name: 'Rob Reiner'})-[r:`TYPE INCLUDING A SPACE`]->()
RETURN type(r)
Returns a relationship type with spaces in it.
type(r) |
---|
Rows: 1 |
Multiple relationships
Relationships can be expressed by using multiple statements in the form of ()--(), or they can be strung together, like this:
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie)<-[:DIRECTED]-(director)
RETURN movie.title, director.name
Returns the movie 'Charlie Sheen' acted in and its director.
movie.title | director.name |
---|---|
Rows: 1 |
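For comparison, here is a sketch of the same search written as two separate ()--() patterns that share the movie variable, instead of one strung-together pattern. It is intended to describe the same pattern as the query above:

```cypher
// Two comma-separated patterns joined through the shared variable movie.
MATCH
  (charlie {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie),
  (movie)<-[:DIRECTED]-(director)
RETURN movie.title, director.name
```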
Variable length relationships
Nodes that are a variable number of relationship->node hops away can be found using the following syntax: -[:TYPE*minHops..maxHops]->.
minHops and maxHops are optional and default to 1 and infinity respectively.
When no bounds are given the dots may be omitted.
The dots may also be omitted when setting only one bound and this implies a fixed length pattern.
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN*1..3]-(movie:Movie)
RETURN movie.title
Returns all movies related to 'Charlie Sheen' by 1 to 3 hops.
movie.title |
---|
Rows: 3 |
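The ways of writing the bounds described above can be summarized in a short sketch. The comments list the forms, and the query uses one of them with an open upper bound; the variable name other is only illustrative:

```cypher
// Bound forms for a variable length relationship:
//   -[:ACTED_IN*2]-     exactly 2 hops (one bound, no dots)
//   -[:ACTED_IN*2..]-   2 or more hops
//   -[:ACTED_IN*..3]-   1 to 3 hops (minHops defaults to 1)
//   -[:ACTED_IN*]-      1 or more hops (no bounds, dots omitted)
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN*2..]-(other)
RETURN DISTINCT other
```

Unbounded patterns can become expensive on large graphs, so an explicit upper bound is usually the safer choice.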
Variable length relationships with multiple relationship types
Variable length relationships can be combined with multiple relationship types. In this case the *minHops..maxHops applies to all relationship types as well as any combination of them.
MATCH (charlie {name: 'Charlie Sheen'})-[:ACTED_IN|DIRECTED*2]-(person:Person)
RETURN person.name
Returns all people related to 'Charlie Sheen' by 2 hops with any combination of the relationship types ACTED_IN and DIRECTED.
person.name |
---|
Rows: 3 |
Relationship variable in variable length relationships
When the connection between two nodes is of variable length, the list of relationships comprising the connection can be returned using the following syntax:
MATCH p = (actor {name: 'Charlie Sheen'})-[:ACTED_IN*2]-(co_actor)
RETURN relationships(p)
Returns a list of relationships.
relationships(p) |
---|
Rows: 2 |
Match with properties on a variable length path
A variable length relationship with properties defined on it means that all relationships in the path must have the property set to the given value.
In this query, there are two paths between 'Charlie Sheen' and his father 'Martin Sheen'.
One of them includes a 'blocked' relationship and the other does not.
In this case we first alter the original graph by using the following query to add two new paths, one through an UNBLOCKED node and one through a BLOCKED node:
MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(martin:Person {name: 'Martin Sheen'})
CREATE (charlie)-[:X {blocked: false}]->(:UNBLOCKED)<-[:X {blocked: false}]-(martin)
CREATE (charlie)-[:X {blocked: true}]->(:BLOCKED)<-[:X {blocked: false}]-(martin)
This means that we are starting out with the following graph:
MATCH p = (charlie:Person)-[* {blocked:false}]-(martin:Person)
WHERE charlie.name = 'Charlie Sheen' AND martin.name = 'Martin Sheen'
RETURN p
Returns the paths between 'Charlie Sheen' and 'Martin Sheen' where all relationships have the blocked property set to false.
p |
---|
Rows: 1 |
Zero length paths
Using variable length paths that have the lower bound zero means that two variables can point to the same node. If the path length between two nodes is zero, they are by definition the same node. Note that when matching zero length paths the result may contain a match even when matching on a relationship type not in use.
MATCH (wallstreet:Movie {title: 'Wall Street'})-[*0..1]-(x)
RETURN x
Returns the movie itself as well as actors and directors one relationship away.
x |
---|
Rows: 5 |
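To illustrate the note about relationship types that are not in use, the sketch below uses a hypothetical type, NO_SUCH_TYPE, that is assumed not to exist in the graph. According to the description above, the zero length match can still bind x to the movie node itself:

```cypher
// NO_SUCH_TYPE is a made-up relationship type used only for illustration.
// With a lower bound of zero, the start node alone can satisfy the pattern.
MATCH (wallstreet:Movie {title: 'Wall Street'})-[:NO_SUCH_TYPE*0..1]-(x)
RETURN x
```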
Named paths
If you want to return or filter on a path in your pattern graph, you can introduce a named path.
MATCH p = (michael {name: 'Michael Douglas'})-->()
RETURN p
Returns the two paths starting from 'Michael Douglas'.
p |
---|
Rows: 2 |
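A named path can also be used for filtering. The following sketch, with an illustrative variable name other, keeps only paths of exactly two relationships and returns the nodes along each of them, using the length() and nodes() functions:

```cypher
// p names the whole path, so predicates and functions can refer to it.
MATCH p = (michael {name: 'Michael Douglas'})-[*1..2]-(other)
WHERE length(p) = 2
RETURN nodes(p)
```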
Matching on a bound relationship
When your pattern contains a bound relationship, and that relationship pattern does not specify direction, Cypher will try to match the relationship in both directions.
MATCH (a)-[r]-(b)
WHERE id(r) = 0
RETURN a, b
This returns the two connected nodes, once as the start node, and once as the end node.
a | b |
---|---|
Rows: 2 |
Shortest path
Single shortest path
Finding a single shortest path between two nodes is as easy as using the shortestPath function. It is done like this:
MATCH
(martin:Person {name: 'Martin Sheen'}),
(oliver:Person {name: 'Oliver Stone'}),
p = shortestPath((martin)-[*..15]-(oliver))
RETURN p
This means: find a single shortest path between two nodes, as long as the path is at most 15 relationships long.
Within the parentheses you define a single link of a path — the starting node, the connecting relationship and the end node.
Characteristics describing the relationship like relationship type, max hops and direction are all used when finding the shortest path.
If there is a WHERE clause following the match of a shortestPath, relevant predicates will be included in the shortestPath.
If the predicate is a none() or all() on the relationship elements of the path, it will be used during the search to improve performance (see Shortest path planning).
p |
---|
Rows: 1 |
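Since relationship type, maximum hops, and direction are all taken into account, the search can be narrowed inside the shortestPath pattern itself. The sketch below restricts the path to ACTED_IN and DIRECTED relationships and returns only its length; the choice of types is an assumption made for illustration:

```cypher
// Only ACTED_IN and DIRECTED relationships may appear on the path,
// and the path may be at most 15 relationships long.
MATCH
  (martin:Person {name: 'Martin Sheen'}),
  (oliver:Person {name: 'Oliver Stone'}),
  p = shortestPath((martin)-[:ACTED_IN|DIRECTED*..15]-(oliver))
RETURN length(p)
```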
Single shortest path with predicates
Predicates used in the WHERE clause that apply to the shortest path pattern are evaluated before deciding what the shortest matching path is.
MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(martin:Person {name: 'Martin Sheen'}),
p = shortestPath((charlie)-[*]-(martin))
WHERE none(r IN relationships(p) WHERE type(r) = 'FATHER')
RETURN p
This query will find the shortest path between 'Charlie Sheen' and 'Martin Sheen', and the WHERE predicate will ensure that we do not consider the father/son relationship between the two.
p |
---|
Rows: 1 |
All shortest paths
Finds all the shortest paths between two nodes.
MATCH
(martin:Person {name: 'Martin Sheen'}),
(michael:Person {name: 'Michael Douglas'}),
p = allShortestPaths((martin)-[*]-(michael))
RETURN p
Finds the two shortest paths between 'Martin Sheen' and 'Michael Douglas'.
p |
---|
Rows: 2 |
Get node or relationship by ID
Node by ID
Searching for nodes by ID can be done with the id() function in a predicate.
Neo4j reuses its internal IDs when nodes and relationships are deleted. Applications that rely on internal Neo4j IDs are therefore brittle and at risk of making mistakes. It is recommended to use application-generated IDs instead.
MATCH (n)
WHERE id(n) = 0
RETURN n
The corresponding node is returned.
n |
---|
Rows: 1 |
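As a sketch of the recommended alternative, the query below looks a node up by an application-generated identifier stored as a property. The property name personId and its value are hypothetical and do not exist in the example graph:

```cypher
// personId is an assumed, application-managed property, not a Neo4j internal ID.
MATCH (n:Person {personId: 'person-0001'})
RETURN n
```

An identifier of this kind stays stable when other nodes are deleted, and it can be backed by an index or a uniqueness constraint.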
Relationship by ID
Searching for relationships by ID can be done with the id() function in a predicate.
This is not the recommended practice. See Node by ID for more information on the use of Neo4j IDs.
MATCH ()-[r]->()
WHERE id(r) = 0
RETURN r
The relationship with ID 0 is returned.
r |
---|
Rows: 1 |
Multiple nodes by ID
Multiple nodes are selected by specifying them in an IN clause.
MATCH (n)
WHERE id(n) IN [0, 3, 5]
RETURN n
This returns the nodes listed in the IN expression.
n |
---|
Rows: 3 |