Syntax and semantics
This section contains reference material for looking up the syntax and semantics of specific elements of graph pattern matching.
Node patterns
A node pattern is a pattern that matches a single node.
It can be used on its own in a clause such as MATCH
or EXIST
, or form part of a path pattern.
See also node pattern concepts.
Syntax
nodePattern ::= "(" [ nodeVariable ] [ labelExpression ]
[ propertyKeyValueExpression ] [ elementPatternWhereClause] ")"
elementPatternWhereClause ::= "WHERE" booleanExpression
For rules on valid node variable names, see the Cypher® naming rules.
Rules
Predicates
Three types of predicate can be specified inside a node pattern:
The boolean expression of the WHERE
clause can reference any variables within scope of the node pattern.
A node variable needs to be declared in the node pattern in order to reference it in the boolean expression.
If no predicates are specified, then the node pattern matches any node.
Variable binding
If a variable has not been declared elsewhere in the query, it will become bound to nodes when the matching of its containing path pattern is executed. If it has been bound in a previous clause, then no new nodes will be bound to the variable; any previously bound nodes that do not match in the current path pattern will lead to the match being eliminated from the results. See the section on clause composition for more details on the passing of results between clauses.
Examples
Matches all nodes with the label A
and binds them to the variable n
:
(n:A)
Matches all nodes with the label B
and a property departs
with the time
value 11:11
:
(:B { departs: time('11:11') })
Matches all nodes with the property departs
with a value equal to the current time plus 30 minutes:
(n WHERE n.departs > time() + duration('PT30M'))
Relationship patterns
A relationship pattern is a pattern that matches a single relationship. It can only be used with node patterns on either side of it.
A relationship pattern followed immediately by a quantifier is an abbreviated quantified path pattern called a quantified relationship.
See also relationship pattern concepts.
Syntax
relationshipPattern ::= fullPattern | abbreviatedRelationship
fullPattern ::=
"<-[" patternFiller "]-"
| "-[" patternFiller "]->"
| "-[" patternFiller "]-"
abbreviatedRelationship ::= "<--" | "--" | "-->"
patternFiller ::= [ relationshipVariable ] [ typeExpression ]
[ propertyKeyValueExpression ] [ elementPatternWhereClause ]
elementPatternWhereClause ::= "WHERE" booleanExpression
Note that the syntax for type expressions in relationship patterns is the same as for label expressions in node patterns (although unlike node labels, relationships must have exactly one type).
For rules on valid relationship variable names, see the Cypher naming rules.
Rules
Predicates
The following three types of predicate can be specified in the pattern filler of a full relationship pattern (i.e. a pattern with the square brackets):
A fourth type of predicate specifies the directionality of the relationship with respect to the overall path pattern, using the less-than or greater-than symbols to form arrows (<
and >
).
If a relationship pattern has no arrows, it will match relationships of any direction.
The boolean expression of the WHERE
clause can reference any variables within scope of the relationship pattern.
A relationship variable needs to be declared in the pattern in order to reference it in the boolean expression.
If no predicates are specified then the pattern matches all relationships.
Variable binding
If the variable has not been declared elsewhere in the query, it will become bound to relationships when the matching of its containing path pattern is executed. If it has been bound in a previous clause, then no new relationships will be bound to the variable; if any previously bound relationships do not match in the current path pattern, then those matches will be eliminated from the results.
See the chapter on clause composition for more details on the passing of results between clauses.
Examples
Matches all relationships with the type R
and binds them to the variable r
:
()-[r:R]->()
Matches all relationships with type R
and property distance
equal to 100
:
()-[:R {distance: 100}]->()
Matches all relationships where property distance
is between 10
and 100
:
()-[r WHERE 10 < r.distance < 100]->()
Matches all relationships that connect nodes with label A
as their source and nodes with label B
as their target:
(:A)-->(:B)
Matches all relationships that connect nodes with label A
and nodes with label B
, irrespective of their direction:
(:A)--(:B)
Label expressions
The following applies to both the label expressions of node patterns and the type expressions of relationship patterns.
A label expression is a boolean predicate composed from label names and a wildcard symbol using disjunction, conjunction, negation and grouping. A label expression returns true when it matches the set of labels for a node.
Although relationships have a type rather than labels, the syntax for expressions matching a relationship type is identical to that of label expressions.
Syntax
labelExpression ::= ":" labelTerm
labelTerm ::=
labelIdentifier
| labelTerm "&" labelTerm
| labelTerm "|" labelTerm
| "!" labelTerm
| "%"
| "(" labelTerm ")"
For valid label identifiers, see the Cypher naming rules.
Rules
The following table lists the symbols used in label expressions:
Symbol | Description | Precedence |
---|---|---|
|
Wildcard.
Evaluates to |
|
|
Contained expression is evaluated before evaluating the outer expression the group is contained in. |
1 (highest) |
|
Negation |
2 |
|
Conjunction |
3 |
|
Disjunction |
4 (lowest) |
Associativity is left-to-right.
Examples
In the following table, a tick is shown where the label expression matches the node with the labels shown:
Node |
||||||||
Node pattern |
|
|
|
|
|
|
|
|
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
|
✅ |
✅ |
✅ |
✅ |
||||
|
✅ |
✅ |
||||||
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
||
|
✅ |
✅ |
✅ |
✅ |
||||
|
✅ |
✅ |
✅ |
✅ |
||||
|
||||||||
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
|
|
✅ |
|||||||
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
|
||||||||
|
✅ |
✅ |
✅ |
✅ |
||||
|
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
✅ |
|
|
✅ |
|||||||
|
✅ |
✅ |
✅ |
As relationships have exactly one type each, this expression will never match a relationship:
-[:A&B]->
Similarly, the following will always match a relationship:
-[:%]->
The use of negation can make the conjunction useful in relationship patterns.
The following matches relationships that have a type that is neither A
nor B
:
-[:!A&!B]->
Property key-value expressions
Syntax
propertyKeyValueExpression ::=
"{" propertyKeyValuePairList "}"
propertyKeyValuePairList ::=
propertyKeyValuePair [ "," propertyKeyValuePair ]
propertyKeyValuePair ::= propertyName ":" valueExpression
Rules
The property key-value expression is treated as a conjunction of equalities on the properties of the element that the containing pattern matches.
For example, the following node pattern:
({ p: valueExp1, q: valueExp2 })
is equivalent to the following node pattern with a WHERE
clause:
(n WHERE n.p = valueExp1 AND n.q = valueExp2)
The value expression can be any expression as listed in the section on expressions, except for path patterns (which will throw a syntax error) and regular expressions (which will be treated as string literals).
An empty property key-value expression matches all elements.
Property key-value expressions can be combined with a WHERE
clause.
Examples
Matches all nodes with property p
= 10
:
({ p: 10 })
Matches all relationships with property p
= 10
and q
equal to date 2023-02-10
:
()-[{ p: 10, q: date('2023-02-10') }]-()
Matches all relationships with its property p
equal to the property p
of its source node:
(s)-[{ p: s.p }]-()
Matches all nodes with property p
= 10
and property q
greater than 100
:
(n { p: 10 } WHERE n.q > 100)
Path patterns
A path pattern is the top level pattern that is matched against paths in a graph.
Syntax
pathVariableDeclaration ::= pathVariable "="
pathPatternExpression ::=
{ parenthesizedPathPatternExpression | pathPatternPhrase }
parenthesizedPathPatternExpression ::=
"("
[ subpathVariableDeclaration ]
pathPatternExpression
[ parenthesizedPathPatternWhereClause ]
")"
subpathVariableDeclaration ::= pathVariable "="
pathPatternPhrase ::= [{ simplePathPattern | quantifiedPathPattern }]+
simplePathPattern ::= nodePattern
[ { relationshipPattern | quantifiedRelationship } nodePattern ]*
parenthesizedPathPatternWhereClause ::= "WHERE" booleanExpression
See the related section for each of the syntax rules:
|
|
|
|
|
|
|
|
|
Rules
The minimum number of elements in the path pattern must be greater than zero. For example, a path pattern that is a quantified path pattern with a quantifier that has a lower bound of zero is not allowed:
((n)-[r]->(m)){0,10}
A path pattern must always begin and end with a node pattern.
(n)-[r]->(m)-[s]-
A path pattern may be composed of a concatenation of simple and quantified path patterns. Two simple path patterns, however, may not be placed next to each other.
(a)<-[s]-(b) (c)-[t]->(d)
When a path pattern is matched to paths in a graph, nodes can be revisited but relationships cannot.
A subpath variable (a path variable declared inside a parenthesized path pattern expression) can only be used if a shortest path mode is also specified.
MATCH SHORTEST 1 (p = (a)-[r]->+(b) WHERE length(p) > 3)
MATCH (:A)-[:S]->(:B) (p = ((a)-[r]->(b))+)
See graph patterns for rules on declaring variables multiple times.
Examples
A single node pattern is allowed as it has at least one element:
(n)
A simple path pattern with more than one element:
(a:A)<-[{p: 30}]-(b)-[t WHERE t.q > 0]->(c:C)
A quantified path pattern can have a lower bound of zero in its quantifier as long as it abuts other patterns that have at least one element:
(:A) ((:X)-[:R]-()){0,10} (:B)
A quantified relationship can also have a lower bound of zero as long as the overall path pattern has at least one element:
(:A)-[:R]->{0,10}(:B)
A concatenation of simple and quantified path patterns:
(a)<-[s]-(b)-[t]->(c) ((n)-[r]->(m)){0,10} (:X)
Referencing non-local node variable in a simple path pattern:
(a)<-[s:X WHERE a.p = s.p]-(b)
Referencing a non-local relationship variable within a quantified path pattern:
(:A) ((a)<-[s:X WHERE a.p = s.p]-(b)){,5}
A variable that was introduced in a previous clause can be referenced as long as that variable was defined outside of a quantified path pattern:
MATCH (n)
MATCH ()-[r WHERE r.q = n.q]-() (()<-[s:X WHERE n.p = s.p]-()){2,3}
Paths matched by the path pattern can be assigned to a variable:
MATCH p = ()-[r WHERE r.q = n.q]-()
Quantified path patterns
A quantified path pattern represents a path pattern repeated a number of times in a given range. It is composed of a path pattern, representing the path section to be repeated, followed by a quantifier, constraining the number of repetitions between a lower bound and an upper bound.
For information about an alternative version of patterns for matching paths of variable length, see variable-length relationships.
Syntax
quantifiedPathPattern ::=
parenthesizedPathPatternExpression quantifier
fixedPath ::= nodePattern [ relationshipPattern nodePattern ]+
Rules
Minimum pattern length
The path pattern being quantified must have a length greater than zero. In other words, it must contain at least one relationship. A single node pattern cannot be quantified.
((x:A)){2,4}
Nesting of quantified path patterns
The nesting of quantified path patterns is not allowed. For example, the following nesting of a quantified relationship in a quantified path pattern is not allowed:
(:A) (()-[:R]->+()){2,3} (:B)
A quantified path pattern that is part of the boolean expression within a quantified path pattern would not count as nested and is permitted.
MATCH ((n:A)-[:R]->({p: 30}) WHERE EXISTS { (n)-->+(:X) }){2,3}
Group variables
Variables introduced inside of a quantified path pattern are said to be exposed as group variables outside of the definition of the pattern. As a group variable, they will be bound to either a list of nodes or a list of relationships. By contrast, variables can be treated as singletons inside the quantified path pattern where they are declared. The difference can be seen in the following query:
MATCH ((x)-[r]->(z WHERE z.p > x.p)){2,3}
RETURN [n in x | n.p] AS x_p
In the boolean expression z.p > x.p
both z
and x
are singletons; in the RETURN
clause, x
is a group variable that can be iterated over like a list.
Note that this means that the WHERE
clause z.p > x.p
above needs to be inside the quantified path pattern.
The following would throw a syntax error because it is treating z
and p
as singletons:
MATCH ((x)-[r]->(z)){2,3} WHERE z.p > x.p
It is possible, however, to position the WHERE
clause outside of the node pattern:
MATCH ((x)-[r]->(z) WHERE z.p > x.p){2,3}
Matching
The mechanics of matching a quantified path pattern against paths is best explained with an example. For the example, the following simple graph will be used:
First, consider the following simple path pattern:
(x:A)-[:R]->(z:B WHERE z.h > 2)
This matches three different paths in the graph above.
The resulting bindings for x
and z
for each match are the following (the captions n1
etc indicate the identity of the nodes in the diagram above):
x | z |
---|---|
|
|
|
|
|
|
If the quantifier {2}
is affixed to the simple path pattern, the result is the following quantified path pattern:
((x:A)-[:R]->(z:B WHERE z.h > 2)){2}
This is equivalent to chaining together two iterations of the pattern, where the rightmost node of the first iteration is merged with the leftmost node of the second one. (See node pattern pairs for more details.)
(x:A)-[:R]->(z:B WHERE z.h > 2) (x:A)-[:R]->(z:B WHERE z.h > 2)
To avoid introducing equijoins between the two instances of x
, and between the two instances of z
, the variables are replaced with a set of fresh variables inside each iteration:
(x1:A)-[:R]->(z1:B WHERE z1.h > 2) (x2:A)-[:R]->(z2:B WHERE z2.h > 2)
Then the node variables in adjoining node patterns are merged:
(x1:A)-[:R]->({z1,x2}:A&B WHERE z1.h > 2)-[:R]->(z2:B WHERE z2.h > 2)
The fact that variables x2
and z1
are bound to matches of the same node pattern is represented with the notation {z1,x2}
.
Outside of the pattern, the variables x
and z
will be group variables that contain lists of nodes.
Consider the quantified path pattern in the following query:
MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){2}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h
This yields the following results:
x_h | z_h |
---|---|
|
|
|
|
Now the quantifier is changed to match lengths from one to five:
MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){1,5}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h
Compared to the fixed-length quantifier {2}
, this also matches paths of length one and three, but no matches exist for length greater than three:
x_h | z_h |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Quantified relationships
Rules
A quantified relationship is an abbreviated form of a quantified path pattern, with only a single relationship pattern specified.
For example, the following quantified relationship:
(:A)-[r:R]->{m,n}(:B)
It matches paths with between m
and n
relationships of type R
pointing left-to-right, starting at a node of type A
, ending at a node of type B
, and matching any nodes in-between.
It is equivalent to the following quantified path pattern:
(:A) (()-[r:R]->()){m,n} (:B)
With the expanded form of a quantified path pattern, it is possible to add predicates inside. Conversely, if the only predicate is a relationship type expression, query syntax can be more concise using a quantified relationship.
Note that, unlike a quantified path pattern, a quantified relationship must always have a node pattern on each side.
Examples
Matches one or more relationships with type R
and of any direction, and any nodes connecting those relationships:
()-[:R]-+()
Matches paths consisting of two inbound subpaths, one with relationships of type R
and one with relationships of type S
, meeting at a node with label A
:
()-[:R]->+(:A)<-[:S]-+()
Matches paths with any nodes and with one or more relationships of any direction and any type:
()--+()
Matches paths starting with nodes labeled A
and ending with nodes labeled B
, that traverse between two and three relationships of type R
, matching any intermediate nodes:
(:A)-[r:R]->{2,3}(:B)
Quantifiers
The quantifiers here only refer to those used in quantified path patterns and quantified relationships.
Syntax
quantifier ::=
"*" | "+" | fixedQuantifier | generalQuantifier
fixedQuantifier ::= "{" unsignedDecimalInteger "}"
generalQuantifier ::= "{" lowerBound "," upperBound "}"
lowerBound ::= unsignedDecimalInteger
upperBound ::= unsignedDecimalInteger
unsignedDecimalInteger ::= [0-9]+
The |
Rules
The absence of an upper bound in the general quantifier syntax means there is no upper bound. The following table shows variants of the quantifier syntax and their canonical form:
Variant | Canonical form | Description |
---|---|---|
|
|
Between m and n iterations. |
|
|
1 or more iterations. |
|
|
0 or more iterations. |
|
|
Exactly n iterations. |
|
|
m or more iterations. |
|
|
Between 0 and n iterations. |
|
|
0 or more iterations. |
Note that a quantified path pattern with the quantifier {1}
is not equivalent to a fixed-length path pattern.
Although the resulting quantified path pattern will match on the same paths the fixed-length path contained in it would without the quantifier, the presence of the quantifier means that all variables within the path pattern will be exposed as group variables.
Variable-length relationships
Prior to the introduction of the syntax for quantified path patterns and quantified relationships in Neo4j 5.9, the only way in Cypher to match paths of variable length was with a variable-length relationship. This syntax is still available, but it is not GQL conformant. It is equivalent to the syntax for quantified relationships, with the following differences:
-
Position and syntax of quantifier.
-
Semantics of the asterisk symbol.
-
Type expressions are limited to the disjunction operator.
-
The WHERE clause is not allowed.
Syntax
varLengthRelationship ::=
"<-[" varLengthFiller "]-"
| "-[" varLengthFiller "]->"
| "-[" varLengthFiller "]-"
varLengthFiller ::= [ relationshipVariable ] [ varLengthTypeExpression ]
[ varLengthQuantifier ] [ propertyKeyValueExpression ]
varLengthTypeExpression ::= ":" varLengthTypeTerm
varLengthTypeTerm ::=
typeIdentifier
| varLengthTypeTerm "|" varLengthTypeTerm
varLengthQuantifier ::= varLengthVariable | varLengthFixed
varLengthVariable ::= "*" [ [ lowerBound ] ".." [ upperBound ] ]
varLengthFixed ::= "*" fixedBound
fixedBound ::= unsignedDecimalInteger
lowerBound ::= unsignedDecimalInteger
upperBound ::= unsignedDecimalInteger
unsignedDecimalInteger ::= [0-9]+
The |
For rules on valid relationship variable names, see Cypher naming rules.
Rules
The following table shows variants of the variable-length quantifier syntax and their equivalent quantifier form (the form used by quantified path patterns):
Variant | Description | Quantified relationship equivalent | Quantified path pattern equivalent |
---|---|---|---|
|
1 or more iterations. |
|
|
|
Exactly n iterations. |
|
|
|
Between m and n iterations. |
|
|
|
m or more iterations. |
|
|
|
0 or more iterations. |
|
|
|
Between 1 and n iterations. |
|
|
|
Between 0 and n iterations. |
|
|
Note that *
used here on its own is not the same as the Kleene star (an operator that represents zero or more repetitions), as the Kleene star has a lower bound of zero.
The above table can be used to translate the quantifier used in variable-length relationships.
The rules given for quantified path patterns would apply to the translation.
This table shows some examples:
Variable-length relationship | Equivalent quantified path pattern |
---|---|
|
|
|
|
|
|
Equijoins
The variable of a variable-length relationship can be used in subsequent patterns to refer to the list of relationships the variable is bound to. This is the same as the equijoin for variables bound to single nodes or relationships.
This section uses the following graph:
To recreate the graph, run the following query against an empty Neo4j database:
CREATE ({name: 'Filipa'})-[:KNOWS]->({name:'Anders'})-[:KNOWS]->
({name:'Dilshad'})
In the following query, the node variables will be bound to the same nodes:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*1..2]-(d)
RETURN a = c, b = d, size(r)
a = c | b = d | size(r) |
---|---|---|
|
|
|
|
|
|
Rows: 2 |
The list of relationships keeps its order.
This means that in the following query, where the direction of the variable-length relationship in the second MATCH
is switched, the equijoin will only match once, when there is a single relationship:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)-[r*1..2]->(d)
RETURN a = c, b = d, size(r)
a = c | b = d | size(r) |
---|---|---|
|
|
|
Rows: 1 |
The variable r
can be reversed in order like any list, and made to match the switch in relationship pattern direction:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
WITH a, b, reverse(r) AS s
MATCH (c)-[s*1..2]->(d)
RETURN a = d, b = c, size(s)
a = d | b = c | size() |
---|---|---|
|
|
|
|
|
|
Rows: 2 |
Changing the bounds on subsequent MATCH
statements will mean that only the overlapping lengths of the quantifier bounds will produce results:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*2..3]-(d)
RETURN a = c, b = d, size(r)
a = c | b = d | size(r) |
---|---|---|
|
|
|
Rows: 1 |
Because Cypher only allows paths to traverse a relationship once (see relationship uniqueness), repeating a variable-length relationship in the same graph pattern will yield no results.
For example, this MATCH
clause will never pass on any intermediate results to subsequent clauses:
MATCH (x)-[r*1..2]->(y)-[r*1..2]->(z)
Attempting to repeat a variable-length relationship in a single relationship pattern will raise an error.
For example, the following pattern raises an error because the variable r
appears in both a variable-length relationship and a fixed-length relationship:
MATCH (x)-[r*1..2]->(y)-[r]->(z)
Examples
The following pattern matches paths starting with nodes labeled A
and ending with nodes labeled B
, that traverse between two and three relationships of type R
:
(:A)-[r:R*2..3]->(:B)
The following traverses relationships of type R
or S
or T
exactly five times:
()-[r:R|S|T*5]->()
The following traverses any relationship between zero and five times, with the path beginning at nodes labeled A
and ending at nodes labeled B
.
Note that this will also return all nodes that have both labels A
and B
for the case where zero relationships are traversed:
(:A)-[*0..5]->(:B)
If the lower bound is removed, it will default to one, and will no longer return paths of length zero, i.e. single nodes:
(:A)-[*..5]->(:B)
The following pattern traverses one or more relationships of any direction that have property p
= $param
:
()-[* {p: $param}]-()
Shortest paths
Path selectors are modes that are specified at the path pattern level. There are five types of path selectors:
-
SHORTEST k
-
ALL SHORTEST
-
SHORTEST k GROUPS
-
ANY k
-
ALL
The first three are shortest path selectors, and they specify which shortest paths should be returned, with the exact method of selection depending on the path selector type and the value of k
.
See path patterns for details on where in the path pattern to include the path selector.
Syntax
shortestPathSelector ::=
{ allPathSearch | anyPathSearch | shortestPathSearch }
shortestPathSearch ::=
{ allShortest | anyShortest | countedShortestPaths |
countedShortestGroups }
allShortest ::= "ALL SHORTEST" [ pathOrPaths ]
anyShortest ::= "ANY SHORTEST" [ pathOrPaths ]
countedShortestPaths ::=
"SHORTEST" numberOfPaths [ pathOrPaths ]
countedShortestGroups ::=
"SHORTEST" [ numberOfGroups ] [ pathOrPaths ] { "GROUP" | "GROUPS" }
allPathSearch ::= "ALL" [ pathOrPaths ]
anyPathSearch ::= "ANY" [ numberOfPaths ] [ pathOrPaths ]
pathOrPaths ::= { "PATH" | "PATHS" }
numberOfPaths ::= unsignedDecimalInteger
numberOfGroups ::= unsignedDecimalInteger
The |
Rules
Selective path selectors
The following path selector types are selective path selectors:
-
SHORTEST k
-
SHORTEST k GROUPS
-
ALL SHORTEST
-
ANY k
This means that they reduce the number of matches produced by the path pattern.
ALL
returns all matches produced, and so is not selective.
Order of path selection
When a path selector is specified, the following steps are followed to find solutions to the path pattern:
-
All paths matching the path pattern are found.
-
Any paths not satisfying the predicates contained in the path pattern are removed. This is a pre-filter.
-
Paths are selected according to the path selector specified.
-
Any paths not satisfying the predicates in the graph pattern
WHERE
clause are removed. This is a post-filter.
// node pattern WHERE clause
MATCH p = SHORTEST 2 (a WHERE a.p < 42)--+()
// relationship pattern property key-value expression
MATCH p = SHORTEST 2 (a)-[{p: 42}]-+()
// label and type expressions
MATCH p = SHORTEST 2 (:A)-[:R]-+(:B)
// quantified path pattern
MATCH SHORTEST 2 ((a)--(b) WHERE a.p < b.p)+
// parenthesized path pattern expression
// note the position of the parentheses!
MATCH SHORTEST 2 ( p = ()--+() WHERE any(n IN nodes(p) WHERE n.p < 42) )
// graph pattern WHERE clause
MATCH p = SHORTEST 2 (a)--+()
WHERE all(n IN nodes(p) WHERE n.p < 42)
Partitions
If there are multiple start and end nodes matching a path pattern, then, when a selective path selector is specified, the paths matched by the path pattern are partitioned by distinct pairs of start and end nodes. Within each partition, they are put in ascending order of path length. The partitioned paths are then selected according to the specified path selector as follows:
Selective path selector | Partitioning |
---|---|
|
First
|
|
The paths are further grouped by path length, and those groups are put in ascending order by path length.
Then the paths from the first |
|
All paths tied for the shortest.
Equivalent to |
|
Any |
|
All paths are returned. This is the same as not specifying any path selector. |
For a selective path selector, if k
is greater than N
, the number of paths matching the path pattern, then all N
paths are returned.
Only one path pattern allowed
When a selective shortest path selector is specified for a path pattern, it must be the only path pattern in the graph pattern.
MATCH p = SHORTEST 2 (:A)--+(a)--+(:B), q = ANY 2 (a)-->{,2}(:C)
RETURN p, q
MATCH ALL SHORTEST (n:A) (()-->(:B))+, (:X)--(n)--(:Y)
RETURN n
MATCH p = SHORTEST 2 (:A)--+(a)--+(:B)
MATCH q = ANY 2 (a)-->{,2}(:C)
RETURN p, q
MATCH ALL SHORTEST (n:A) (()-->(:B))+
MATCH (:X)--(n)--(:Y)
RETURN n
MATCH ALL (n:A) (()-->(:B))+, (:X)--(n)--(:Y)
RETURN n
PATH
/PATHS
The keywords PATH
and PATHS
are optional and can be added to the end of the path selector (but before GROUP
or GROUPS
).
Including either of them does not change path selection.
For example, the following two MATCH
clauses are equivalent:
MATCH ALL SHORTEST PATHS (n:A) (()-->(:B))+
MATCH ALL SHORTEST (n:A) (()-->(:B))+
Examples
Return a single shortest path for each distinct pair of nodes matching (:A)
and (:B)
:
MATCH SHORTEST 1 (:A)-[:R]->{0,10}(:B)
Return any two paths connecting each distinct pair of nodes matching (:A)
and (:B)
:
MATCH p = ANY 2 (:A)-[:R]->{0,10}(:B)
Return all paths equal to the shortest path length for each distinct pair of nodes matching (:A)
and (:B)
:
MATCH ALL SHORTEST (:A)-[:R]->{0,10}(:B)
Return all paths equal to the two shortest path lengths for each distinct pair of nodes matching (:A)
and (:B)
:
MATCH SHORTEST 2 GROUPS (:A)-[:R]->{0,10}(:B)
Return a single shortest path for each distinct pair of nodes, where the path length is an even number:
MATCH SHORTEST 1 (p = ()--+() WHERE length(p) % 2 = 0)
For every single shortest path connecting each distinct pair of nodes, only return those that have path length that is an even number (so fewer results than the previous example):
MATCH p = SHORTEST 2 ()--+()
WHERE length(p) % 2 = 0
The shortestPath()
and allShortestPaths()
functions
Prior to the introduction of keyword-based specification of shortest path selection in Neo4j 5.21, the only available syntax for shortest paths were the functions shortestPath()
and allShortestPaths()
.
They are similar to SHORTEST 1
and ALL SHORTEST
, but with several differences:
-
The path pattern is passed as an argument to the functions.
-
The path pattern is limited to a single relationship pattern.
-
To return results where the first and last node in the path are the same requires a change to the configuration setting
dbms.cypher.forbid_shortestpath_common_nodes
. -
The minimum path length, also called the lower bound of the variable-length relationship pattern, should be 0 or 1.
Both functions will continue to be available, but they are not GQL conformant.
Syntax
pathSelectorFunction ::=
{ shortestPathFunction | allShortestPathsFunction }
shortestPathFunction ::=
"shortestPath(" + oneHopPathPatternExpression + ")"
allShortestPathsFunction ::=
"allShortestPaths(" + oneRelPathPatternExpression + ")"
oneRelPathPatternExpression ::=
nodePattern varLengthRelationship nodePattern
Note that it is possible to pass a fixed-length path pattern (with a single relationship) to the path selector function, but doing so would not serve any purpose in discovering a shortest path.
Rules
Restricted to variable-length relationships
The pattern in the path selector function must be a variable-length relationship and not a quantified path pattern.
shortestPath(((a)-[:R]-(b)){1,5})
shortestPath((:A)-->+(:B))
Path pattern length
There must be exactly one relationship pattern in the path pattern, and the lower bound should be 0 or 1.
shortestPath((a)-[:R*1..5]-(b))
shortestPath((a)-[:R*1..5]-(b)-->(:X))
shortestPath((a)-[:R*2..5]-(b))
shortestPath((:A))
allShortestPaths((a:A)-[:S*]->(:B), (a)-[:R*1..3]->(:C))
Pre and post filtering
If the MATCH
clause of the shortestPath()
function includes a WHERE
clause, this condition will act as a pre-filter: paths satisfying the WHERE
clause are first found, and from those paths the shortest path is selected.
MATCH p = shortestPath(()-[*]-())
WHERE all(n in nodes(p) WHERE n.p < 42)
MATCH p = shortestPath(()-[*]-())
WHERE all(r in relationships(p) WHERE r.p < 42)
These contrast with WHERE
clauses that are not part of the same MATCH
clause.
MATCH p = shortestPath(()-[*]-())
WITH nodes(p) AS N
WHERE all(n in N WHERE n.p < 42)
Graph patterns
A graph pattern is a comma separated list of one or more path patterns.
It is the top level construct provided to MATCH
.
Syntax
graphPattern ::=
pathPattern [ "," pathPattern ]* [ graphPatternWhereClause ]
graphPatternWhereClause ::= "WHERE" booleanExpression
Rules
The rules for path patterns apply to each constituent path pattern of a graph pattern.
Variable references
If a variable is declared inside a quantified path pattern, then it can be treated as a singleton only from within the quantified path pattern it was declared in. Outside of that quantified path pattern, it must be treated as a group variable.
((n)-[r]->(m WHERE r.p = m.q))+
(n)-[r]->+(m WHERE all(rel in r WHERE rel.q > m.q))
(n)-[r]->+(m WHERE r.p = m.q)
Relationship uniqueness
A relationship can only be traversed once in a given match for a graph pattern. The same restriction doesn’t hold for nodes, which may be re-traversed any number of times in a match.
Equijoin
If a node variable is declared more than once in a path pattern, it is expressing an equijoin.
This is an operation that requires that each node pattern with the same node variable be bound to the same node.
For example, the following pattern refers to the same node twice with the variable a
, forming a cycle:
(a)-->(b)-->(c)-->(a)
The following pattern refers to the same node with variable b
in different path patterns of the same graph pattern, forming a "T" shaped pattern:
(a)-->(b)-->(c), (b)-->(e)
Equijoins can only be made using variables outside of quantified path patterns. The following would not be a valid equijoin:
(a)-->(b)-->(c), ((b)-->(e))+ (:X)
If no equijoin exists between path patterns in a graph pattern, then a Cartesian join is formed from the sets of matches for each path pattern. An equijoin can be expressed between relationship patterns by declaring a relationship variable multiple times. However, as relationships can only be traversed once in a given match, no solutions would be returned.
Examples
The WHERE
clause can refer to variables inside and outside of quantified path patterns:
(a)-->(b)-->(c), (b) ((d)-->(e))+ WHERE any(n in d WHERE n.p = a.p)
An equijoin can be formed to match "H" shaped graphs:
(:A)-->(x)--(:B), (x)-[:R]->+(y), (:C)-->(y)-->(:D)
With no variables in common, this graph pattern will result in a Cartesian join between the sets of matches for the two path patterns:
(a)-->(b)-->(c), ((d)-->(e))+
Multiple equijoins can be formed between path patterns:
(:X)-->(a:A)-[!:R]->+(b:B)-->(:Y), (a)-[:R]->+(b)
Variables declared in a previous MATCH
can be referenced inside of a quantified path pattern:
MATCH (n {p = 'ABC'})
MATCH (n)-->(m:A)-->(:B), (m) (()-[r WHERE r.p <> n.p]->())+ (:C)
The repetition of a relationship variable in the following yields no solutions due to Cypher enforcing relationship uniqueness within a match for a graph pattern:
MATCH ()-[r]->()-->(), ()-[r]-()
Node pattern pairs
It is not valid syntax to write a pair of node patterns next to each other. For example, all of the following would raise a syntax error:
(a:A)(b:B)
(a:A)(b:B)<-[r:R]-(c:C)
(a:A)<--(b:B)(c:C)-->(d:C)
However, the placing of pairs of node patterns next to each other is valid where it results indirectly from the expansion of quantified path patterns.
Iterations of quantified path patterns
When a quantified path pattern is expanded, the fixed path pattern contained in its parentheses is repeated and chained. This results in pairs of node patterns sitting next to each other. Take the following quantified path pattern as an example:
((x:X)<--(y:Y)){3}
This is expanded by repeating the fixed path pattern (x:X)←-(y:Y)
three times, with indices on the variables to show that no equijoin is implied (see equijoins for more information):
(x1:X)<--(y1:Y)(x2:X)<--(y2:Y)(x3:X)<--(y3:Y)
The result is that two pairs of node patterns end up adjoining each other, (y1:Y)(x2:X)
and (y2:Y)(x3:X)
.
During the matching process, each pair of node patterns will match the same nodes, and those nodes will satisfy the conjunction of the predicates in the node patterns.
For example, in the first pair both y1
and x2
will bind to the same node, and that node must have labels X
and Y
.
This expansion and binding is depicted in the following diagram:
Simple path patterns and quantified path patterns
Pairs of node patterns are also generated when a simple path pattern is placed next to a quantified path. For example, consider the following path pattern:
(:A)-[:R]->(:B) ((:X)<--(:Y)){1,2}
After expanding the iterations of the quantified path pattern, the right-hand node pattern (:B)
adjoins the left-hand node pattern (:X)
.
The result will match the same paths as the union of matches of the following two path patterns:
(:A)-[:R]->(:B&X)<--(:Y)
(:A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)
If the simple path pattern is on the right of the quantified path pattern, its leftmost node (:A)
adjoins the rightmost node (:Y)
of the last iteration of the quantified path pattern.
For example, the following:
((:X)<--(:Y)){1,2} (:A)-[:R]->(:B)
will match the same paths as the union of the following two path patterns:
(:X)<--(:Y&A)-[:R]->(:B)
(:X)<--(:Y&X)<--(:Y&A)-[:R]->(:B)
Pairs of quantified path patterns
When two quantified path patterns adjoin, the rightmost node of the last iteration of the first pattern is merged with the leftmost node of the first iteration of the second pattern. For example, the following adjoining patterns:
((:A)-[:R]->(:B)){2} ((:X)<--(:Y)){1,2}
will match the same set of paths as the union of the paths matched by these two path patterns:
(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y)
(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)
Zero iterations
If the quantifier allows for zero iterations of a pattern, for example {0,3}
, then the 0th iteration of that pattern results in the node patterns on either side pairing up.
For example, the following path pattern:
(:X) ((a:A)-[:R]->(b:B)){0,1} (:Y)
will match the same set of paths as the union of the paths matched by the following two path patterns:
(:X&Y)
(:X&A)-[:R]->(:B&Y)