Variable length paths
Quantified path patterns
A quantified path pattern represents a path pattern repeated a number of times in a given range. It is composed of a path pattern, representing the path section to be repeated, followed by a quantifier, constraining the number of repetitions between a lower bound and an upper bound.
For information about an alternative version of patterns for matching paths of variable length, see variable-length relationships.
Syntax
quantifiedPathPattern ::=
parenthesizedPathPatternExpression quantifier
fixedPath ::= nodePattern [ relationshipPattern nodePattern ]+
Rules
Minimum pattern length
The path pattern being quantified must have a length greater than zero. In other words, it must contain at least one relationship. A single node pattern cannot be quantified.
((x:A)){2,4}
Nesting of quantified path patterns
The nesting of quantified path patterns is not allowed. For example, the following nesting of a quantified relationship in a quantified path pattern is not allowed:
(:A) (()-[:R]->+()){2,3} (:B)
A quantified path pattern that is part of the boolean expression within a quantified path pattern would not count as nested and is permitted.
MATCH ((n:A)-[:R]->({p: 30}) WHERE EXISTS { (n)-->+(:X) }){2,3}
Group variables
Variables introduced inside of a quantified path pattern are said to be exposed as group variables outside of the definition of the pattern. As a group variable, they will be bound to either a list of nodes or a list of relationships. By contrast, variables can be treated as singletons inside the quantified path pattern where they are declared. The difference can be seen in the following query:
MATCH ((x)-[r]->(z WHERE z.p > x.p)){2,3}
RETURN [n in x | n.p] AS x_p
In the boolean expression z.p > x.p both z and x are singletons; in the RETURN clause, x is a group variable that can be iterated over like a list.
Note that this means that the WHERE clause z.p > x.p above needs to be inside the quantified path pattern.
The following would throw a syntax error because it is treating z and p as singletons:
MATCH ((x)-[r]->(z)){2,3} WHERE z.p > x.p
It is possible, however, to position the WHERE clause outside of the node pattern:
MATCH ((x)-[r]->(z) WHERE z.p > x.p){2,3}
Matching
The mechanics of matching a quantified path pattern against paths is best explained with an example. For the example, the following simple graph will be used:
First, consider the following simple path pattern:
(x:A)-[:R]->(z:B WHERE z.h > 2)
This matches three different paths in the graph above.
The resulting bindings for x and z for each match are the following (the captions n1 etc indicate the identity of the nodes in the diagram above):
| x | z |
|---|---|
|
|
|
|
|
|
If the quantifier {2} is affixed to the simple path pattern, the result is the following quantified path pattern:
((x:A)-[:R]->(z:B WHERE z.h > 2)){2}
This is equivalent to chaining together two iterations of the pattern, where the rightmost node of the first iteration is merged with the leftmost node of the second one. (See node pattern pairs for more details.)
(x:A)-[:R]->(z:B WHERE z.h > 2) (x:A)-[:R]->(z:B WHERE z.h > 2)
To avoid introducing equijoins between the two instances of x, and between the two instances of z, the variables are replaced with a set of fresh variables inside each iteration:
(x1:A)-[:R]->(z1:B WHERE z1.h > 2) (x2:A)-[:R]->(z2:B WHERE z2.h > 2)
Then the node variables in adjoining node patterns are merged:
(x1:A)-[:R]->({z1,x2}:A&B WHERE z1.h > 2)-[:R]->(z2:B WHERE z2.h > 2)
The fact that variables x2 and z1 are bound to matches of the same node pattern is represented with the notation {z1,x2}.
Outside of the pattern, the variables x and z will be group variables that contain lists of nodes.
Consider the quantified path pattern in the following query:
MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){2}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h
This yields the following results:
| x_h | z_h |
|---|---|
|
|
|
|
Now the quantifier is changed to match lengths from one to five:
MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){1,5}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h
Compared to the fixed-length quantifier {2}, this also matches paths of length one and three, but no matches exist for length greater than three:
| x_h | z_h |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Quantified relationships
Rules
A quantified relationship is an abbreviated form of a quantified path pattern, with only a single relationship pattern specified.
For example, the following quantified relationship:
(:A)-[r:R]->{m,n}(:B)
It matches paths with between m and n relationships of type R pointing left-to-right, starting at a node of type A, ending at a node of type B, and matching any nodes in-between.
It is equivalent to the following quantified path pattern:
(:A) (()-[r:R]->()){m,n} (:B)
With the expanded form of a quantified path pattern, it is possible to add predicates inside. Conversely, if the only predicate is a relationship type expression, query syntax can be more concise using a quantified relationship.
Note that, unlike a quantified path pattern, a quantified relationship must always have a node pattern on each side.
Examples
Matches one or more relationships with type R and of any direction, and any nodes connecting those relationships:
()-[:R]-+()
Matches paths consisting of two inbound subpaths, one with relationships of type R and one with relationships of type S, meeting at a node with label A:
()-[:R]->+(:A)<-[:S]-+()
Matches paths with any nodes and with one or more relationships of any direction and any type:
()--+()
Matches paths starting with nodes labeled A and ending with nodes labeled B, that traverse between two and three relationships of type R, matching any intermediate nodes:
(:A)-[r:R]->{2,3}(:B)
Quantifiers
The quantifiers here only refer to those used in quantified path patterns and quantified relationships.
Syntax
quantifier ::=
"*" | "+" | fixedQuantifier | generalQuantifier
fixedQuantifier ::= "{" unsignedDecimalInteger "}"
generalQuantifier ::= "{" lowerBound "," upperBound "}"
lowerBound ::= unsignedDecimalInteger
upperBound ::= unsignedDecimalInteger
unsignedDecimalInteger ::= [0-9]+
|
The |
Rules
The absence of an upper bound in the general quantifier syntax means there is no upper bound. The following table shows variants of the quantifier syntax and their canonical form:
| Variant | Canonical form | Description |
|---|---|---|
|
|
Between m and n iterations. |
|
|
1 or more iterations. |
|
|
0 or more iterations. |
|
|
Exactly n iterations. |
|
|
m or more iterations. |
|
|
Between 0 and n iterations. |
|
|
0 or more iterations. |
Note that a quantified path pattern with the quantifier {1} is not equivalent to a fixed-length path pattern.
Although the resulting quantified path pattern will match on the same paths the fixed-length path contained in it would without the quantifier, the presence of the quantifier means that all variables within the path pattern will be exposed as group variables.
Variable-length relationships
Prior to the introduction of the syntax for quantified path patterns and quantified relationships, the only way in Cypher® to match paths of variable length was with a variable-length relationship. This syntax is still available, but it is not GQL conformant. It is equivalent to the syntax for quantified relationships, with the following differences:
-
Position and syntax of quantifier.
-
Semantics of the asterisk symbol.
-
Type expressions are limited to the disjunction operator.
-
The WHERE clause is not allowed.
Syntax
varLengthRelationship ::=
"<-[" varLengthFiller "]-"
| "-[" varLengthFiller "]->"
| "-[" varLengthFiller "]-"
varLengthFiller ::= [ relationshipVariable ] [ varLengthTypeExpression ]
[ varLengthQuantifier ] [ propertyKeyValueExpression ]
varLengthTypeExpression ::= ":" varLengthTypeTerm
varLengthTypeTerm ::=
typeIdentifier
| varLengthTypeTerm "|" varLengthTypeTerm
varLengthQuantifier ::= varLengthVariable | varLengthFixed
varLengthVariable ::= "*" [ [ lowerBound ] ".." [ upperBound ] ]
varLengthFixed ::= "*" fixedBound
fixedBound ::= unsignedDecimalInteger
lowerBound ::= unsignedDecimalInteger
upperBound ::= unsignedDecimalInteger
unsignedDecimalInteger ::= [0-9]+
|
The |
For rules on valid relationship variable names, see Cypher naming rules.
Rules
The following table shows variants of the variable-length quantifier syntax and their equivalent quantifier form (the form used by quantified path patterns):
| Variant | Description | Quantified relationship equivalent | Quantified path pattern equivalent |
|---|---|---|---|
|
1 or more iterations. |
|
|
|
Exactly n iterations. |
|
|
|
Between m and n iterations. |
|
|
|
m or more iterations. |
|
|
|
0 or more iterations. |
|
|
|
Between 1 and n iterations. |
|
|
|
Between 0 and n iterations. |
|
|
Note that * used here on its own is not the same as the Kleene star (an operator that represents zero or more repetitions), as the Kleene star has a lower bound of zero.
The above table can be used to translate the quantifier used in variable-length relationships.
The rules given for quantified path patterns would apply to the translation.
This table shows some examples:
| Variable-length relationship | Equivalent quantified path pattern |
|---|---|
|
|
|
|
|
|
Equijoins
The variable of a variable-length relationship can be used in subsequent patterns to refer to the list of relationships the variable is bound to. This is the same as the equijoin for variables bound to single nodes or relationships.
This section uses the following graph:
To recreate the graph, run the following query against an empty Neo4j database:
CREATE ({name: 'Filipa'})-[:KNOWS]->({name:'Anders'})-[:KNOWS]->
({name:'Dilshad'})
In the following query, the node variables will be bound to the same nodes:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*1..2]-(d)
RETURN a = c, b = d, size(r)
| a = c | b = d | size(r) |
|---|---|---|
|
|
|
|
|
|
Rows: 2 |
||
The list of relationships keeps its order.
This means that in the following query, where the direction of the variable-length relationship in the second MATCH is switched, the equijoin will only match once, when there is a single relationship:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)-[r*1..2]->(d)
RETURN a = c, b = d, size(r)
| a = c | b = d | size(r) |
|---|---|---|
|
|
|
Rows: 1 |
||
The variable r can be reversed in order like any list, and made to match the switch in relationship pattern direction:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
WITH a, b, reverse(r) AS s
MATCH (c)-[s*1..2]->(d)
RETURN a = d, b = c, size(s)
| a = d | b = c | size() |
|---|---|---|
|
|
|
|
|
|
Rows: 2 |
||
Changing the bounds on subsequent MATCH statements will mean that only the overlapping lengths of the quantifier bounds will produce results:
MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*2..3]-(d)
RETURN a = c, b = d, size(r)
| a = c | b = d | size(r) |
|---|---|---|
|
|
|
Rows: 1 |
||
By default, only allows paths to traverse a relationship once, repeating a variable-length relationship in the same graph pattern will yield no results.
(The same is not true if a query uses REPEATABLE ELEMENTS).
For example, this MATCH clause will never pass on any intermediate results to subsequent clauses:
MATCH (x)-[r*1..2]->(y)-[r*1..2]->(z)
Attempting to repeat a variable-length relationship in a single relationship pattern will raise an error.
For example, the following pattern raises an error because the variable r appears in both a variable-length relationship and a fixed-length relationship:
MATCH (x)-[r*1..2]->(y)-[r]->(z)
Examples
The following pattern matches paths starting with nodes labeled A and ending with nodes labeled B, that traverse between two and three relationships of type R:
(:A)-[r:R*2..3]->(:B)
The following traverses relationships of type R or S or T exactly five times:
()-[r:R|S|T*5]->()
The following traverses any relationship between zero and five times, with the path beginning at nodes labeled A and ending at nodes labeled B.
Note that this will also return all nodes that have both labels A and B for the case where zero relationships are traversed:
(:A)-[*0..5]->(:B)
If the lower bound is removed, it will default to one, and will no longer return paths of length zero, i.e. single nodes:
(:A)-[*..5]->(:B)
The following pattern traverses one or more relationships of any direction that have property p = $param:
()-[* {p: $param}]-()
Node pattern pairs
It is not valid syntax to write a pair of node patterns next to each other. For example, all of the following would raise a syntax error:
(a:A)(b:B)
(a:A)(b:B)<-[r:R]-(c:C)
(a:A)<--(b:B)(c:C)-->(d:C)
However, the placing of pairs of node patterns next to each other is valid where it results indirectly from the expansion of quantified path patterns.
Iterations of quantified path patterns
When a quantified path pattern is expanded, the fixed path pattern contained in its parentheses is repeated and chained. This results in pairs of node patterns sitting next to each other. Take the following quantified path pattern as an example:
((x:X)<--(y:Y)){3}
This is expanded by repeating the fixed path pattern (x:X)←-(y:Y) three times, with indices on the variables to show that no equijoin is implied (see equijoins for more information):
(x1:X)<--(y1:Y)(x2:X)<--(y2:Y)(x3:X)<--(y3:Y)
The result is that two pairs of node patterns end up adjoining each other, (y1:Y)(x2:X) and (y2:Y)(x3:X).
During the matching process, each pair of node patterns will match the same nodes, and those nodes will satisfy the conjunction of the predicates in the node patterns.
For example, in the first pair both y1 and x2 will bind to the same node, and that node must have labels X and Y.
This expansion and binding is depicted in the following diagram:
Simple path patterns and quantified path patterns
Pairs of node patterns are also generated when a simple path pattern is placed next to a quantified path. For example, consider the following path pattern:
(:A)-[:R]->(:B) ((:X)<--(:Y)){1,2}
After expanding the iterations of the quantified path pattern, the right-hand node pattern (:B) adjoins the left-hand node pattern (:X).
The result will match the same paths as the union of matches of the following two path patterns:
(:A)-[:R]->(:B&X)<--(:Y)
(:A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)
If the simple path pattern is on the right of the quantified path pattern, its leftmost node (:A) adjoins the rightmost node (:Y) of the last iteration of the quantified path pattern.
For example, the following:
((:X)<--(:Y)){1,2} (:A)-[:R]->(:B)
will match the same paths as the union of the following two path patterns:
(:X)<--(:Y&A)-[:R]->(:B)
(:X)<--(:Y&X)<--(:Y&A)-[:R]->(:B)
Pairs of quantified path patterns
When two quantified path patterns adjoin, the rightmost node of the last iteration of the first pattern is merged with the leftmost node of the first iteration of the second pattern. For example, the following adjoining patterns:
((:A)-[:R]->(:B)){2} ((:X)<--(:Y)){1,2}
will match the same set of paths as the union of the paths matched by these two path patterns:
(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y)
(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)
Zero iterations
If the quantifier allows for zero iterations of a pattern, for example {0,3}, then the 0th iteration of that pattern results in the node patterns on either side pairing up.
For example, the following path pattern:
(:X) ((a:A)-[:R]->(b:B)){0,1} (:Y)
will match the same set of paths as the union of the paths matched by the following two path patterns:
(:X&Y)
(:X&A)-[:R]->(:B&Y)