Path patterns and graph patterns
Path patterns
A path pattern is the top level pattern that is matched against paths in a graph.
Syntax
pathPattern ::= [ pathVariableDeclaration ] [ pathPatternPrefix ] pathPatternExpression
pathVariableDeclaration ::= pathVariable "="
pathPatternPrefix ::= pathModePrefix | shortestPathSelector
pathPatternExpression ::= { parenthesizedPathPatternExpression | pathPatternPhrase }
parenthesizedPathPatternExpression ::=
"("
[ subpathVariableDeclaration ]
pathPatternExpression
[ parenthesizedPathPatternWhereClause ]
")"
subpathVariableDeclaration ::= pathVariable "="
pathPatternPhrase ::= [{ simplePathPattern | quantifiedPathPattern }]+
simplePathPattern ::= nodePattern
[ { relationshipPattern | quantifiedRelationship } nodePattern ]*
parenthesizedPathPatternWhereClause ::= "WHERE" booleanExpression
See the related section for each of the syntax rules:
|
|
|
|
|
|
|
|
|
Rules
The minimum number of elements in the path pattern must be greater than zero. For example, a path pattern that is a quantified path pattern with a quantifier that has a lower bound of zero is not allowed:
((n)-[r]->(m)){0,10}
A path pattern must always begin and end with a node pattern.
(n)-[r]->(m)-[s]-
A path pattern may be composed of a concatenation of simple and quantified path patterns. Two simple path patterns, however, may not be placed next to each other.
(a)<-[s]-(b) (c)-[t]->(d)
When a path pattern is matched to paths in a graph, nodes can be revisited but relationships cannot.
A subpath variable (a path variable declared inside a parenthesized path pattern expression) can only be used if a shortest paths mode is also specified.
MATCH SHORTEST 1 (p = (a)-[r]->+(b) WHERE length(p) > 3)
MATCH (:A)-[:S]->(:B) (p = ((a)-[r]->(b))+)
See graph patterns for rules on declaring variables multiple times.
Examples
A single node pattern is allowed as it has at least one element:
(n)
A simple path pattern with more than one element:
(a:A)<-[{p: 30}]-(b)-[t WHERE t.q > 0]->(c:C)
A quantified path pattern can have a lower bound of zero in its quantifier as long as it abuts other patterns that have at least one element:
(:A) ((:X)-[:R]-()){0,10} (:B)
A quantified relationship can also have a lower bound of zero as long as the overall path pattern has at least one element:
(:A)-[:R]->{0,10}(:B)
A concatenation of simple and quantified path patterns:
(a)<-[s]-(b)-[t]->(c) ((n)-[r]->(m)){0,10} (:X)
Referencing non-local node variable in a simple path pattern:
(a)<-[s:X WHERE a.p = s.p]-(b)
Referencing a non-local relationship variable within a quantified path pattern:
(:A) ((a)<-[s:X WHERE a.p = s.p]-(b)){,5}
A variable that was introduced in a previous clause can be referenced as long as that variable was defined outside of a quantified path pattern:
MATCH (n)
MATCH ()-[r WHERE r.q = n.q]-() (()<-[s:X WHERE n.p = s.p]-()){2,3}
Paths matched by the path pattern can be assigned to a variable:
MATCH p = ()-[r WHERE r.q = n.q]-()
Graph patterns
A graph pattern is a comma separated list of one or more path patterns.
It is the top level construct provided to MATCH.
Whether or not a graph pattern can match a relationship more than once is governed by the selected match mode.
Syntax
graphPattern ::=
pathPattern [ "," pathPattern ]* [ graphPatternWhereClause ]
graphPatternWhereClause ::= "WHERE" booleanExpression
Rules
The rules for path patterns apply to each constituent path pattern of a graph pattern.
Variable references
If a variable is declared inside a quantified path pattern, then it can be treated as a singleton only from within the quantified path pattern it was declared in. Outside of that quantified path pattern, it must be treated as a group variable.
((n)-[r]->(m WHERE r.p = m.q))+
(n)-[r]->+(m WHERE all(rel in r WHERE rel.q > m.q))
(n)-[r]->+(m WHERE r.p = m.q)
Equijoin
If a node variable is declared more than once in a path pattern, it is expressing an equijoin.
This is an operation that requires that each node pattern with the same node variable be bound to the same node.
For example, the following pattern refers to the same node twice with the variable a, forming a cycle:
(a)-->(b)-->(c)-->(a)
The following pattern refers to the same node with variable b in different path patterns of the same graph pattern, forming a "T" shaped pattern:
(a)-->(b)-->(c), (b)-->(e)
Equijoins can only be made using variables outside of quantified path patterns. The following would not be a valid equijoin:
(a)-->(b)-->(c), ((b)-->(e))+ (:X)
If no equijoin exists between path patterns in a graph pattern, then a Cartesian join is formed from the sets of matches for each path pattern. An equijoin can be expressed between relationship patterns by declaring a relationship variable multiple times. However, as relationships can only be traversed once in a given match, no solutions would be returned.
Examples
The WHERE clause can refer to variables inside and outside of quantified path patterns:
(a)-->(b)-->(c), (b) ((d)-->(e))+ WHERE any(n in d WHERE n.p = a.p)
An equijoin can be formed to match "H" shaped graphs:
(:A)-->(x)--(:B), (x)-[:R]->+(y), (:C)-->(y)-->(:D)
With no variables in common, this graph pattern will result in a Cartesian join between the sets of matches for the two path patterns:
(a)-->(b)-->(c), ((d)-->(e))+
Multiple equijoins can be formed between path patterns:
(:X)-->(a:A)-[!:R]->+(b:B)-->(:Y), (a)-[:R]->+(b)
Variables declared in a previous MATCH can be referenced inside of a quantified path pattern:
MATCH (n {p = 'ABC'})
MATCH (n)-->(m:A)-->(:B), (m) (()-[r WHERE r.p <> n.p]->())+ (:C)
The repetition of a relationship variable in the following yields no solutions due to Cypher® enforcing relationship uniqueness within a match for a graph pattern:
MATCH ()-[r]->()-->(), ()-[r]-()