Primer
This section contains a primer covering some fundamental features of graph pattern matching with Cypher® queries.
Example graph
The example graph used in this tutorial is a model of train Stations
, and different train services with Stops
that call at the Stations
.
To recreate the graph, run the following query against an empty Neo4j database:
CREATE (n1:Station {name: 'Denmark Hill'}),
(n5:Station {name: 'Battersea Park'}),
(n6:Station {name: 'Wandsworth Road'}),
(n15:Station {name: 'Clapham High Street'}),
(n16:Station {name: 'Peckham Rye'}),
(n17:Station {name: 'Brixton'}),
(n14:Station {name: 'London Victoria'}),
(n18:Station {name: 'Clapham Junction'}),
(p10:Stop {departs: time('22:37'), arrives: time('22:36')}),
(p0:Stop {departs: time('22:41'), arrives: time('22:41')}),
(p2:Stop {departs: time('22:43'), arrives: time('22:43')}),
(p17:Stop {arrives: time('22:50'), departs: time('22:50')}),
(p18:Stop {arrives: time('22:46'), departs: time('22:46')}),
(p19:Stop {departs: time('22:33'), arrives: time('22:31')}),
(p21:Stop {arrives: time('22:55')}),
(p20:Stop {departs: time('22:44'), arrives: time('22:43')}),
(p22:Stop {arrives: time('22:55')}),
(p23:Stop {arrives: time('22:48')}),
(n15)-[:LINK {distance: 1.96}]->(n1)-[:LINK {distance: 0.86}]->(n16),
(n15)-[:LINK {distance: 0.39}]->(n6)<-[:LINK {distance: 0.7}]-(n5)-[:LINK {distance: 1.24}]->(n14), (n5)-[:LINK {distance: 1.45}]->(n18),
(n14)<-[:LINK {distance: 3.18}]-(n17)-[:LINK {distance: 1.11}]->(n1),
(p2)-[:CALLS_AT]->(n6), (p17)-[:CALLS_AT]->(n5), (p19)-[:CALLS_AT]->(n16),
(p22)-[:CALLS_AT]->(n14), (p18)-[:CALLS_AT]->(n18), (p0)-[:CALLS_AT]->(n15), (p23)-[:CALLS_AT]->(n5), (p20)-[:CALLS_AT]->(n1),
(p21)-[:CALLS_AT]->(n14), (p10)-[:CALLS_AT]->(n1), (p19)-[:NEXT]->(p10)-[:NEXT]->(p0)-[:NEXT]->(p2)-[:NEXT]->(p23),
(p22)<-[:NEXT]-(p17)<-[:NEXT]-(p18), (p21)<-[:NEXT]-(p20)
Matching fixed-length paths
An empty pair of parentheses is a node pattern that will match any node. This example gets a count of all the nodes in the graph:
MATCH ()
RETURN count(*) AS numNodes
numNodes |
---|
|
Rows: 1 |
Adding a label to the node pattern will filter on nodes with that label (see label expressions).
The following query gets a count of all the nodes with the label Stop
:
MATCH (:Stop)
RETURN count(*) AS numStops
numStops |
---|
|
Rows: 1 |
Path patterns can match relationships and the nodes they connect.
The following query gets the arrival time of all trains calling at Denmark Hill
:
MATCH (s:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN s.arrives AS arrivalTime
arrivalTime |
---|
|
|
Rows: 2 |
Path patterns can include inline WHERE
clauses.
The following query gets the next calling point of the service that departs Denmark Hill
at 22:37
:
MATCH (n:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
(s:Stop WHERE s.departs = time('22:37'))-[:NEXT]->
(:Stop)-[:CALLS_AT]->(d:Station)
RETURN d.name AS nextCallingPoint
nextCallingPoint |
---|
|
Rows: 1 |
For more information, see Fixed-length patterns.
Matching variable-length paths
Variable-length paths that only traverse relationships with a specified type can be matched with quantified relationships.
Any variable declared in the relationship pattern will return a list of the relationships traversed.
The following query returns the total distance traveled via all LINK
s connecting the stations Peckham Rye
and Clapham Junction
:
MATCH (:Station {name: 'Peckham Rye'})-[link:LINK]-+
(:Station {name: 'Clapham Junction'})
RETURN reduce(acc = 0.0, l IN link | round(acc + l.distance, 2)) AS
totalDistance
totalDistance |
---|
|
|
Rows: 2 |
-[:LINK]-+ is a quantified relationship. It is composed of a relationship pattern -[:LINK]- that matches relationships going in either direction, and a quantifier + that means it will match one or more relationships. As no node patterns are included with quantified relationships, they will match any intermediate nodes.
|
Variable-length paths can also be matched with quantified path patterns, which allow both WHERE
clauses and accessing the nodes traversed by the path.
The following query returns a list of calling points on routes from Peckham Rye
to London Victoria
, where no distance between stations is greater than two miles:
MATCH (:Station {name: 'Peckham Rye'})
(()-[link:LINK]-(s) WHERE link.distance <= 2)+
(:Station {name: 'London Victoria'})
UNWIND s AS station
RETURN station.name AS callingPoint
callingPoint |
---|
|
|
|
|
|
Rows: 5 |
WHERE
clauses inside node patterns can themselves include path patterns.
The following query using an EXISTS subquery to anchor on the last Stop
in a sequence of Stops
, and returns the departure times, arrival times and final destination of all services calling at Denmark Hill
:
MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-(s1:Stop)-[:NEXT]->+
(sN:Stop WHERE NOT EXISTS { (sN)-[:NEXT]->(:Stop) })-[:CALLS_AT]->
(d:Station)
RETURN s1.departs AS departure, sN.arrives AS arrival,
d.name AS finalDestination
departure | arrival | finalDestination |
---|---|---|
|
|
|
|
|
|
Rows: 2 |
Node variables declared inside quantified path patterns become bound to lists of nodes, which can be unwound and used in subsequent MATCH
clauses.
The following query lists the calling points of the Peckham Rye
to Battersea Park
train service:
MATCH (:Station {name: 'Peckham Rye'})<-[:CALLS_AT]-(:Stop)
(()-[:NEXT]->(s:Stop))+
()-[:CALLS_AT]->(:Station {name: 'Battersea Park'})
UNWIND s AS stop
MATCH (stop)-[:CALLS_AT]->(station:Station)
RETURN stop.arrives AS arrival, station.name AS callingPoint
arrival | callingPoint |
---|---|
|
|
|
|
|
|
|
|
Rows: 4 |
Repeating a node variable in a path pattern enables the same node to be bound more than once in a path (see equijoins).
The following query finds all stations that are on a cycle (i.e., pass through the same Station
more than once) formed by the LINK
between Stations
:
MATCH (n:Station)-[:LINK]-+(n)
RETURN DISTINCT n.name AS station
station |
---|
|
|
|
|
|
|
Rows: 6 |
Complex, non-linear paths can be matched using graph patterns, a comma separated list of path patterns that are connected via repeated node variables, i.e. equijoins.
For example, a passenger is traveling from Denmark Hill
and wants to join the train service to London Victoria
that leaves Clapham Junction
at 22:46
.
The following query finds the departure time from Denmark Hill
as well as the changeover Station
and time of arrival:
MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
(s1:Stop)-[:NEXT]->+(s2:Stop)-[:CALLS_AT]->
(c:Station)<-[:CALLS_AT]-(x:Stop),
(:Station {name: 'Clapham Junction'})<-[:CALLS_AT]-
(t1:Stop)-[:NEXT]->+(x)-[:NEXT]->+(:Stop)-[:CALLS_AT]->
(:Station {name: 'London Victoria'})
WHERE t1.departs = time('22:46')
AND s2.arrives < x.departs
RETURN s1.departs AS departure, s2.arrives AS changeArrival,
c.name AS changeAt
departure | changeArrival | changeAt |
---|---|---|
|
|
|
Rows: 1 |
For more information, see Variable-length patterns.
Matching shortest paths
The shortest path between two nodes can be found using the SHORTEST
keyword:
MATCH p = SHORTEST 1
(:Station {name: "Brixton"})
(()-[:LINK]-(:Station))+
(:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route
route |
---|
|
Rows: 1 |
To find all shortest paths, the ALL SHORTEST
keywords can be used:
MATCH p = ALL SHORTEST
(:Station {name: "Denmark Hill"})
(()-[:LINK]-(:Station))+
(:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route
route |
---|
|
|
Rows: 2 |
In general, SHORTEST k
can be used to return the k
shortest paths.
The following returns the two shortest paths:
MATCH p = SHORTEST 2
(:Station {name: "Denmark Hill"})
(()-[:LINK]-(:Station))+
(:Station {name: "Clapham High Street"})
RETURN [station IN nodes(p) | station.name] AS route
route |
---|
|
|
Rows: 2 |
For more information, see Shortest paths.