Primer

This section contains a primer covering some fundamental features of graph pattern matching with Cypher^® queries.

Example graph

The example graph used in this tutorial is a model of train Stations, and different train services with Stops that call at the Stations.

To recreate the graph, run the following query against an empty Neo4j database:

Query

CREATE (n1:Station {name: 'Denmark Hill'}),
(n5:Station {name: 'Battersea Park'}),
(n6:Station {name: 'Wandsworth Road'}),
(n15:Station {name: 'Clapham High Street'}),
(n16:Station {name: 'Peckham Rye'}),
(n17:Station {name: 'Brixton'}),
(n14:Station {name: 'London Victoria'}),
(n18:Station {name: 'Clapham Junction'}),
(p10:Stop {departs: time('22:37'), arrives: time('22:36')}),
(p0:Stop {departs: time('22:41'), arrives: time('22:41')}),
(p2:Stop {departs: time('22:43'), arrives: time('22:43')}),
(p17:Stop {arrives: time('22:50'), departs: time('22:50')}),
(p18:Stop {arrives: time('22:46'), departs: time('22:46')}),
(p19:Stop {departs: time('22:33'), arrives: time('22:31')}),
(p21:Stop {arrives: time('22:55')}),
(p20:Stop {departs: time('22:44'), arrives: time('22:43')}),
(p22:Stop {arrives: time('22:55')}),
(p23:Stop {arrives: time('22:48')}),
(n15)-[:LINK {distance: 1.96}]->(n1)-[:LINK {distance: 0.86}]->(n16),
(n15)-[:LINK {distance: 0.39}]->(n6)<-[:LINK {distance: 0.7}]-(n5)-[:LINK {distance: 1.24}]->(n14), (n5)-[:LINK {distance: 1.45}]->(n18),
(n14)<-[:LINK {distance: 3.18}]-(n17)-[:LINK {distance: 1.11}]->(n1),
(p2)-[:CALLS_AT]->(n6), (p17)-[:CALLS_AT]->(n5), (p19)-[:CALLS_AT]->(n16),
(p22)-[:CALLS_AT]->(n14), (p18)-[:CALLS_AT]->(n18), (p0)-[:CALLS_AT]->(n15), (p23)-[:CALLS_AT]->(n5), (p20)-[:CALLS_AT]->(n1),
(p21)-[:CALLS_AT]->(n14), (p10)-[:CALLS_AT]->(n1), (p19)-[:NEXT]->(p10)-[:NEXT]->(p0)-[:NEXT]->(p2)-[:NEXT]->(p23),
(p22)<-[:NEXT]-(p17)<-[:NEXT]-(p18), (p21)<-[:NEXT]-(p20)

Matching fixed-length paths

An empty pair of parentheses is a node pattern that will match any node. This example gets a count of all the nodes in the graph:

MATCH ()
RETURN count(*) AS numNodes

Table 1. Result
numNodes
`18`
Rows: 1

Adding a label to the node pattern will filter on nodes with that label (see label expressions). The following query gets a count of all the nodes with the label Stop:

MATCH (:Stop)
RETURN count(*) AS numStops

Table 2. Result
numStops
`10`
Rows: 1

Path patterns can match relationships and the nodes they connect. The following query gets the arrival time of all trains calling at Denmark Hill:

MATCH (s:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN s.arrives AS arrivalTime

Table 3. Result
arrivalTime
`"22:36:00Z"`
`"22:43:00Z"`
Rows: 2

Path patterns can include inline WHERE clauses. The following query gets the next calling point of the service that departs Denmark Hill at 22:37:

MATCH (n:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
        (s:Stop WHERE s.departs = time('22:37'))-[:NEXT]->
        (:Stop)-[:CALLS_AT]->(d:Station)
RETURN d.name AS nextCallingPoint

Table 4. Result
nextCallingPoint
`"Clapham High Street"`
Rows: 1

For more information, see Fixed length patterns.

Matching variable-length paths

Variable-length paths that only traverse relationships with a specified type can be matched with quantified relationships. Any variable declared in the relationship pattern will return a list of the relationships traversed. The following query returns the total distance traveled via all LINKs connecting the stations Peckham Rye and Clapham Junction:

Query

MATCH (:Station {name: 'Peckham Rye'})-[link:LINK]-+
        (:Station {name: 'Clapham Junction'})
RETURN reduce(acc = 0.0, l IN link | round(acc + l.distance, 2)) AS
         totalDistance

Table 5. Result
totalDistance
`7.84`
`5.36`
Rows: 2

-[:LINK]-+ is a quantified relationship. It is composed of a relationship pattern -[:LINK]- that matches relationships going in either direction, and a quantifier + that means it will match one or more relationships. As no node patterns are included with quantified relationships, they will match any intermediate nodes.

Variable-length paths can also be matched with quantified path patterns, which allow both WHERE clauses and accessing the nodes traversed by the path. The following query returns a list of calling points on routes from Peckham Rye to London Victoria, where no distance between stations is greater than two miles:

Query

MATCH (:Station {name: 'Peckham Rye'})
      (()-[link:LINK]-(s) WHERE link.distance <= 2)+
      (:Station {name: 'London Victoria'})
UNWIND s AS station
RETURN station.name AS callingPoint

Table 6. Result
callingPoint
`"Denmark Hill"`
`"Clapham High Street"`
`"Wandsworth Road"`
`"Battersea Park"`
`"London Victoria"`
Rows: 5

WHERE clauses inside node patterns can themselves include path patterns. The following query using an EXISTS subquery to anchor on the last Stop in a sequence of Stops, and returns the departure times, arrival times and final destination of all services calling at Denmark Hill:

Query

MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-(s1:Stop)-[:NEXT]->+
        (sN:Stop WHERE NOT EXISTS { (sN)-[:NEXT]->(:Stop) })-[:CALLS_AT]->
        (d:Station)
RETURN s1.departs AS departure, sN.arrives AS arrival,
       d.name AS finalDestination

Table 7. Result
departure	arrival	finalDestination
`'22:37:00Z'`	`'22:48:00Z'`	`"Battersea Park"`
`'22:44:00Z'`	`'22:55:00Z'`	`"London Victoria"`
Rows: 2

Node variables declared inside quantified path patterns become bound to lists of nodes, which can be unwound and used in subsequent MATCH clauses. The following query lists the calling points of the Peckham Rye to Battersea Park train service:

Query

MATCH (:Station {name: 'Peckham Rye'})<-[:CALLS_AT]-(:Stop)
      (()-[:NEXT]->(s:Stop))+
      ()-[:CALLS_AT]->(:Station {name: 'Battersea Park'})
UNWIND s AS stop
MATCH (stop)-[:CALLS_AT]->(station:Station)
RETURN stop.arrives AS arrival, station.name AS callingPoint

Table 8. Result
arrival	callingPoint
`"22:36:00Z"`	`"Denmark Hill"`
`"22:41:00Z"`	`"Clapham High Street"`
`"22:43:00Z"`	`"Wandsworth Road"`
`"22:48:00Z"`	`"Battersea Park"`
Rows: 4

Repeating a node variable in a path pattern enables the same node to be bound more than once in a path (see equijoins). The following query finds all stations that are on a cycle (i.e., pass through the same Station more than once) formed by the LINK between Stations:

Query

MATCH (n:Station)-[:LINK]-+(n)
RETURN DISTINCT n.name AS station

Table 9. Result
station
`"Denmark Hill"`
`"Battersea Park"`
`"Wandsworth Road"`
`"Clapham High Street"`
`"Brixton"`
`"London Victoria"`
Rows: 6

Complex, non-linear paths can be matched using graph patterns, a comma separated list of path patterns that are connected via repeated node variables, i.e. equijoins. For example, a passenger is traveling from Denmark Hill and wants to join the train service to London Victoria that leaves Clapham Junction at 22:46. The following query finds the departure time from Denmark Hill as well as the changeover Station and time of arrival:

Query

MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
        (s1:Stop)-[:NEXT]->+(s2:Stop)-[:CALLS_AT]->
        (c:Station)<-[:CALLS_AT]-(x:Stop),
       (:Station {name: 'Clapham Junction'})<-[:CALLS_AT]-
         (t1:Stop)-[:NEXT]->+(x)-[:NEXT]->+(:Stop)-[:CALLS_AT]->
         (:Station {name: 'London Victoria'})
WHERE t1.departs = time('22:46')
      AND s2.arrives < x.departs
RETURN s1.departs AS departure, s2.arrives AS changeArrival,
       c.name AS changeAt

Table 10. Result
departure	changeArrival	changeAt
`"22:37:00Z"`	`"22:48:00Z"`	`"Battersea Park"`
Rows: 1

For more information, see Variable length patterns.

Matching shortest paths

The shortest path between two nodes can be found using the SHORTEST keyword:

Query

MATCH p = SHORTEST 1
  (:Station {name: "Brixton"})
  (()-[:LINK]-(:Station))+
  (:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route

Table 11. Result
route
`["Brixton", "London Victoria", "Battersea Park", "Clapham Junction"]`
Rows: 1

To find all shortest paths, the ALL SHORTEST keywords can be used:

Query

MATCH p = ALL SHORTEST
  (:Station {name: "Denmark Hill"})
  (()-[:LINK]-(:Station))+
  (:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route

Table 12. Result
route
`["Denmark Hill", "Clapham High Street", "Wandsworth Road", "Battersea Park", "Clapham Junction"]`
`["Denmark Hill", "Brixton", "London Victoria", "Battersea Park", "Clapham Junction"]`
Rows: 2

In general, SHORTEST k can be used to return the k shortest paths. The following returns the two shortest paths:

Query

MATCH p = SHORTEST 2
  (:Station {name: "Denmark Hill"})
  (()-[:LINK]-(:Station))+
  (:Station {name: "Clapham High Street"})
RETURN [station IN nodes(p) | station.name] AS route

Table 13. Result
route
`["Denmark Hill", "Clapham High Street"]`
`["Denmark Hill", "Brixton", "London Victoria", "Battersea Park", "Wandsworth Road", "Clapham High Street"]`
Rows: 2

For more information, see Shortest paths.