Shortest paths
The Cypher^{®} keyword SHORTEST
is used to find variations of the shortest paths between nodes.
This includes the ability to look for the shortest, secondshortest (and so on) paths, all available shortest paths, and groups of paths containing the same pattern length.
The ANY
keyword, which can be used to test the reachability of nodes from a given node(s), is also explained, as is how to apply filters in queries using SHORTEST
.
SHORTEST functionally replaces and extends the shortestPath() and allShortestPaths() functions.
Both functions can still be used, but they are not GQL conformant.
For more information, see Syntax and semantics → The shortestPath() and allShortestPaths() functions.

Note on Cypher and GDS shortest paths
Both Cypher and Neo4j´s Graph Data Science (GDS) library can be used to find variations of the shortest paths between nodes.
Use Cypher if:

You need to specify complex graph navigation via quantified path patterns.

Creating a graph projection takes too long.

GDS is not available in your instance, or the size of the GDS projection is too large for your instance.
Use GDS if:
To read more about the shortest path algorithms included in the GDS library, see GDS Graph algorithms → Path finding.
SHORTEST k
This section uses the following graph:
To recreate it, run the following query against an empty Neo4j database:
CREATE (asc:Station {name:"Ashchurch"}),
(bmv:Station {name:"Bromsgrove"}),
(cnm:Station {name:"Cheltenham Spa"}),
(dtw:Station {name:"Droitwich Spa"}),
(hby:Station {name:"Hartlebury"}),
(psh:Station {name:"Pershore"}),
(wop:Station {name:"Worcestershire Parkway"}),
(wof:Station {name:"Worcester Foregate Street"}),
(wos:Station {name:"Worcester Shrub Hill"})
CREATE (asc)[:LINK {distance: 7.25}]>(cnm),
(asc)[:LINK {distance: 11.29}]>(wop),
(asc)[:LINK {distance: 14.75}]>(wos),
(bmv)[:LINK {distance: 31.14}]>(cnm),
(bmv)[:LINK {distance: 6.16}]>(dtw),
(bmv)[:LINK {distance: 12.6}]>(wop),
(dtw)[:LINK {distance: 5.64}]>(hby),
(dtw)[:LINK {distance: 6.03}]>(wof),
(dtw)[:LINK {distance: 5.76}]>(wos),
(psh)[:LINK {distance: 4.16}]>(wop),
(wop)[:LINK {distance: 3.71}]>(wos),
(wof)[:LINK {distance: 0.65}]>(wos)
The paths matched by a path pattern can be restricted to only the shortest (by number of hops) by including the keyword SHORTEST k
, where k
is the number of paths to match.
For example, the following example uses SHORTEST 1
to return the length of the shortest path between Worcester Shrub Hill
and Bromsgrove
:
MATCH p = SHORTEST 1 (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN length(p) AS result
Note that this and the following examples in this section use a quantified relationship [:LINK]+ , which is composed of a relationship pattern [:LINK] and a postfix quantifier + . The relationship pattern is only concerned with following relationships with type LINK , and will otherwise traverse any node along the way. There is no arrowhead < or > on the relationship pattern, allowing the pattern to match relationships going in either direction. This represents the fact that trains can go in both directions along the LINK relationships between Stations. The + quantifier means that one or more relationships should be matched. For more information, see Syntax and semantics  quantified relationships.

result 


Rows: 1 
Although the query returned a single result, there are in fact two paths that are tied for shortest:
Because 1
was specified in SHORTEST
, only one of the paths is returned.
Which one is returned is nondeterministic.
If instead SHORTEST 2
is specified, all shortest paths in this example would be returned, and the result would be deterministic:
MATCH p = SHORTEST 2 (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p)  n.name] AS stops
stops 



Rows: 2 
Increasing the number of paths will return the next shortest paths. Three paths are tied for the second shortest:
The following query returns all three of the second shortest paths, along with the two shortest paths:
MATCH p = SHORTEST 5 (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p)  n.name] AS stops
stops 






Rows: 5 
If there had been only four possible paths between the two Stations, then only those four would have been returned.
ALL SHORTEST
To return all paths that are tied for shortest length, use the keywords ALL SHORTEST
:
MATCH p = ALL SHORTEST (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p)  n.name] AS stops
stops 



Rows: 2 
SHORTEST k GROUPS
To return all paths that are tied for first, second, and so on up to the kth shortest length, use SHORTEST k GROUPS
.
For example, the following returns the first and second shortest length paths between Worcester Shrub Hill
and Bromsgrove
:
MATCH p = SHORTEST 2 GROUPS (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p)  n.name] AS stops, length(p) AS pathLength
stops  pathLength 











Rows: 5 
The first group includes the two shortest paths with pathLength = 2
(as seen in the first two rows of the results), and the second group includes the three second shortest paths with pathLength = 3
(as seen in the last three rows of the results).
If more groups are specified than exist in the graph, only those paths that exist are returned.
For example, if the paths equal to one of the eight shortest paths are specified for Worcester Shrub Hill
to Bromsgrove
, only seven groups are returned:
MATCH p = SHORTEST 8 GROUPS (wos:Station)[:LINK]+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN length(p) AS pathLength, count(*) AS numPaths
pathLength  numPaths 















Rows: 7 
ANY
The ANY
keyword can be used to test the reachability of nodes from a given node(s).
It returns the same as SHORTEST 1
, but by using the ANY
keyword the intent of the query is clearer.
For example, the following query shows that there exists a route from Pershore
to Bromsgrove
where the distance between each pair of stations is less than 10 miles:
MATCH path = ANY
(:Station {name: 'Pershore'})[l:LINK WHERE l.distance < 10]+(b:Station {name: 'Bromsgrove'})
RETURN [r IN relationships(path)  r.distance] AS distances
distances 


Rows: 1 
Partitions
When there are multiple start or end nodes matching a path pattern, the matches are partitioned into distinct pairs of start and end nodes prior to selecting the shortest paths; a partition is one distinct pair of start node and end node. The selection of shortest paths is then done from all paths that join the start and end node of a given partition. The results are then formed from the union of all the shortest paths found for each partition.
For example, if the start nodes of matches are bound to either Droitwich Spa
or Hartlebury
, and the end nodes are bound to either Ashchurch
or Cheltenham Spa
, there will be four distinct pairs of start and end nodes, and therefore four partitions:
Start node  End node 









The following query illustrates how these partitions define the sets of results within which the shortest paths are selected.
It uses a pair of UNWIND
clauses to generate a Cartesian product of the names of the Stations
(all possible pairs of start node and end node), followed by the MATCH
clause to find the shortest two groups of paths for each pair of distinct start and end Stations
:
UNWIND ["Droitwich Spa", "Hartlebury"] AS a
UNWIND ["Ashchurch", "Cheltenham Spa"] AS b
MATCH SHORTEST 2 GROUPS (o:Station {name: a})[l]+(d:Station {name: b})
RETURN o.name AS start, d.name AS end,
size(l) AS pathLength, count(*) AS numPaths
ORDER BY start, end, pathLength
start  end  pathLength  numPaths 

































Rows: 8 
Each partition appears twice: once for the group of shortest paths and once for the group of second shortest paths.
For example, for the partition of Droitwich Spa
as the start
and Ashchurch
as the end
, the shortest path group (paths with length 2
) has one path, and the second shortest path group (paths with length 3
) has four paths.
Prefilters and postfilters
The position of a filter in a shortest path query will affect whether it is applied before or after selecting the shortest paths.
To see the difference, first consider a query that returns the shortest path from Hartlebury
to Cheltenham Spa
:
MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()(n))+
(:Station {name: 'Cheltenham Spa'})
RETURN [stop in n[..1]  stop.name] AS stops
stops 


Rows: 1 
Note that n[..1]
is a slicing operation that returns all elements of n
except the last.
If instead, the query uses a WHERE
clause at the MATCH
level to filter out routes that go via Bromsgrove, the filtering is applied after the shortest paths are selected.
This results in the only solution being removed, and no results being returned:
MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()(n:Station))+
(:Station {name: 'Cheltenham Spa'})
WHERE none(stop IN n[..1] WHERE stop.name = 'Bromsgrove')
RETURN [stop in n[..1]  stop.name] AS stops
stops 

Rows: 0 
There are two ways to turn a postfilter without solutions into a prefilter that returns solutions. One is to inline the predicate into the path pattern:
MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()(n:Station WHERE n.name <> 'Bromsgrove'))+
(:Station {name: 'Cheltenham Spa'})
RETURN [stop in n[..1]  stop.name] AS stops
stops 


Rows: 1 
The shortest journey that avoids Bromsgrove
is now returned.
An alternative is to wrap the path pattern and filter in parentheses (leaving the SHORTEST
keyword on the outside):
MATCH SHORTEST 1
( (:Station {name: 'Hartlebury'})
(()(n:Station))+
(:Station {name: 'Cheltenham Spa'})
WHERE none(stop IN n[..1] WHERE stop.name = 'Bromsgrove') )
RETURN [stop in n[..1]  stop.name] AS stops
stops 


Rows: 1 
Prefilter with a path variable
The previous section showed how to apply a filter before the shortest path selection by the use of parentheses. Placing a path variable declaration before the shortest path keywords, however, places it outside the scope of the parentheses. To reference a path variable in a prefilter, it has to be declared inside the parentheses.
To illustrate, consider this example that returns all shortest paths from Hartlebury
to each of the other Stations
:
MATCH p = SHORTEST 1 (:Station {name: 'Hartlebury'})+(b:Station)
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination
destination  pathLength 

















Rows: 8 
If the query is altered to only include routes that have an even number of stops, adding a WHERE
clause at the MATCH
level will not work, because it would be a postfilter.
It would return the results of the previous query with all routes with an odd number of stops removed:
MATCH p = SHORTEST 1 (:Station {name: 'Hartlebury'})+(b:Station)
WHERE length(p) % 2 = 0
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination
destination  pathLength 









Rows: 4 
To move the predicate to a prefilter, the path variable should be referenced from within the parentheses, and the shortest routes with an even number of stops will be returned for all the destinations:
MATCH SHORTEST 1
(p = (:Station {name: 'Hartlebury'})+(b:Station)
WHERE length(p) % 2 = 0 )
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination
destination  pathLength 

















Rows: 8 
Planning shortest path queries
This section describes the operators used when planning shortest path queries. For readers not familiar with Cypher execution plans and operators, it is recommended to first read the section Understanding execution plans.
There are two operators used to plan SHORTEST
queries:

StatefulShortestPath(All)
 uses a unidirectional breadthfirst search algorithm to find shortest paths from a previously matched start node to an end node that has not yet been matched. 
StatefulShortestPath(Into)
 uses a bidirectional breadthfirst search (BFS) algorithm, where two simultaneous BFS invocations are performed, one from the left boundary node and one from the right boundary node.
StatefulShortestPath(Into)
is used by the planner when both boundary nodes in the shortest path are estimated to match at most one node each.
Otherwise, StatefulShortestPath(All)
is used.
For example, the planner estimates that the left boundary node in the below query will match one node, and the right boundary node will match five nodes,
and chooses to expand from the left boundary node. Using StatefulShortestPath(Into)
would require five bidirectional breadthfirst search (BFS) invocations,
whereas StatefulShortestPath(All)
would require only one unidirectional BFS invocation.
As a result, the query will use StatefulShortestPath(All)
.
StatefulShortestPath(All)
PROFILE
MATCH
p = SHORTEST 1 (a:Station {name: "Worcestershire Parkway"})(()[]()[]()){1,}(b:Station)
RETURN p
+++++++++++  Operator  Id  Details  Estimated Rows  Rows  DB Hits  Memory (Bytes)  Page Cache Hits/Misses  Time (ms)  Pipeline  +++++++++++  +ProduceResults  0  p  5  9  122  0  0/0  10.967     +++++++++   +Projection  1  (a) ((anon_12)[anon_14](anon_13)[anon_11]())* (b) AS p  5  9  0   0/0  0.063     +++++++++   +StatefulShortestPath(All)  2  SHORTEST 1 (a) ((`anon_5`)[`anon_6`](`anon_7`)[`anon_8`](`anon_9`)){1, } (b)  5  9  80  18927  0/0  1.071  In Pipeline 1      expanding from: a             inlined predicates: b:Station           ++++++++++  +Filter  3  a.name = $autostring_0  1  1  18        +++++++     +NodeByLabelScan  4  a:Station  10  9  10  376  3/0  0.811  Fused in Pipeline 0  +++++++++++
However, the heuristic to favor StatefulShortestPath(All)
can lead to worse query performance.
To have the planner choose the StatefulShortestPath(Into)
instead, rewrite the query using a CALL
subquery, which will execute once for each incoming row.
For example, in the below query, using a CALL
subquery ensures that the planner binds a
and b
to exactly one Station
node respectively for each executed row, and this forces it to use StatefulShortestPath(Into)
for each invocation of the CALL
subquery, since a precondition of using this operator is that both boundary nodes match exactly one node each.
StatefulShortestPath(Into)
PROFILE
MATCH
(a:Station {name: "Worcestershire Parkway"}),
(b:Station)
CALL {
WITH a, b
MATCH
p = SHORTEST 1 (a)(()[]()[]()){1,}(b)
RETURN p
}
RETURN p
+++++++++++  Operator  Id  Details  Estimated Rows  Rows  DB Hits  Memory (Bytes)  Page Cache Hits/Misses  Time (ms)  Pipeline  +++++++++++  +ProduceResults  0  p  5  9  122  0  0/0  0.561     +++++++++   +Projection  1  (a) ((anon_12)[anon_14](anon_13)[anon_11]())* (b) AS p  5  9  0   0/0  0.060     +++++++++   +StatefulShortestPath(Into)  2  SHORTEST 1 (a) ((`anon_5`)[`anon_6`](`anon_7`)[`anon_8`](`anon_9`)){1, } (b)  5  9  176  17873  0/0  2.273  In Pipeline 3    ++++++++++  +CartesianProduct  3   5  9  0  2056  0/0  0.048  In Pipeline 2   \ ++++++++++   +NodeByLabelScan  4  b:Station  10  9  10  392  1/0  0.023  In Pipeline 1    ++++++++++  +Filter  5  a.name = $autostring_0  1  1  18        +++++++     +NodeByLabelScan  6  a:Station  10  9  10  376  3/0  0.089  Fused in Pipeline 0  +++++++++++
Sometimes the planner cannot make reliable estimations about how many nodes a pattern node will match. Consider using a uniqueness constraint where applicable to help the planner get more reliable estimates. 