7.5. The A* algorithm

This section describes the A* algorithm in the Neo4j Graph Algorithms library.

The A* (pronounced “A-star”) algorithm improves on the classic Dijkstra algorithm. It is based upon the observation that some searches are informed, and that by being informed we can make better choices over which paths to take through the graph.

This section includes:

7.5.1. History and explanation

The A* algorithm was first described in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael. For more information, see A Formal Basis for the Heuristic Determination of Minimum Cost Paths.

In A*, we split the path cost into two parts:

g(n)
This is the cost of the path from the starting point to some node n.
h(n)
This represents the estimated cost of the path from the node n to the destination node, as computed by a heuristic (an intelligent guess).

The A* algorithm balances g(n) and h(n) as it iterates the graph, thereby ensuring that at each iteration it chooses the node with the lowest overall cost f(n) = g(n) + h(n).

In our implementation, geospatial distance is used as heurestic.

7.5.2. Use-cases - when to use the A* algorithm

  • The A* algorithm can be used to find shortest paths between single pairs of locations, where GPS coordinates are known.

7.5.3. A* algorithm sample

The following will create a sample graph: 

MERGE (a:Station{name:"King's Cross St. Pancras"})
SET a.latitude = 51.5308,a.longitude = -0.1238
MERGE (b:Station{name:"Euston"})
SET b.latitude = 51.5282, b.longitude = -0.1337
MERGE (c:Station{name:"Camden Town"})
SET c.latitude = 51.5392, c.longitude = -0.1426
MERGE (d:Station{name:"Mornington Crescent"})
SET d.latitude = 51.5342, d.longitude = -0.1387
MERGE (e:Station{name:"Kentish Town"})
SET e.latitude = 51.5507, e.longitude = -0.1402
MERGE (a)-[:CONNECTION{time:2}]->(b)
MERGE (b)-[:CONNECTION{time:3}]->(c)
MERGE (b)-[:CONNECTION{time:2}]->(d)
MERGE (d)-[:CONNECTION{time:2}]->(c)
MERGE (c)-[:CONNECTION{time:2}]->(e);

The following will run the algorithm and stream results: 

MATCH (start:Station{name:"King's Cross St. Pancras"}),(end:Station{name:"Kentish Town"})
CALL algo.shortestPath.astar.stream(start, end, 'time', 'latitude', 'longitude', {defaultValue:1.0})
YIELD nodeId, cost
RETURN algo.asNode(nodeId).name as station,cost

Table 7.19. Results
Name Cost

King’s Cross St. Pancras

0

Euston

2

Camden Town

5

Kentish Town

7

7.5.4. Cypher projection

If label and relationship-type are not selective enough to describe your subgraph to run the algorithm on, you can use Cypher statements to load or project subsets of your graph. This can also be used to run algorithms on a virtual graph. You can learn more in the Section 2.2, “Cypher projection” section of the manual.

Set graph:'cypher' in the config: 

MATCH (start:Station{name:"King's Cross St. Pancras"}),(end:Station{name:"Kentish Town"})
CALL algo.shortestPath.astar.stream(start, end, 'time','latitude','longitude',{
nodeQuery:'MATCH (p:Station) RETURN id(p) as id',
relationshipQuery:'MATCH (p1:Station)-[r:CONNECTION]->(p2:Station) RETURN id(p1) as source, id(p2) as target,r.time as weight',
graph:'cypher'})
YIELD nodeId, cost
RETURN nodeId,cost

7.5.5. Syntax

The following will run the algorithm and stream results: 

CALL algo.shortestPath.astar.stream((startNode:Node, endNode:Node, weightProperty:String, propertyKeyLat:String, propertyKeyLon:String,
    {nodeQuery:'labelName', relationshipQuery:'relationshipName', direction:'BOTH', defaultValue:1.0})
YIELD nodeId, cost

Table 7.20. Parameters
Name Type Default Optional Description

startNode

node

null

no

The start node

endNode

node

null

no

The end node

weightProperty

string

null

yes

The property name that contains weight

propertyKeyLat

string

null

no

The property name that contains latitude coordinate

propertyKeyLon

string

null

no

The property name that contains longitude coordinate

nodeQuery

string

null

yes

The label to load from the graph. If null, load all nodes

relationshipQuery

string

null

yes

The relationship-type to load from the graph. If null, load all nodes

defaultValue

float

null

yes

The default value of the weight in case it is missing or invalid

direction

string

outgoing

yes

The relationship direction to load from the graph. If 'both', treats the relationships as undirected

Table 7.21. Results
Name Type Description

nodeId

int

Node ID

cost

int

The cost it takes to get from start node to specific node

7.5.6. Graph type support

The Shortest Path algorithms support the following graph types:

  • ✓ directed, unweighted:

    • direction: 'OUTGOING' or INCOMING, weightProperty: null
  • ✓ directed, weighted

    • direction: 'OUTGOING' or INCOMING, weightProperty: 'cost'
  • ✓ undirected, unweighted

    • direction: 'BOTH', weightProperty: null
  • ✓ undirected, weighted

    • direction: 'BOTH', weightProperty: 'cost'

7.5.7. Implementations

algo.shortestPath.astar.stream()

  • Implementation of A* heuristic function is for geospatial distances.