###### By Neo4j Staff | August 19, 2013

As you may know by now, I like beer. A lot – why else would I keep writing and talking about it? But there’s more to life than sweet beverages, and one of the things that I have been doing for as long as I can remember is Orienteering. I have been practicing the sport in Belgium since 1984 – I was 11 years old. My dad used to take me to races all across the continent – we truly had a blast. And we still do: I still orienteer almost every week, and so does my dad. Now I take my 8 and 10-year old kids with me to the races, and their granddad cheers them on every step of the way. It’s a fantastic family sport.

One of the reasons why it is so fantastic is that – Orienteering is a “thinking sport”. You have to concentrate to navigate. You have to run to have the best time (it’s a race), but if you run too fast, you are sure to make navigation mistakes. You have to find the balance between physical and mental fitness – which is hard but completely awesome when you succeed. And: it’s outdoors – in the woods and fields. What’s not to like?

So what does that have to do with neo4j? Well, orienteering is all about “

**”: the *fastest* route from start to finish. Fast can be short. Fast can also mean that it is better to take a detour: if it is easier to run the longer route, than to walk the shorter route, you are better off choosing the longer route. In essence, every orienteering race is … a graph problem waiting to be solved in the middle of nature.**__finding the shortest path__

## Orienteering = a green graph problem

In case you don’t know: orienteering races are a bit like an obstacle race. Every participant gets assigned a course, out there in the green forests and fields, and along that course are sequences of beacons that one needs to get to in order. Such a sequence is … a path on a graph – you have to choose how to navigate from obstacle to obstacle, from node to node.

Essentially, the orienteer has to navigate and choose the

**fastest**route. Finding the fastest route effectively this__boils down to a “weighted shortest path” calculation__. You calculate the shortest path using- distance: shorter = better
- runnability: higher = better. Runnability can be affected by the type of terrain (running on a road? through a field? through a forest? through a forest with soil covered with plants? over a hill? through a valley? …)

as your parameters. For every “leg” of the race, you estimate the presumable “best route” based on the assumption that distance / runnability will be the best indicator for your likely speed.

## Example: a 2 control race in Antwerp, Belgium

As you can see – the race assignment is a graph.

If we then look at every leg separately, you can see that for every leg, there are a number of route options.

The red route is the safe choice – running along the roads – but takes quite a detour. The blue route cuts straight across the field – but then requires me to go straight through the forest for a short distance. | The red route is the shortest – but requires me to run straight through the forest. The blue route just races along the forest road. | The red route just goes straight to the finish line. The blue route cuts through the forest and then follows the road. The green route safely hurries along the roads. |

So 3 controls, and different routes with different characteristics. As you can see in the schematic representations, every route has different “waypoints” – specific points of interest that I can identify on the map, and recognize in the “field”. These waypoints are extremely important for the navigation exercise that we are doing – they allow us to break the problem up in smaller pieces and evaluate our options.

Intuitively, all of us will have a “feeling” about what would be the best route choice, but now let’s use graph algorithms to do this for us!

## Graph database model to navigate

In order to apply a graph algorithm, we first need to create a graph. These are my nodes:

- Control nodes: the race beacons that I have to pass by
- The alternative route choices, decomposed in waypoints.

Then, let’s create the relationships between these nodes. We will have “COURSE_TO” relationships between controls, and “NAVIGATE_TO” relationships between waypoints. Effectively, these will become “paths” on my graph, hopping from node to node along the relationships.

- From the start to control 1: I have
**0->0.11->0.12->0.13->0.14->0.15**as one route and**0->0.21->0.22->1**as another route - From control 1 to control 2 I have
**1->1.11->1.12->1.13->2**as one route option, and**1->0.15->1.21->2**as another option. - From control 2 to the finish I have 3 options:
**2->2.11->3**,**2->2.21->2.22->3**and**2->1.21->2.31->2.32->3**.

As you can see, I have immediately added “distance” (in meters) and “runnability” (in %) properties to my relationships.

When I then generate a neo4j database using the spreadsheet method, I get a nice little database – ready to be queried and ready for my algorithms.

## Graphs algorithms to win the race!

In order to calculate the best route to win the race, I need to calculate the shortest path across the graph – which is standard functionality of neo4j. But because there’s more to it than running the course in straight lines between controls, I need to incorporate**weights**(distance, runnability) to get a realistic estimate of what would be the best route choice. To do so I am using a technique so well demonstrated by Ian Robinson on his blog last June.

Let’s go through two versions of this calculation:

- find the shortest path by distance only:

START startNode=node:node_auto_index(name=”Start”),

endNode=node:node_auto_index(name=”Finish”)

MATCH p=(startNode)-[:NAVIGATE_TO*]->(endNode)

RETURN p AS shortestPath,

reduce(distance=0, r in relationships(p) : distance+r.distance) AS totalDistance

ORDER BY totalDistance ASC

LIMIT 1;

- find the best estimate of the fastest path, as a function of distance/runnability. In a real race this would probably be the route that I would choose – as it would give me the best chance of winning the race.

To do this, we will be using ‘reduce’ to sum the distance divided by the runnability: longer distance with superior runnability, is possibly faster than shorter distance with lower runnability:

START startNode=node:node_auto_index(name=”Start”),

endNode=node:node_auto_index(name=”Finish”)

MATCH p=(startNode)-[:NAVIGATE_TO*]->(endNode)

RETURN p AS shortestPath,

reduce(EstimatedTime=0, r in relationships(p) : EstimatedTime+(r.distance/r.runnability)) AS TotalEstimatedTime

ORDER BY TotalEstimatedTime ASC

LIMIT 1;

As you can see, the first and second queries take the same (shortest) path for start to control 1 and from control 2 to finish, but recommends a clearly different path from control 1 to control 2 (following the forest road instead of cutting through the forest).

# Many applications for weighted shortest paths!

Obviously Orienteering is not a business application, but in logistics, planning, impact analysis and many other applications, weighted shortest path algorithms will have a great potential. Whether it is to find out how things are related to eachother, determining the most efficient way to get something from point A to point B, or finding out who would be affected by a particular type of capacity outage – the approach that I used for my orienteering problem would work just as nicely!

Keywords: graph algorithms • orienteering • shortest path • weighted shortest path

### 1 Comment

### Leave a Reply

### Related Posts

### Popular Graph Topics

- graphdb (37)
- release (29)
- community (24)
- cypher (22)
- graphconnect (13)
- neo4j (12)
- nosql (11)
- milestone (10)
- graph databases (10)
- graphgist (9)

### Archives

### Have a Graph Question?

Reach out and connect with the Neo4j staff.

Stackoverflow

Contact Us

Nice use of reduce.