GraphGist: Preview

by Nicole White


This GraphGist has not yet been submitted and approved for publication. If you're the developer, please submit for publication using the GraphGist Portal.

In my optimization class last semester we briefly talked about project management, where there is a set of activities with given durations and some activities need to be completed before other activities can begin. We were taught to explore the management of the project’s timeline in Excel, which was tedious and prone to errors due to its manual process.

The Project

In this example, an insurance company is wanting to build an office LAN. They will want to know the ins and outs of the project; i.e., the earliest start / finish time and the latest start / finish time of each activity, as these details play critical roles in the management of the project. Ultimately, they will want to know the slack time of each activity and the critical path, which will be defined soon. The activities and their durations are given below.

Label Activity Description Duration (Days)
A Perform needs analysis 10
B Develop specifications 6
C Select server 6
D Select software 12
E Select cables 4
F Purchase equipment 3
G Develop user manuals 6
H Wire offices 12
I Set up server 3
J Develop training program 14
K Install software 4
L Connect network 3
M Train users 8
N Test and debug system 12
O Get management acceptance 4

The dependencies of each activity are represented in this graph:

So, for example, activities C and D (select server and select software) need to be completed before activity G (develop user manuals) can begin.

Some Definitions

Project managers are interested in the following properties of each activity, which I’ll define briefly and then add to the graph.

Earliest Finish Time

The earliest time an activity \(j\) can finish, \(EF_j\), is the activity’s earliest start time, \(ES_j\), plus its duration, \(d_j\):

\(EF_j = ES_j + d_j\)

Earliest Start Time

Because an activity cannot begin until all of its immediate predecessor activities are completed, the earliest start time of an activity \(j\), \(ES_j\), is the maximum of the earliest finish times of its \(i\) immediate predecessor activities:

\(ES_j = max(EF_i)\)

Latest Start Time

The latest time an activity \(j\) can start without delaying the project is its latest finish time, \(LF_j\), minus its duration, \(d_j\):

\(LS_j = LF_j - d_j\)

Latest Finish Time

The latest an activity \(i\) can finish without delaying the project, \(LF_i\), is the minimum of the latest start times of its \(j\) immediate successor activities:

\(LF_i = min(LS_j)\)

Project Completion Time, Critical Path, & Slack

The total time needed to complete the project is the earliest start time of the Finish node:

\(Project Completion Time = ES_{Finish}\)

This is also the longest path through the graph (from start to finish) in terms of cumulative duration, where this cumulative duration is the project completion time. This longest path is the critical path. The critical path consists of activities where an increase in any of their durations would increase the overall project completion time. These activities have 0 slack time; there is 0 room for an increase in their durations without affecting the overall project completion time. An activity \(j\) not on the critical path will have a positive slack time, which is the difference between its latest and earliest start time:

\(slack_j = LS_j - ES_j\)

Activity \(j\)'s duration can be increased by \(slack_j\) without affecting the overall project completion time.


\(ES_{Start} = EF_{Start} = LS_{Start} = LF_{Start} = 0\)

\(ES_{Finish} = EF_{Finish} = LS_{Finish} = LF_{Finish} = Project Completion Time\)

Data Setup

CREATE  (Start:Activity {id:1, description:'Start', duration:0, earliest_start:0, earliest_finish:0, latest_start:0, latest_finish:0}),
		(A:Activity {id:2, description:'Perform needs analysis', duration:10}),
		(B:Activity {id:3, description:'Develop specifications', duration:6}),
		(C:Activity {id:4, description:'Select server', duration:6}),
		(D:Activity {id:5, description:'Select software', duration:12}),
		(E:Activity {id:6, description:'Select cables', duration:4}),
		(F:Activity {id:7, description:'Purchase equipment', duration:3}),
		(G:Activity {id:8, description:'Develop user manuals', duration:6}),
		(H:Activity {id:9, description:'Wire offices', duration:12}),
		(I:Activity {id:10, description:'Set up server', duration:3}),
		(J:Activity {id:11, description:'Develop training program', duration:14}),
		(K:Activity {id:12, description:'Install software', duration:4}),
		(L:Activity {id:13, description:'Connect network', duration:3}),
		(M:Activity {id:14, description:'Train users', duration:8}),
		(N:Activity {id:15, description:'Test and debug system', duration:12}),
		(O:Activity {id:16, description:'Get management acceptance', duration:4}),
		(Finish:Activity {id: 17, description:'Finish', duration:0})

CREATE  (Start)-[:PRECEDES]->(A),
Loading graph...

Exploring the Basics of the Project

Cypher can easily answer some basic questions about the project.

Immediate Dependencies of an Activity

Suppose we want to know the immediate predecessors of activity M (training the users):

MATCH p = (:Activity)-[:PRECEDES]->(:Activity {description:'Train users'})
Loading graph...

All Dependencies of an Activity

Suppose we want to know all the activities that need to be completed before activity G (develop user manuals) can begin:

MATCH p = (:Activity)-[:PRECEDES*]->(:Activity {description:'Develop user manuals'})
Loading graph...

Project Completion Time

The overall project completion time, as mentioned earlier, is the longest path from start to finish in terms of cumulative duration:

MATCH p = (:Activity {description:'Start'})-[:PRECEDES*]->(:Activity {description:'Finish'})
WITH p, REDUCE(x = 0, a IN NODES(p) | x + a.duration) AS cum_duration
ORDER BY cum_duration DESC
RETURN cum_duration AS `Project Completion Time`
Loading table...

The project will take 62 days to complete (given there are no delays).

Critical Path

MATCH p = (:Activity {description:'Start'})-[:PRECEDES*]->(:Activity {description:'Finish'})
WITH p, REDUCE(x = 0, a IN NODES(p) | x + a.duration) AS cum_duration
ORDER BY cum_duration DESC
Loading graph...

The durations of the activities shown in this critical path, if increased, would increase the overall project completion time. The project manager now knows which activities on his timeline are most sensitive to delays.

Add EF, ES, LS, LF, & Slack Times to the Graph

These insightful properties can be added to the graph easily with Cypher, which (in my opinion) is infinitely better than manually typing functions into several Excel cells.

Set Earliest Finish Times

Recall: \(EF_j = ES_j + d_j\)

MATCH p = (:Activity {description:'Start'})-[:PRECEDES*]->(j:Activity)
WITH j, MAX(REDUCE(x = 0, a IN NODES(p) | x + a.duration)) AS ef
SET j.earliest_finish = ef

Set Earliest Start Times

Recall: \(ES_j = max(EF_i)\)

MATCH (i:Activity)-[:PRECEDES]->(j:Activity)
WITH j, MAX(i.earliest_finish) AS max_ef
SET j.earliest_start = max_ef

Update Finish Node

We already found the overall project completion time by finding the longest path, but this property is also captured as the earliest start time of the finish node:

MATCH (f:Activity {description:'Finish'})
RETURN f.earliest_start AS `Project Completion Time`
Loading table...

We need to update the properties of the Finish node according to the insight shown earlier before we \'move backward\' through the graph to find the LS and LF times:

MATCH (f:Activity {description:'Finish'})
SET f.earliest_finish = f.earliest_start, f.latest_start = f.earliest_start, f.latest_finish = f.earliest_start

Set Latest Start Times

Recall: \(LS_j = LF_j - d_j\)

MATCH p = (j:Activity)-[:PRECEDES*]->(f:Activity {description:'Finish'})
WITH j, MIN(REDUCE(x = f.earliest_start, a IN NODES(p) | x - a.duration)) AS ls
SET j.latest_start = ls

Set Latest Finish Times

Recall: \(LF_i = min(LS_j)\)

MATCH (i:Activity)-[:PRECEDES]->(j:Activity)
WITH i, MIN(j.latest_start) AS min_ls
SET i.latest_finish = min_ls

Set Slack Times

Recall: \(slack_j = LS_j - ES_j\)

MATCH (a:Activity)
SET a.slack = a.latest_start - a.earliest_start

View Updated Graph

Loading graph...

View ES, EF, LS, LF, & Slack Times

MATCH (a:Activity)
RETURN a.description AS Activity, a.earliest_start AS `Earliest Start Time`, a.earliest_finish AS `Earliest Finish Time`, a.latest_start AS `Latest Start Time`, a.latest_finish AS `Latest Finish Time`, a.slack AS Slack
Loading table...

The slack times tell the project manager how many days each activity can be delayed beyond its earliest start time before affecting the overall project completion time. The activities with 0 slack are on the critical path, as they cannot be delayed; compare these activities to the list of activities from query 5.

Answering Important Questions

No project will run smoothly, so the project manager will want to know how various setbacks will affect the bottom line. These questions can all be answered by looking at the slack of an activity.

If setting up the server is delayed by two days, how will this affect the overall project completion time?

MATCH (a:Activity {description:'Set up server'})
RETURN a.slack AS Slack
Loading table...

A delay of two days in setting up the server will have no effect on the project completion time, since setting up the server has a slack of five days.

The guy who’s supposed to come in and train the users calls and says he’ll arrive three days after he initially promised. How will this affect the overall project completion time?

MATCH (a:Activity {description:'Train users'})
RETURN a.slack AS Slack
Loading table...

The overall project completion time will increase from 62 to 63 days (62 + (3 - 2)) since the delay of three days exceeds the two-day slack afforded to training the users by one day.