Globalization and outsourcing have been the main drivers of the growing complexity of supply chains. At the same time, natural catastrophes as well as economic, social and ethical concerns have made it more important to understand the entire supply chain rather than just your direct suppliers and distributors.

To model and understand them better, more and more sources describe modern supply chains as supply networks. That makes them a natural setting for exploring how Neo4j can help us master the supply chain network challenge.

Initial Data Model

For the sake of simplicity, every node carries the same set of attributes, among them lat (latitude) and lon (longitude).

Figure 1. Initial Data Model

We categorize our suppliers into RawSupplierA and SupplierA for fresh products and RawSupplierB and SupplierB for durable commodities. The rest is straightforward: distribution runs through wholesalers and retailers.

Figure 2. Distribution through Wholesaler and Retailer

Supply chains are inherently complex and can be modeled and clustered in several different ways. To keep the example easy to follow, we use a simple model and leave out many details that would be essential in a real-world application.

A real-world example

This supply chain could serve as a model of a soft drink supply chain. Every participant in the chain stands for a sample commodity or entity.

Figure 3. A real-world example

Challenges of supply chain management

The biggest challenge in supply chain management is the inherent complexity of modern supply chains. Therefore, none of these examples is sufficient on its own for decision-making. Many additional questions would have to be answered, and they are beyond the scope of this small scenario.

Connect the Graph

Add the distance between every Supplier

The distance added between connected nodes is calculated from their latitude and longitude.
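The text does not say which formula is used for this calculation. As a minimal sketch, assuming the haversine (great-circle) formula and a mean Earth radius of 6371 km, the distance between two nodes could be computed from their lat and lon values like this. The function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 so rounding errors cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

The result is a straight-line distance over the globe, so it underestimates real road or shipping distances.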

Case: Find best Wholesaler

Let’s start off with a good old transportation problem: find the Wholesaler with the smallest accumulated distance to all retailers. With Cypher, this takes only a few lines.

MATCH    (p:Product)-[r1]->(w)-[r2]->(re:Retailer)
WITH     distinct(substring(, 10)) AS Num,
         toInteger(avg( + AS Average_Distance,
         toInteger(sum( + AS Total_Distance
RETURN   "Wholesaler" + Num AS Wholesaler, Total_Distance, Average_Distance
ORDER BY Total_Distance
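To make the aggregation explicit, here is a small Python sketch of the same idea. It assumes that the value summed per path is the distance of the two legs (product to wholesaler, wholesaler to retailer). The sample routes and the name `rank_wholesalers` are hypothetical and not taken from the graph.

```python
from collections import defaultdict

# Hypothetical sample data:
# (wholesaler, retailer, km product->wholesaler, km wholesaler->retailer)
routes = [
    ("Wholesaler1", "Retailer1", 120.0, 80.0),
    ("Wholesaler1", "Retailer2", 120.0, 200.0),
    ("Wholesaler2", "Retailer1", 300.0, 40.0),
    ("Wholesaler2", "Retailer2", 300.0, 60.0),
]


def rank_wholesalers(routes):
    """Return (wholesaler, total_km, average_km) rows, smallest total first."""
    path_km = defaultdict(list)
    for wholesaler, _retailer, km_in, km_out in routes:
        path_km[wholesaler].append(km_in + km_out)  # one entry per path
    rows = [(w, sum(d), sum(d) / len(d)) for w, d in path_km.items()]
    return sorted(rows, key=lambda row: row[1])


ranking = rank_wholesalers(routes)
```

The first row of `ranking` is the wholesaler with the smallest accumulated distance, which mirrors the ORDER BY Total_Distance in the query.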

Case: We want it fresh

Let’s assume we want to guarantee that all fresh ingredients in our drink are no older than seven days.

MATCH chain=(rs:RawSupplierA)-[*]->(re:Retailer)
WITH reduce(wait = 0, s IN nodes(chain)| wait + s.time) AS waitTime, chain
WHERE waitTime < 8
WITH [n IN nodes(chain)|] AS SupplyChain, waitTime
ORDER BY SupplyChain[1]
RETURN SupplyChain, waitTime

HINT: Almost all values are randomly generated. If the table is empty, simply reload the graph.

Case: It’s time to put it together

Let’s assume we want a product that is not only fresh but also local. We therefore want to make sure that our product takes less than 8 days and travels less than 23,000 km from the farmer to the shelf in a grocery store.

MATCH    chain=(rs:RawSupplierA)-[*]->(re:Retailer)
WITH     reduce(wait = 0, s IN nodes(chain)| wait + s.time) AS waitTime, chain
WHERE    waitTime < 8
WITH     reduce(dist = 0, s IN relationships(chain)| dist + AS distance, waitTime, chain
WHERE    distance < 23000
WITH     [n IN nodes(chain)|] AS SupplyChain
RETURN   collect(distinct(SupplyChain[1])) AS Supplier, collect(distinct(SupplyChain[0])) AS RawSupplier

Here we want to know which RawSuppliers and Suppliers can keep this promise. Of course, to actually fulfill the promise of being local, we would have to specify the exact path through the network.
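The two-step filter can be sketched outside the database as well. The following Python example enumerates every path from a raw supplier to a retailer, sums the waiting time over the nodes and the distance over the edges, and keeps only the paths below both limits. The network, its values and the function names are hypothetical, and the graph is assumed to be acyclic.

```python
# Hypothetical sample network: waiting time per node (days), distance per edge (km).
wait_days = {
    "RawSupplierA1": 1, "SupplierA1": 2, "SupplierA2": 5, "Product": 1,
    "Wholesaler1": 1, "Wholesaler2": 2, "Retailer1": 1,
}
edges_km = {
    "RawSupplierA1": {"SupplierA1": 500, "SupplierA2": 300},
    "SupplierA1": {"Product": 1500},
    "SupplierA2": {"Product": 700},
    "Product": {"Wholesaler1": 8000, "Wholesaler2": 2000},
    "Wholesaler1": {"Retailer1": 20000},
    "Wholesaler2": {"Retailer1": 1000},
    "Retailer1": {},
}


def all_paths(graph, start, goal, path=None):
    """Yield every directed path from start to goal (graph assumed acyclic)."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in graph[start]:
        yield from all_paths(graph, nxt, goal, path)


def feasible_chains(graph, wait, start, goal, max_days=8, max_km=23000):
    """Keep paths whose total wait and total distance are below the limits."""
    result = []
    for path in all_paths(graph, start, goal):
        total_days = sum(wait[node] for node in path)
        total_km = sum(graph[a][b] for a, b in zip(path, path[1:]))
        if total_days < max_days and total_km < max_km:
            result.append((path, total_days, total_km))
    return result
```

Calling `feasible_chains(edges_km, wait_days, "RawSupplierA1", "Retailer1")` on this sample data returns only the chains that are both fresh and local, including the exact path each one takes.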

Case: Find the Top "Sample" Supply Chain within the Supply Chain Network

We define a 'sample' supply chain as one that has exactly one participant for every processing step in the supply chain. 'Top' simply means the chain with the best rating. Please keep in mind that we isolate and rate every 'sample' supply chain separately and don’t evaluate the entire supply chain network at once. We compare every possible supply chain in terms of cost, time and waste. The comparison is based on a weighted score.

Total score = 0.6 × cost + 0.2 × waste + 0.2 × time

The total score can be used as a KPI. It eases complex decision-making and allows a quick comparison of values that differ in nature. It can also be used to examine individual members of the supply chain, turning the measurements into tangible goals for improving these members or for monitoring the entire supply chain. The total score also comes in handy if we want to reduce the number of our (raw) suppliers and retain only the top performers.
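As a worked example of the weighting, the following Python sketch scores two hypothetical chains. It assumes that every participant carries a cost, waste and time rating and that a lower rating is better. The weights are kept as integer tenths so that the sums stay exact until the final division.

```python
# Weights in tenths: cost 60%, waste 20%, time 20%.
WEIGHTS_TENTHS = {"cost": 6, "waste": 2, "time": 2}


def chain_score(chain):
    """Sum the weighted ratings of every participant in one sample chain."""
    tenths = sum(
        WEIGHTS_TENTHS[key] * node[key]
        for node in chain
        for key in WEIGHTS_TENTHS
    )
    return tenths / 10


# Hypothetical ratings for two sample chains with two participants each.
chain_a = [{"cost": 5, "waste": 2, "time": 3}, {"cost": 4, "waste": 1, "time": 2}]
chain_b = [{"cost": 2, "waste": 3, "time": 1}, {"cost": 3, "waste": 2, "time": 2}]

# Lower is better under the assumption stated above.
best = min([chain_a, chain_b], key=chain_score)
```

For chain_a the weighted sum is (30 + 4 + 6 + 24 + 2 + 4) / 10 = 7.0 and for chain_b it is (12 + 6 + 2 + 18 + 4 + 4) / 10 = 4.6, so chain_b ranks first.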

MATCH supplier_chainA=(p:Product)<--(:SupplierA)<--(rsA:RawSupplierA)
MATCH supplier_chainB=(rsB:RawSupplierB)-->(:SupplierB)-->(p)
MATCH retailer_chain=(p)-->(:Wholesaler)-->(re:Retailer)
RETURN
    round(reduce(wait = 0, s IN nodes(supplier_chainA)| wait + 2*s.timeR/10 + 6*s.costR/10 + 2*s.wasteR/10) +
    reduce(wait = 0, s IN nodes(supplier_chainB)| wait + 2*s.timeR/10 + 6*s.costR/10 + 2*s.wasteR/10) +
    reduce(wait = 0, s IN nodes(retailer_chain)| wait + 2*s.timeR/10 + 6*s.costR/10 + 2*s.wasteR/10)) AS totalScore,
    [n IN nodes(supplier_chainA)|] + [n IN nodes(supplier_chainB)|] + [n IN nodes(retailer_chain)|] AS SupplyChain
ORDER BY totalScore ASC


  • Because supply chains inherently have a graph or network structure, graph databases are well suited to monitoring, maintaining and modeling supply chain problems, e.g. risk management, the bullwhip effect, transport optimization and quality assurance.

  • In combination with RFID chips and cloud computing, graph database technology offers a broad variety of applications for real-time monitoring and process improvement.

For ideas, critique or questions, feel free to contact me on LinkedIn: