Supply Chain (Pharma) Demo
Introduction
Supply chains are among the most complex and regulated systems in the world. They span raw material suppliers, manufacturers, logistics providers, and distributors — all tightly interwoven and dependent on reliable, traceable flows. Traditional systems often struggle to provide the visibility needed to manage disruptions, quality issues, or counterfeit risks. Linear reporting and siloed data make it difficult to trace product lineage or respond in real time.
With Neo4j, you can use a native graph database with a flexible graph model to map the full lifecycle of products, from raw materials to finished goods, and gain actionable insights into supply paths, dependencies, and vulnerabilities.
This setup guide shows how Neo4j can model and analyze pharmaceutical supply chains using a graph-native approach. You’ll explore key supply chain dimensions such as:
-
Supplier and distributor relationships
-
Batch traceability and genealogy
-
Demand back-propagation
-
Bottleneck and risk identification
-
Equipment utilization and optimization
You’ll also learn how to set up a Neo4j AuraDB instance, import a sample dataset, explore a supply chain graph model, and run queries to uncover structure, flow, and risk in your supply network.
Prerequisites
To run these examples, you will need the following:
Following the instructions in this demo will replace the data in your instance, so be sure to back up any data you do not want to lose; alternatively, you can create a fresh instance to use (recommended).
-
(Optional, but strongly recommended) a Python installation in which you can run Jupyter Notebooks. The queries for this demonstration are all provided in Notebooks; you can also copy-and-paste the queries into the Query or Explore tools in the Aura console.
-
(Optional, but recommended) git client software to download the demo assets.
-
(Optional) a local setup of Cypher Workbench, if you want to experiment with tools for editing the data model.
Setting Up the Database
1. Make sure you have a Neo4j AuraDB instance running. If you’re new to AuraDB, create an account at https://console.neo4j.io and click Create Instance.
Be sure to save your credentials, because you’ll need them to connect to your database later. Wait until the instance status shows “RUNNING” before moving to the next step.
2. Clone the git repository from GitHub. You can do this with the following command:
git clone https://github.com/neo4j-product-examples/demo-supply_chain.git
Alternatively, you can use the "Download ZIP" option on the GitHub repo to download a copy.
3. Use the .backup file to load the data into the database. From the “…” (three dots) menu in the Aura console, select Backup & Restore.

Use either the Browse button or drag-and-drop to locate the .backup file.
4. Review the warning about replacing your instance data and proceed when you are ready:

5. You are ready to run the examples when your database instance reaches the “RUNNING” state.
6. Ensure you have the following Python libraries installed:
-
python-dotenv
-
neo4j
-
neo4j-tools
-
neo4j-viz
You can install them using one of the following options:
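# Option 1: install the packages directly
pip install python-dotenv neo4j neo4j-tools neo4j-viz
# Option 2: install from the requirements file
pip install -r requirements.txt
# Option 3: install into a virtual environment
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
pip install -r requirements.txt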
7. In the src directory of your git working copy, make a copy of the sc_p.env.template file, name it “sc_p.env” and edit the file to include the URI and password for your AuraDB instance.
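As a rough illustration of what this step sets up, the sketch below parses KEY=VALUE lines like those in sc_p.env using only the standard library. The key names NEO4J_URI and NEO4J_PASSWORD and the sample values are assumptions made for illustration; check sc_p.env.template for the keys the notebooks actually expect. The notebooks presumably rely on python-dotenv for this, so treat the parser as a stand-in rather than the demo's own code.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical file contents; the real keys are defined in sc_p.env.template.
sample = """
# AuraDB connection settings
NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_PASSWORD="replace-me"
"""

settings = parse_env(sample)
# Report any connection setting that is absent or empty before connecting.
missing = [k for k in ("NEO4J_URI", "NEO4J_PASSWORD") if not settings.get(k)]
print("missing keys:", missing)
```

Checking for missing keys before opening a connection tends to give a clearer error than a failed login later in the notebook.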
Understanding the Graph Data Model

This graph model maps the full lifecycle of a pharmaceutical product by connecting key entities and their relationships. It captures the structure and flow of data needed to analyze production, traceability, and distribution in a connected, end-to-end view.
/* Run this query to view the schema or graph model of the dataset */
CALL db.schema.visualization();
Key Entities
-
Suppliers: Provide raw materials and active ingredients.
-
Raw Materials (RM): The base substances required for initial formulation.
-
Active Pharmaceutical Ingredients (APIs): Core biologically active components.
-
Product Stages: BULK → DP (Drug Product) → Finished Drug Product.
-
Batches: Tracked units of materials or products used throughout manufacturing.
-
Equipment: Machines and lines involved in each production step.
-
Distributors: Entities responsible for delivering finished drugs to pharmacies or healthcare providers.
Key Relationships
-
SUPPLIES_RM: Shows how suppliers provide raw materials to the manufacturing process.
-
PRODUCT_FLOW: Tracks how ingredients and intermediates move across production stages.
-
DISTRIBUTED_BY: Captures how finished products are handed off to downstream distributors.
What You Can Analyze
-
Dependency analysis: Trace upstream and downstream connections across suppliers, products, and materials.
-
Bottleneck detection: Surface slow points, single-source risks, and underutilized equipment.
-
Traceability and compliance: Map batch lineage for audit or quality control.
-
Supply chain visibility: Understand the complete path from raw inputs to distributed pharmaceuticals.
Use this model as the foundation for the queries in the following sections.
Dependency Chain

This view illustrates the full dependency chain for a given pharmaceutical product SKU. The query traces how a product flows through each stage of the supply chain — starting from Suppliers of raw materials, through Raw Materials (RM) and Active Pharmaceutical Ingredients (APIs), into various stages of Drug Product manufacturing (bulk → DP → finished goods), and finally to Distributors.
By visualizing the supply chain as a graph, you can:
-
Trace lineage of any product: Identify every input — suppliers, materials, intermediates — that contributes to a finished drug.
-
Reveal interdependencies: Understand which raw materials or suppliers are used across multiple product lines.
-
Detect vulnerabilities: Spot single points of failure, such as over-reliance on a specific supplier or bottlenecks in production.
-
Explore both upstream and downstream: View impact in either direction — e.g., how a supplier delay affects downstream distribution, or which suppliers are involved in a batch that failed quality control.
This graph-based approach enables real-time, intuitive exploration that would be difficult or impossible to replicate using traditional relational joins.
By focusing on a specific productSKU, this query helps supply chain teams pinpoint the exact flow of materials and relationships involved in that product’s manufacturing and distribution lifecycle.
/*
Finds the full supply chain path for a specific product SKU —
from supplier to distributor, across all production stages.
*/
MATCH path =
(sup:Suppliers) // Supplier
-[:SUPPLIES_RM]->
(rm:RM) // Raw Material
-[:PRODUCT_FLOW*]->
(prod:Product) // Product stages (API, DP, FG)
-[:DISTRIBUTED_BY]->
(dist:Distributor) // Distributor
WHERE prod.productSKU = '7e882292-ae98-45eb-8119-596b5d8b73e1' // Filter by SKU
RETURN
nodes(path) AS nodes,
relationships(path) AS relationships
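To make the variable-length traversal more concrete, here is a minimal pure-Python sketch that enumerates supplier-to-distributor chains over a toy adjacency list. The node names are made up and do not come from the demo dataset, and the sketch follows any outgoing relationship rather than enforcing the exact SUPPLIES_RM, PRODUCT_FLOW, DISTRIBUTED_BY order that the Cypher pattern requires.

```python
# Toy graph: node -> list of (relationship type, target node).
# All names are hypothetical placeholders.
EDGES = {
    "Supplier A": [("SUPPLIES_RM", "RM-1")],
    "Supplier B": [("SUPPLIES_RM", "RM-2")],
    "RM-1": [("PRODUCT_FLOW", "API-1")],
    "RM-2": [("PRODUCT_FLOW", "API-1")],
    "API-1": [("PRODUCT_FLOW", "FG-1")],
    "FG-1": [("DISTRIBUTED_BY", "Distributor EU")],
}


def chains(start, goal_prefix="Distributor", path=None):
    """Yield every path from start to a node whose name starts with goal_prefix."""
    path = (path or []) + [start]
    if start.startswith(goal_prefix):
        yield path
        return
    for _rel_type, target in EDGES.get(start, []):
        if target not in path:  # do not revisit a node already on this path
            yield from chains(target, goal_prefix, path)


all_chains = [c for s in ("Supplier A", "Supplier B") for c in chains(s)]
for chain in all_chains:
    print(" -> ".join(chain))
```

Both suppliers reach the same finished good through the shared API-1 node, which is the kind of interdependency the graph view is meant to surface.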
Understanding Raw Materials Demand
Identify Raw Materials with Limited Supplier Redundancy
This query surfaces Raw Materials (RMs) that are supplied by only one supplier, highlighting potential bottlenecks or risks due to lack of supplier redundancy.
It works by:
- Counting how many suppliers are connected to each RM
- Filtering to include only RMs where supplierCount = 1
This pattern helps stakeholders proactively detect single points of failure in the procurement pipeline — especially important in pharmaceutical manufacturing, where delays or shortages in a single raw material can disrupt the entire product flow.
/*
Finds Raw Materials (RMs) with only one supplier —
potential risk points due to lack of redundancy.
*/
MATCH (rm:RM)<-[:SUPPLIES_RM]-(sup:Suppliers) // Match each RM to its suppliers
WITH rm, COUNT(DISTINCT sup) AS supplierCount // Count distinct suppliers per RM
WHERE supplierCount = 1 // Keep only RMs with a single supplier
RETURN
rm.productSKU AS rawMaterialSKU,
rm.globalBrand AS rawMaterialName,
supplierCount // Should always be 1 in this case
ORDER BY supplierCount
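The same group-and-filter idea can be sketched in plain Python. The (supplier, raw material) pairs below are hypothetical stand-ins for SUPPLIES_RM relationships and are not taken from the demo dataset.

```python
from collections import defaultdict

# Hypothetical (supplier, raw material) pairs; note the repeated last pair.
supplies = [
    ("Supplier A", "RM-1"),
    ("Supplier B", "RM-1"),
    ("Supplier A", "RM-2"),
    ("Supplier C", "RM-3"),
    ("Supplier C", "RM-3"),
]

# Group suppliers per raw material; a set ignores repeated pairs.
suppliers_by_rm = defaultdict(set)
for supplier, rm in supplies:
    suppliers_by_rm[rm].add(supplier)

# Keep only raw materials with exactly one distinct supplier.
single_sourced = sorted(rm for rm, sups in suppliers_by_rm.items() if len(sups) == 1)
print(single_sourced)
```

Using a set means a duplicated relationship is not counted twice, so RM-3 is still reported as single-sourced.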
Map End-Point Demand to Drive Raw Material Planning
This query maps downstream distributor demand to specific product configurations and serves as the starting point for reverse-engineering raw material requirements. By analyzing demand from a target market (e.g., EU) for a specific drug like Calciiarottecarin (50mg Caplet, g2), teams can trace upstream dependencies — including APIs and raw materials — to inform accurate sourcing and production planning.
MATCH p = (dist:Distributor WHERE dist.market = "EU")
<-[db:DISTRIBUTED_BY]-
(prod:Product
WHERE prod.globalBrand = "Calciiarottecarin"
AND prod.strength = "50mg"
AND prod.form = "Caplet"
AND prod.generation = "g2")
<-[pf:PRODUCT_FLOW]-(:Product)
// Trace demand for a specific drug from a specific market
RETURN prod.globalBrand AS globalBrand,
dist.market AS market,
dist.location AS distributor,
prod.form AS form,
prod.strength AS strength,
prod.package AS package,
db.demandQty AS demandQty,
prod.productSKU AS productSKU
// Output key demand attributes used to drive raw material planning
ORDER BY demandQty DESC
Supply Chain Optimization
This section identifies critical risks and inefficiencies across the supply chain, such as shared-resource APIs, single-supplier bottlenecks, material requirements from distributor demand, and redundant or circular logistics paths.
Find APIs Used in Multiple Drug Products with Potential Supply Risk
This query identifies Active Pharmaceutical Ingredients (APIs) that are used across multiple Drug Products. APIs with broader usage may pose a supply risk — disruptions can affect several products at once, especially when demand overlaps across product lines. To highlight more critical cases, the query filters for APIs linked to more than 4 distinct Drug Products.
/*
Finds APIs used in more than 4 different Drug Products (DP),
indicating potential supply risk due to shared dependency.
*/
MATCH (api:API)-[:PRODUCT_FLOW]->(dp:DP) // Match API to its connected Drug Products
WITH api, COUNT(DISTINCT dp) AS productCount // Count the distinct DPs each API is used in
WHERE productCount > 4 // Focus on APIs used in multiple products
RETURN
api.productSKU AS apiSKU,
api.globalBrand AS apiName,
productCount // Number of associated Drug Products
ORDER BY productCount DESC
Flag APIs with High Impact and Low Redundancy
This query identifies critical supply chain vulnerabilities by focusing on Active Pharmaceutical Ingredients (APIs) that meet two high-risk conditions:
-
They are used in multiple Drug Products, signaling high reliance across the portfolio.
-
They are supplied by only one supplier, creating a potential single point of failure.
By combining these criteria, the query surfaces APIs with both broad product impact and low sourcing redundancy — a key risk indicator in pharmaceutical manufacturing.
This insight helps stakeholders proactively flag bottlenecks, strengthen sourcing strategies, and reduce the risk of production disruptions.
/*
Identifies APIs used in more than 4 Drug Products.
Flags those that are supplied by only one supplier — potential bottlenecks.
*/
MATCH (sup:Suppliers)-[:SUPPLIES_RM]->(rm:RM)
-[:PRODUCT_FLOW]->(api:API)
-[:PRODUCT_FLOW]->(dp:DP) // Trace path: Supplier → RM → API → Drug Product
WITH
api,
COUNT(DISTINCT dp) AS productCount, // How many distinct DPs use this API
COLLECT(DISTINCT dp) AS dpList, // Optional: full list of DPs
COUNT(DISTINCT sup) AS supplierCount, // How many unique suppliers supply this API
COLLECT(DISTINCT sup.companyName) AS supplierList
WHERE productCount > 4 // Filter: API must be used in more than 4 products
RETURN
api.productSKU AS apiSKU, api.globalBrand AS apiName,
productCount, supplierCount,supplierList,
CASE
WHEN supplierCount = 1
THEN 'Single Supplier Bottleneck!!' // Flag risky APIs
ELSE 'Multiple Suppliers Available'
END AS SupplierRisk
ORDER BY supplierCount ASC // Show most constrained APIs first
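Because the pattern yields one row per Supplier → RM → API → DP path, the number of rows for an API can be larger than the number of distinct Drug Products it feeds. The sketch below uses hypothetical rows (not data from the demo) to show the difference between counting rows and counting distinct values, and then applies the same single-supplier flag.

```python
# Hypothetical rows, one per matched path: (supplier, rm, api, dp).
rows = [
    ("Supplier A", "RM-1", "API-1", "DP-1"),
    ("Supplier A", "RM-2", "API-1", "DP-1"),
    ("Supplier A", "RM-1", "API-1", "DP-2"),
    ("Supplier A", "RM-2", "API-1", "DP-2"),
]


def summarize(rows, api):
    """Compare the raw row count with distinct product and supplier counts."""
    matched = [r for r in rows if r[2] == api]
    return {
        "row_count": len(matched),
        "product_count": len({dp for _, _, _, dp in matched}),
        "supplier_count": len({sup for sup, _, _, _ in matched}),
    }


summary = summarize(rows, "API-1")
risk = (
    "Single Supplier Bottleneck"
    if summary["supplier_count"] == 1
    else "Multiple Suppliers Available"
)
print(summary, risk)
```

Here two raw materials and two Drug Products produce four rows, yet only two distinct products and one supplier are involved, so a threshold applied to the row count would overstate the API's reach.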
How Does Distributor Demand Flow Back to Raw Material Requirements?
This query answers a key supply chain question: "What quantity of raw materials is needed to fulfill demand at the distributor level?"
It works by tracing demand backward through the supply chain — from the distributor, all the way to the raw materials and their suppliers.
Step-by-step breakdown
-
Start with a specific product being requested by a distributor (productSKU)
-
Capture the demand quantity from that distributor
-
Find the raw materials (RMs) used in making that product based on shared product attributes (e.g., brand, strength, form, generation)
-
Trace the shortest path from raw materials back to the product (through stages like API, BULK, DP, etc.)
-
Identify suppliers responsible for each raw material
-
Calculate how much raw material is required, factoring in conversion ratios at each step
/*
Trace demand from a distributor back to raw material quantity requirements.
*/
MATCH (d:Distributor)<-[db:DISTRIBUTED_BY]-(prod:Product)
WHERE prod.productSKU = '9a6b431f-3a38-4b45-9451-fbf39b2e2fd0' // Start with a specific product
MATCH (api)<-[pf2:PRODUCT_FLOW]-(rm:RM)
WHERE pf2.globalBrand = prod.globalBrand
AND pf2.strength = prod.strength
AND pf2.form = prod.form
AND pf2.generation = prod.generation // Match equivalent raw materials by product attributes
WITH d, db.demandQty AS demandQty, db, prod, COLLECT(DISTINCT rm) AS RMList
UNWIND RMList AS curRM // Process each matching raw material
MATCH p = shortestPath((prod)<-[pf1:PRODUCT_FLOW*]-(curRM)) // Trace shortest path from RM to product
MATCH p3 = (myProd:Product)<-[pf:PRODUCT_FLOW]-(curRM)<-[:SUPPLIES_RM]-(sup:Suppliers)
WHERE pf.globalBrand = prod.globalBrand
AND pf.strength = prod.strength
AND pf.form = prod.form
AND pf.generation = prod.generation
AND pf.market = d.market // Match supplier relationships in the same market
RETURN
sup.companyName AS supplierName,
curRM.productSKU AS rawMaterialSKU,
apoc.coll.disjunction(["Product"], labels(myProd))[0] AS usedBy, // Identify final product stage (API, DP, etc.)
demandQty,
REDUCE(rmQty = demandQty, rel IN relationships(p) |
TOINTEGER(ROUND(rmQty / (COALESCE(rel.conversionRatio, 1.0)), 0))
) AS rawMaterialQty
ORDER BY usedBy, supplierName, rawMaterialSKU
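The REDUCE expression at the end of the query divides the running quantity by each relationship's conversionRatio along the path, treating a missing ratio as 1.0 and rounding to a whole number at every step. A minimal Python sketch of that arithmetic follows; the demand quantity and ratios are invented for illustration, and rounding halves up is an assumption about how Cypher's round() with a precision argument behaves by default.

```python
import math
from functools import reduce


def raw_material_qty(demand_qty, conversion_ratios):
    """Back-propagate demand through a path of conversion ratios."""

    def step(qty, ratio):
        ratio = 1.0 if ratio is None else ratio  # like COALESCE(ratio, 1.0)
        return int(math.floor(qty / ratio + 0.5))  # round half up, then to int

    return reduce(step, conversion_ratios, demand_qty)


# Hypothetical path: two stages with a ratio, one stage without.
qty = raw_material_qty(1000, [0.5, 0.8, None])
print(qty)  # 1000 -> 2000 -> 2500 -> 2500
```

A ratio below 1.0 means more input is needed than output produced, so the required quantity grows as demand moves upstream.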
Detect Cross-Border or Cyclic Shipments in FG → DIST Flows
This query analyzes the movement of Finished Goods (FG) across the supply chain to detect potential shipment inefficiencies or logistical anomalies. It traces product flow from Finished Drug Products (FG) to Distribution (DIST) and highlights two key patterns:
Cross-Border Shipment
-
The shipment starts and ends in the same country
-
But passes through one or more different countries in between
-
Suggests unnecessary border crossings or inefficient routing ⚠️
Cyclic Movement
-
The shipment path forms a loop
-
It returns to the original location after multiple hops
-
Potential loop detected in product flow — indicates circular or redundant logistics ⚠️
Why It Matters
Identifying these patterns helps supply chain teams:
-
Reduce unnecessary shipping costs
-
Improve route efficiency
-
Mitigate risks tied to overly complex or non-optimal logistics networks
/*
Detects either Cross-Border or Cyclic shipment patterns
in Finished Goods (FG) to Distributor (DIST) product flows.
*/
MATCH (fg:FG:Product WHERE fg.globalBrand = "Calciiarottecarin") // Start with FG products for a specific brand
WHERE NOT EXISTS {
MATCH (fg)-[pf:PRODUCT_FLOW WHERE pf.globalBrand = "Calciiarottecarin"
AND pf.generation = "g2"
AND pf.form = "Caplet" // Filter: generation, form and strength of the drug
AND pf.strength = "50mg"
]->(:FG) // Ensure it's not part of a chained FG → FG flow
}
WITH fg, count(*) AS num
MATCH p = (fg)-[pf:PRODUCT_FLOW WHERE pf.globalBrand = "Calciiarottecarin"
AND pf.generation = "g2"
AND pf.form = "Caplet"
AND pf.strength = "50mg"
]->+ (dist:DIST:Product WHERE dist.globalBrand = fg.globalBrand
AND dist.generation = fg.generation
AND dist.strength = fg.strength
AND dist.form = fg.form)
WHERE EXISTS {
MATCH (dist)-[:DISTRIBUTED_BY]->(:Distributor) // Confirm product ends with a Distributor
}
WITH fg,
REDUCE(loc = [], x IN nodes(p)[0..-1] | loc + [split(x.location, "/")[1]]) AS countryList,
REDUCE(loc = [], x IN nodes(p)[0..-1] | loc + [x.location]) AS locationList,
REDUCE(loc = [], x IN nodes(p) | loc + [x.location]) AS fullLocList
WITH fg, countryList, locationList, fullLocList,
apoc.coll.dropDuplicateNeighbors(apoc.coll.sort(countryList)) AS dedupCountryList,
apoc.coll.dropDuplicateNeighbors(apoc.coll.sort(locationList)) AS dedupLocationList
WHERE
(countryList[0] = countryList[-1] AND size(dedupCountryList) > 1) // Cross-Border condition
OR
(locationList[0] = locationList[-1] AND size(dedupLocationList) > 1) // Cyclic condition
WITH fg, fullLocList,
CASE
WHEN countryList[0] = countryList[-1] AND size(dedupCountryList) > 1
THEN "Cross Border Shipment"
WHEN locationList[0] = locationList[-1] AND size(dedupLocationList) > 1
THEN "Cyclic Movement"
ELSE null
END AS costType
RETURN DISTINCT
costType,
fg.form AS form,
fg.generation AS gen,
fg.strength AS strength,
fullLocList AS LocationPath
ORDER BY fg.generation, fg.strength
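The two conditions in the final WHERE and CASE clauses can be restated as a small Python function. The location strings are hypothetical; the only assumption carried over from the query is that the second "/"-separated segment of location is the country, as implied by split(x.location, "/")[1].

```python
def classify(path_locations):
    """Classify one FG -> DIST path given the location of each node in order."""
    inner = path_locations[:-1]  # mirrors nodes(p)[0..-1]: last node excluded
    countries = [loc.split("/")[1] for loc in inner]
    # Same country at both ends, but another country appears along the way.
    if countries[0] == countries[-1] and len(set(countries)) > 1:
        return "Cross Border Shipment"
    # Same location at both ends, with at least one other location in between.
    if inner[0] == inner[-1] and len(set(inner)) > 1:
        return "Cyclic Movement"
    return None


# Hypothetical paths in "region/country/site" form.
cross = classify(["EU/DE/Berlin", "EU/FR/Paris", "EU/DE/Munich", "EU/DE/Hamburg"])
cyclic = classify(["EU/DE/Berlin", "EU/DE/Munich", "EU/DE/Berlin", "EU/DE/Hamburg"])
direct = classify(["EU/DE/Berlin", "EU/DE/Hamburg"])
print(cross, cyclic, direct)
```

As in the CASE expression, the cross-border check runs first, so a path that satisfies both conditions is labeled as a cross-border shipment.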
You can load a full set of pre-saved Cypher queries into the Neo4j Aura Query workspace.
Dashboards (using NeoDash)
Neo4j Dashboards provide an interactive view of pharmaceutical supply chains, helping leaders explore critical areas like demand, bottlenecks, traceability, and equipment usage—all in one place.
Prep work
-
Go to https://neodash.graphapp.io/ and click on New Dashboard
-
Create the New Dashboard.
-
Connect to the database created in Step 1: Database setup
-
Click the left arrow at the bottom to expand the left pane
-
Click on the + button and import the JSON file located in the src folder of the GitHub repository.
-
Direct link to the file: dashboard-supplychain.json
-

-
You should see the Dashboard

Dashboard Tabs
The Supply Chain tab offers a high-level view of global brands, markets, and distribution. For example, selecting the drug Calciiarottecarin (50mg caplet) shows enriched demand in the EU market, with West Europe as the top distributor. From here, you can drill down into its full product flow to assess upstream dependencies and potential risks.
Each tab focuses on a key dimension:
-
The RM Demand tab calculates how much raw material is needed to fulfill demand for a selected product. It traces supply chain paths and aggregates quantities instantly—making complex demand propagation simple and scalable.
-
The SC Optimization tab helps identify costly shipping patterns and delays in processing. It surfaces cross-border inefficiencies and highlights stages that exceed target durations—so teams can quickly pinpoint and address bottlenecks.
-
The Batch Traceability tab helps trace defective batches back through the supply chain—revealing shared equipment, operators, and potential contamination points. It combines Neo4j’s rich relationship modeling with GenAI to highlight commonalities and root causes for fast, explainable investigation.
-
The Equipment Utilization tab highlights underused equipment across production. It helps identify rescheduling opportunities to boost usage, avoid unnecessary procurement, and plan maintenance—thanks to Neo4j’s flexible schema for modeling process and equipment sequences.