Supply Chain (Pharma) Demo

Introduction

Supply chains are among the most complex and regulated systems in the world. They span raw material suppliers, manufacturers, logistics providers, and distributors — all tightly interwoven and dependent on reliable, traceable flows. Traditional systems often struggle to provide the visibility needed to manage disruptions, quality issues, or counterfeit risks. Linear reporting and siloed data make it difficult to trace product lineage or respond in real time.

With Neo4j, you can use a native graph database with a flexible graph model to map the full lifecycle of products, from raw materials to finished goods, and gain actionable insights into supply paths, dependencies, and vulnerabilities.

This setup guide shows how Neo4j can model and analyze pharmaceutical supply chains using a graph-native approach. You’ll explore key supply chain dimensions such as:

Supplier and distributor relationships
Batch traceability and genealogy
Demand back-propagation
Bottleneck and risk identification
Equipment utilization and optimization

You’ll also learn how to set up a Neo4j AuraDB instance, import a sample dataset, explore a supply chain graph model, and run queries to uncover structure, flow, and risk in your supply network.

Prerequisites
Setting Up the Database
The Graph Data Model
Dependency Chain
Understanding Raw Materials Demand
Supply Chain Optimization
Dashboards (NeoDash)
Resources

Prerequisites

To run these examples, you will need the following:

A Neo4j AuraDB database instance. These examples run on any tier, including Trial and paid tiers such as Pro, Business Critical, and VDC. You can sign up for AuraDB here.

Following the instructions in this demo will replace the data in your instance, so be sure to back up any data you do not want to lose; alternatively, you can create a fresh instance to use (recommended).

(Optional, but strongly recommended) a Python installation in which you can run Jupyter Notebooks. The queries for this demonstration are all provided in Notebooks; you can also copy-and-paste the queries into the Query or Explore tools in the Aura console.
(Optional, but recommended) git client software to download the demo assets.
(Optional) a local setup of Cypher Workbench, if you want to experiment with tools for editing the data model.

Setting Up the Database

1. Make sure you have a Neo4j AuraDB instance running. If you’re new to AuraDB, create an account at https://console.neo4j.io and click Create Instance.

Be sure to save your credentials — you’ll need them to connect to your database later. Wait until the instance status shows “RUNNING” before moving to the next step.

2. Clone the git repository from GitHub. You can do this with the following command:

git clone https://github.com/neo4j-product-examples/demo-supply_chain.git

Alternatively, you can use the "Download ZIP" option on the GitHub repo to download a copy.

3. Use the .backup file to load data into the database. From the three-dot (…) menu in the Aura console, select Backup & Restore.

Use either the Browse button or drag-and-drop to locate the .backup file. You can find it in the dump directory of the repo you cloned in step 2, or download it here.

4. Review the warning about replacing your instance data and proceed when you are ready.

5. You are ready to run the examples when your database instance reaches the “RUNNING” state.

6. Ensure you have the following Python libraries installed:

python-dotenv
neo4j
neo4j-tools
neo4j-viz

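You can install them using one of the following options:

Option 1: install the packages directly.

pip install python-dotenv neo4j neo4j-tools neo4j-viz

Option 2: install from requirements.txt.

pip install -r requirements.txt

Option 3: create a virtual environment first, then install from requirements.txt.

python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate
pip install -r requirements.txt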

7. In the src directory of your git working copy, make a copy of the sc_p.env.template file, name it “sc_p.env” and edit the file to include the URI and password for your AuraDB instance.
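
As a rough illustration of step 7, the sketch below reads a KEY=VALUE settings file and opens a driver connection. The key names NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD are assumptions, so check sc_p.env.template for the names the notebooks actually expect. The parser uses only the standard library for clarity; the notebooks may load the file with python-dotenv instead.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def connect(env_path: str = "sc_p.env"):
    """Open a driver from the settings file (key names are hypothetical)."""
    # Imported lazily so the parser above works without the driver installed
    from neo4j import GraphDatabase

    settings = parse_env(Path(env_path).read_text())
    driver = GraphDatabase.driver(
        settings["NEO4J_URI"],
        auth=(settings.get("NEO4J_USERNAME", "neo4j"), settings["NEO4J_PASSWORD"]),
    )
    driver.verify_connectivity()  # raises if the URI or credentials are wrong
    return driver
```

Calling connect() from the src directory would then return a driver that the notebook queries can share, provided the file uses the assumed key names.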

Understanding the Graph Data Model

This graph model maps the full lifecycle of a pharmaceutical product by connecting key entities and their relationships. It captures the structure and flow of data needed to analyze production, traceability, and distribution in a connected, end-to-end view.

/* Run this query to view the schema or graph model of the dataset */
CALL db.schema.visualization();

Key Entities

Suppliers: Provide raw materials and active ingredients.
Raw Materials (RM): The base substances required for initial formulation.
Active Pharmaceutical Ingredients (APIs): Core biologically active components.
Product Stages: BULK → DP (Drug Product) → Finished Drug Product.
Batches: Tracked units of materials or products used throughout manufacturing.
Equipment: Machines and lines involved in each production step.
Distributors: Entities responsible for delivering finished drugs to pharmacies or healthcare providers.

Key Relationships

SUPPLIES_RM: Shows how suppliers provide raw materials to the manufacturing process.
PRODUCT_FLOW: Tracks how ingredients and intermediates move across production stages.
DISTRIBUTED_BY: Captures how finished products are handed off to downstream distributors.

What You Can Analyze

Dependency analysis: Trace upstream and downstream connections across suppliers, products, and materials.
Bottleneck detection: Surface slow points, single-source risks, and underutilized equipment.
Traceability and compliance: Map batch lineage for audit or quality control.
Supply chain visibility: Understand the complete path from raw inputs to distributed pharmaceuticals.

Use this model as the foundation for queries to explore:

Which suppliers support critical APIs?
What happens if a batch is recalled?
Where are single points of failure?
How much raw material is needed to fulfill distributor demand?
Which APIs are used across multiple drug products?
Do any high-impact APIs rely on a single supplier?
Are there redundant or cross-border shipment paths?
Can we trace a product’s full lineage from distributor to supplier?

Dependency Chain

This view illustrates the full dependency chain for a given pharmaceutical product SKU. The query traces how a product flows through each stage of the supply chain — starting from Suppliers of raw materials, through Raw Materials (RM) and Active Pharmaceutical Ingredients (APIs), into various stages of Drug Product manufacturing (bulk → DP → finished goods), and finally to Distributors.

By visualizing the supply chain as a graph, you can:

Trace lineage of any product: Identify every input — suppliers, materials, intermediates — that contributes to a finished drug.
Reveal interdependencies: Understand which raw materials or suppliers are used across multiple product lines.
Detect vulnerabilities: Spot single points of failure, such as over-reliance on a specific supplier or bottlenecks in production.
Explore both upstream and downstream: View impact in either direction — e.g., how a supplier delay affects downstream distribution, or which suppliers are involved in a batch that failed quality control.

This graph-based approach enables real-time, intuitive exploration that would be difficult or impossible to replicate using traditional relational joins.

By focusing on a specific productSKU, this query helps supply chain teams pinpoint the exact flow of materials and relationships involved in that product’s manufacturing and distribution lifecycle.

/*
  Finds the full supply chain path for a specific product SKU —
  from supplier to distributor, across all production stages.
*/
MATCH path =
  (sup:Suppliers)            // Supplier
    -[:SUPPLIES_RM]->
  (rm:RM)                    // Raw Material
    -[:PRODUCT_FLOW*]->
  (prod:Product)             // Product stages (API, DP, FG)
    -[:DISTRIBUTED_BY]->
  (dist:Distributor)         // Distributor
WHERE prod.productSKU = '7e882292-ae98-45eb-8119-596b5d8b73e1'  // Filter by SKU
RETURN
  nodes(path) AS nodes,
  relationships(path) AS relationships

Understanding Raw Materials Demand

Identify Raw Materials with Limited Supplier Redundancy

This query surfaces Raw Materials (RMs) that are supplied by only one supplier, highlighting potential bottlenecks or risks due to lack of supplier redundancy.

It works by:

Counting how many suppliers are connected to each RM
Filtering to include only RMs where supplierCount = 1

This pattern helps stakeholders proactively detect single points of failure in the procurement pipeline — especially important in pharmaceutical manufacturing, where delays or shortages in a single raw material can disrupt the entire product flow.

This query showcases a dependency pattern focused on Supplier → RM relationships, enabling smarter sourcing strategies and improved supply chain resilience.

/*
  Finds Raw Materials (RMs) with only one supplier —
  potential risk points due to lack of redundancy.
*/
MATCH (rm:RM)<-[:SUPPLIES_RM]-(sup:Suppliers)  // Match each RM to its suppliers
WITH rm, COUNT(DISTINCT sup) AS supplierCount   // Count distinct suppliers per RM
WHERE supplierCount = 1                         // Keep only RMs with a single supplier

RETURN
  rm.productSKU AS rawMaterialSKU,
  rm.globalBrand AS rawMaterialName,
  supplierCount                                 // Should always be 1 in this case
ORDER BY rawMaterialName, rawMaterialSKU
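
The count-and-filter logic can also be sketched outside the database. The version below is an illustration only: it counts distinct suppliers per raw material, and the supplier and raw-material names are invented sample data, not values from the demo dataset.

```python
from collections import defaultdict


def single_source_materials(supplies):
    """Return raw materials that have exactly one distinct supplier.

    `supplies` is an iterable of (supplier, raw_material) pairs, mirroring
    the (:Suppliers)-[:SUPPLIES_RM]->(:RM) relationships.
    """
    suppliers_by_rm = defaultdict(set)
    for supplier, raw_material in supplies:
        suppliers_by_rm[raw_material].add(supplier)  # a set ignores duplicate edges
    return sorted(rm for rm, sups in suppliers_by_rm.items() if len(sups) == 1)


# Hypothetical sample data
edges = [
    ("Supplier A", "RM-1"),
    ("Supplier B", "RM-1"),
    ("Supplier A", "RM-2"),
    ("Supplier C", "RM-3"),
    ("Supplier C", "RM-3"),  # duplicate edge, still one supplier
]
print(single_source_materials(edges))  # ['RM-2', 'RM-3']
```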

Map End-Point Demand to Drive Raw Material Planning

This query maps downstream distributor demand to specific product configurations and serves as the starting point for reverse-engineering raw material requirements. By analyzing demand from a target market (e.g., EU) for a specific drug like Calciiarottecarin (50mg Caplet, g2), teams can trace upstream dependencies — including APIs and raw materials — to inform accurate sourcing and production planning.

MATCH p = (dist:Distributor WHERE dist.market = "EU")
          <-[db:DISTRIBUTED_BY]-
          (prod:Product
            WHERE prod.globalBrand = "Calciiarottecarin"
              AND prod.strength = "50mg"
              AND prod.form = "Caplet"
              AND prod.generation = "g2")
          <-[pf:PRODUCT_FLOW]-(:Product)
// Trace demand for a specific drug from a specific market

RETURN prod.globalBrand AS globalBrand,
       dist.market AS market,
       dist.location AS distributor,
       prod.form AS form,
       prod.strength AS strength,
       prod.package AS package,
       db.demandQty AS demandQty,
       prod.productSKU AS productSKU
// Output key demand attributes used to drive raw material planning

ORDER BY demandQty DESC

Supply Chain Optimization

This section identifies critical risks and inefficiencies across the supply chain, such as shared-resource APIs, single-supplier bottlenecks, material requirements from distributor demand, and redundant or circular logistics paths.

Find APIs Used in Multiple Drug Products with Potential Supply Risk

This query identifies Active Pharmaceutical Ingredients (APIs) that are used across multiple Drug Products. APIs with broader usage may pose a supply risk — disruptions can affect several products at once, especially when demand overlaps across product lines. To highlight more critical cases, the query filters for APIs linked to more than 4 distinct Drug Products.

This pattern highlights a Drug Product ↔ API dependency, helping teams identify shared-resource risks and prioritize mitigation strategies.

/*
  Finds APIs used in more than 4 different Drug Products (DP),
  indicating potential supply risk due to shared dependency.
*/
MATCH (api:API)-[:PRODUCT_FLOW]->(dp:DP)     // Match API to its connected Drug Products
WITH api, COUNT(DISTINCT dp) AS productCount  // Count how many distinct DPs each API is used in
WHERE productCount > 4                        // Focus on APIs used in multiple products

RETURN
  api.productSKU AS apiSKU,
  api.globalBrand AS apiName,
  productCount                                // Number of associated Drug Products

ORDER BY productCount DESC

Flag APIs with High Impact and Low Redundancy

This query identifies critical supply chain vulnerabilities by focusing on Active Pharmaceutical Ingredients (APIs) that meet two high-risk conditions:

They are used in multiple Drug Products, signaling high reliance across the portfolio.
They are supplied by only one supplier, creating a potential single point of failure.

By combining these criteria, the query surfaces APIs with both broad product impact and low sourcing redundancy — a key risk indicator in pharmaceutical manufacturing.

This insight helps stakeholders proactively flag bottlenecks, strengthen sourcing strategies, and reduce the risk of production disruptions.

This query showcases a dependency pattern centered on API ↔ Drug Product and Supplier ↔ API relationships.

/*
  Identifies APIs used in more than 4 Drug Products.
  Flags those that are supplied by only one supplier — potential bottlenecks.
*/
MATCH (sup:Suppliers)-[:SUPPLIES_RM]->(rm:RM)
      -[:PRODUCT_FLOW]->(api:API)
      -[:PRODUCT_FLOW]->(dp:DP)                // Trace path: Supplier → RM → API → Drug Product

WITH
  api,
  COUNT(DISTINCT dp) AS productCount,          // How many distinct DPs use this API
  COLLECT(DISTINCT dp) AS dpList,              // Optional: full list of DPs
  COUNT(DISTINCT sup) AS supplierCount,        // How many unique suppliers supply this API
  COLLECT(DISTINCT sup.companyName) AS supplierList

WHERE productCount > 4                         // Filter: API must be used in more than 4 products

RETURN
  api.productSKU AS apiSKU, api.globalBrand AS apiName,
  productCount, supplierCount, supplierList,
  CASE
    WHEN supplierCount = 1
    THEN 'Single Supplier Bottleneck!!'     // Flag risky APIs
    ELSE 'Multiple Suppliers Available'
  END AS SupplierRisk

ORDER BY supplierCount ASC                     // Show most constrained APIs first
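
Because each Supplier → RM → API → DP path produces one result row, the same Drug Product can appear several times for a single API. The sketch below, which uses invented identifiers rather than demo data, shows the two criteria applied with distinct counts so that repeated paths do not inflate the product total. The function name and the tuple layout are illustrative assumptions.

```python
from collections import defaultdict


def flag_api_risk(paths, min_products=4):
    """Summarise APIs used in more than `min_products` distinct drug products.

    `paths` holds (supplier, raw_material, api, drug_product) tuples, one per
    Supplier -> RM -> API -> DP path.
    """
    products = defaultdict(set)
    suppliers = defaultdict(set)
    for supplier, _rm, api, drug_product in paths:
        products[api].add(drug_product)
        suppliers[api].add(supplier)

    report = []
    for api, dps in products.items():
        if len(dps) > min_products:
            n_sup = len(suppliers[api])
            risk = "single supplier" if n_sup == 1 else "multiple suppliers"
            report.append((api, len(dps), n_sup, risk))
    return sorted(report, key=lambda row: row[2])  # most constrained first


# Hypothetical data: API-X has 10 paths but 5 distinct products and 1 supplier;
# API-Y has 6 paths but only 3 distinct products, so it is not reported.
paths = [("S1", rm, "API-X", f"DP-{i}") for rm in ("RM-1", "RM-2") for i in range(5)]
paths += [(s, "RM-3", "API-Y", f"DP-{i}") for s in ("S2", "S3") for i in range(3)]
print(flag_api_risk(paths))  # [('API-X', 5, 1, 'single supplier')]
```

Counting rows instead of distinct products would have let API-Y pass the threshold with a count of 6, which is why the distinct count matters here.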

How Does Distributor Demand Flow Back to Raw Material Requirements?

This query answers a key supply chain question: "What quantity of raw materials is needed to fulfill demand at the distributor level?"

It works by tracing demand backward through the supply chain — from the distributor, all the way to the raw materials and their suppliers.

Step-by-step breakdown

Start with a specific product being requested by a distributor (productSKU)
Capture the demand quantity from that distributor
Find the raw materials (RMs) used in making that product based on shared product attributes (e.g., brand, strength, form, generation)
Trace the shortest path from raw materials back to the product (through stages like API, BULK, DP, etc.)
Identify suppliers responsible for each raw material
Calculate how much raw material is required, factoring in conversion ratios at each step

The query uses a shortestPath pattern to model reverse flow and compute material requirements from end-point demand.

/*
  Trace demand from a distributor back to raw material quantity requirements.
*/
MATCH (d:Distributor)<-[db:DISTRIBUTED_BY]-(prod:Product)
WHERE prod.productSKU = '9a6b431f-3a38-4b45-9451-fbf39b2e2fd0'  // Start with a specific product

MATCH (api)<-[pf2:PRODUCT_FLOW]-(rm:RM)
WHERE pf2.globalBrand = prod.globalBrand
  AND pf2.strength = prod.strength
  AND pf2.form = prod.form
  AND pf2.generation = prod.generation                 // Match equivalent raw materials by product attributes

WITH d, db.demandQty AS demandQty, db, prod, COLLECT(DISTINCT rm) AS RMList
UNWIND RMList AS curRM                                 // Process each matching raw material

MATCH p = shortestPath((prod)<-[pf1:PRODUCT_FLOW*]-(curRM))  // Trace shortest path from RM to product

MATCH p3 = (myProd:Product)<-[pf:PRODUCT_FLOW]-(curRM)<-[:SUPPLIES_RM]-(sup:Suppliers)
WHERE pf.globalBrand = prod.globalBrand
  AND pf.strength = prod.strength
  AND pf.form = prod.form
  AND pf.generation = prod.generation
  AND pf.market = d.market                              // Match supplier relationships in the same market

RETURN
  sup.companyName AS supplierName,
  curRM.productSKU AS rawMaterialSKU,
  apoc.coll.disjunction(["Product"], labels(myProd))[0] AS usedBy, // Identify final product stage (API, DP, etc.)
  demandQty,
  REDUCE(rmQty = demandQty, rel IN relationships(p) |
    TOINTEGER(ROUND(rmQty / (COALESCE(rel.conversionRatio, 1.0)), 0))
  ) AS rawMaterialQty

ORDER BY usedBy, supplierName, rawMaterialSKU
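
The REDUCE expression in the query is the arithmetic core of the back-propagation: starting from the demand quantity, it divides by the conversion ratio on each hop and rounds at every step. A minimal Python sketch of that calculation follows. The numbers are invented, and the helper rounds halves upward, which is an assumption about how ROUND behaves in the query.

```python
from functools import reduce
import math


def round_half_up(value: float) -> int:
    """Round a non-negative value to the nearest integer, halves upward."""
    return math.floor(value + 0.5)


def raw_material_qty(demand_qty, conversion_ratios):
    """Walk from the product back to the raw material, dividing by each ratio.

    `conversion_ratios` lists the ratio on each PRODUCT_FLOW hop, ordered from
    the product end of the path. None stands for a missing ratio and is
    treated as 1.0, as COALESCE does in the query.
    """
    return reduce(
        lambda qty, ratio: round_half_up(qty / (ratio if ratio is not None else 1.0)),
        conversion_ratios,
        demand_qty,
    )


# Hypothetical numbers: 1,000 units demanded, three hops back to the raw material
print(raw_material_qty(1000, [0.5, None, 0.8]))  # 2500
```

Rounding on every hop, as the query does, can give a slightly different result from rounding once at the end when ratios do not divide evenly.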

Detect Cross-Border or Cyclic Shipments in FG → DIST Flows

This query analyzes the movement of Finished Goods (FG) across the supply chain to detect potential shipment inefficiencies or logistical anomalies. It traces product flow from Finished Drug Products (FG) to Distribution (DIST) and highlights two key patterns:

Cross-Border Shipment

The shipment starts and ends in the same country
But passes through one or more different countries in between
Suggests unnecessary border crossings or inefficient routing ⚠️

Cyclic Movement

The shipment path forms a loop
It returns to the original location after multiple hops
Potential loop detected in product flow — indicates circular or redundant logistics ⚠️

Why It Matters

Identifying these patterns helps supply chain teams:

Reduce unnecessary shipping costs
Improve route efficiency
Mitigate risks tied to overly complex or non-optimal logistics networks

/*
  Detects either Cross-Border or Cyclic shipment patterns
  in Finished Goods (FG) to Distributor (DIST) product flows.
*/
MATCH (fg:FG:Product WHERE fg.globalBrand = "Calciiarottecarin")   // Start with FG products for a specific brand
WHERE NOT EXISTS {
  MATCH (fg)-[pf:PRODUCT_FLOW WHERE pf.globalBrand = "Calciiarottecarin"
    AND pf.generation = "g2"
    AND pf.form = "Caplet"                                // Filter: generation, form and strength of the drug
    AND pf.strength = "50mg"
  ]->(:FG)                                                // Ensure it's not part of a chained FG → FG flow
}

WITH fg, count(*) AS num

MATCH p = (fg)-[pf:PRODUCT_FLOW WHERE pf.globalBrand = "Calciiarottecarin"
  AND pf.generation = "g2"
  AND pf.form = "Caplet"
  AND pf.strength = "50mg"
]->+ (dist:DIST:Product WHERE dist.globalBrand = fg.globalBrand
  AND dist.generation = fg.generation
  AND dist.strength = fg.strength
  AND dist.form = fg.form)
WHERE EXISTS {
  MATCH (dist)-[:DISTRIBUTED_BY]->(:Distributor)                // Confirm product ends with a Distributor
}

WITH fg,
  REDUCE(loc = [], x IN nodes(p)[0..-1] | loc + [split(x.location, "/")[1]]) AS countryList,
  REDUCE(loc = [], x IN nodes(p)[0..-1] | loc + [x.location]) AS locationList,
  REDUCE(loc = [], x IN nodes(p) | loc + [x.location]) AS fullLocList

WITH fg, countryList, locationList, fullLocList,
  apoc.coll.dropDuplicateNeighbors(apoc.coll.sort(countryList)) AS dedupCountryList,
  apoc.coll.dropDuplicateNeighbors(apoc.coll.sort(locationList)) AS dedupLocationList

WHERE
  (countryList[0] = countryList[-1] AND size(dedupCountryList) > 1)   // Cross-Border condition
  OR
  (locationList[0] = locationList[-1] AND size(dedupLocationList) > 1) // Cyclic condition

WITH fg, fullLocList,
  CASE
    WHEN countryList[0] = countryList[-1] AND size(dedupCountryList) > 1
      THEN "Cross Border Shipment"
    WHEN locationList[0] = locationList[-1] AND size(dedupLocationList) > 1
      THEN "Cyclic Movement"
    ELSE null
  END AS costType

RETURN DISTINCT
  costType,
  fg.form AS form,
  fg.generation AS gen,
  fg.strength AS strength,
  fullLocList AS LocationPath

ORDER BY fg.generation, fg.strength
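
The two detection rules can be restated compactly in Python. This sketch follows the query's logic as written: the final node of the path is left out of the comparison, the country is the second "/"-separated part of each location, and the cross-border rule is checked first. The "region/country/site" location strings are an assumed format used for illustration only.

```python
def classify_route(locations):
    """Classify a shipment path given node locations in path order.

    Returns "Cross Border Shipment", "Cyclic Movement", or None.
    Assumes every location has at least two "/"-separated parts.
    """
    hops = locations[:-1]  # same slice as nodes(p)[0..-1] in the query
    if not hops:
        return None
    countries = [loc.split("/")[1] for loc in hops]
    # Same country at both ends, but at least one other country in between
    if countries[0] == countries[-1] and len(set(countries)) > 1:
        return "Cross Border Shipment"
    # Same location at both ends, with at least one other location in between
    if hops[0] == hops[-1] and len(set(hops)) > 1:
        return "Cyclic Movement"
    return None


# Hypothetical routes
print(classify_route(["EU/DE/Berlin", "EU/FR/Lyon", "EU/DE/Munich", "EU/DE/Hamburg"]))
print(classify_route(["EU/DE/Berlin", "EU/DE/Munich", "EU/DE/Berlin", "EU/DE/Hamburg"]))
print(classify_route(["EU/DE/Berlin", "EU/DE/Hamburg"]))
```

The first route is reported as a cross-border shipment, the second as a cyclic movement, and the third, a direct hop, matches neither rule.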

You can load a full set of pre-saved Cypher queries into the Neo4j Aura Query workspace.

Download the cypher_queries-saved.csv file from the src/ folder of the GitHub repository: demo-supply_chain GitHub Repo. Then upload it to the Saved Cypher section in Aura to access and run the demo queries.

Dashboards (using NeoDash)

Neo4j Dashboards provide an interactive view of pharmaceutical supply chains, helping leaders explore critical areas like demand, bottlenecks, traceability, and equipment usage—all in one place.

Prep work

Go to https://neodash.graphapp.io/ and click New Dashboard.
Click Existing Dashboard, because the dashboard is saved in the database backup.
Connect to the database created in Step 1: Database setup.
Click the left arrow at the bottom to expand the left pane.
Click the + button and import the JSON file located in the src folder of the GitHub repository.
- Direct link to the file: dashboard-supplychain.json

You should now see the dashboard.

Dashboard Tabs

The Supply Chain tab offers a high-level view of global brands, markets, and distribution. For example, selecting the drug Calciiarottecarin (50mg caplet) shows enriched demand in the EU market, with West Europe as the top distributor. From here, you can drill down into its full product flow to assess upstream dependencies and potential risks.

Each tab focuses on a key dimension:

The RM Demand tab calculates how much raw material is needed to fulfill demand for a selected product. It traces supply chain paths and aggregates quantities instantly—making complex demand propagation simple and scalable.
The SC Optimization tab helps identify costly shipping patterns and delays in processing. It surfaces cross-border inefficiencies and highlights stages that exceed target durations—so teams can quickly pinpoint and address bottlenecks.
The Batch Traceability tab helps trace defective batches back through the supply chain—revealing shared equipment, operators, and potential contamination points. It combines Neo4j’s rich relationship modeling with GenAI to highlight commonalities and root causes for fast, explainable investigation.
The Equipment Utilization tab highlights underused equipment across production. It helps identify rescheduling opportunities to boost usage, avoid unnecessary procurement, and plan maintenance—thanks to Neo4j’s flexible schema for modeling process and equipment sequences.

Resources

Now that you’ve seen how Neo4j can help analyze and optimize complex supply chain networks, here are some helpful resources and ideas to guide your next steps: