Gilead Sciences Combats $431 billion Pharmaceutical Fraud Threat with Neo4j Graph Analytics
Global Pharmaceutical Leader Deploys Neo4j AuraDB on AWS to Protect Patient Safety and Preserve Program Integrity

Above: Gilead Sciences Headquarters in Foster City, CA
Gilead Sciences uses advanced analytics to help investigators identify and understand complex risks associated with pharmaceutical fraud, product diversion, and program misuse. Within the Global Product Security and Anti‑Counterfeiting function, relationship‑based data models run on a secure, enterprise‑grade graph and AI analytics platform that complements established analytical and investigative methods.
Thousands of counterfeit bottles line the shelves of a locked war room at Gilead Sciences headquarters. These pills and prescription medications are confiscated evidence in an invisible war against fraud that now represents an industry-wide $431 billion global threat.
“When we’re preventing fraud, it’s because criminals are defrauding programs meant to benefit folks who can’t afford medication,” explains Thomas Luu, Director of Advanced Data Analytics under Global Product Security at Gilead. “Everything goes back to patient safety. When fraud happens, that’s who it affects the most.”
Counterfeiting is never just a financial crime. The human cost becomes devastatingly clear when patients receive antipsychotic medication instead of Gilead’s life-saving HIV drugs, or bottles filled with rocks instead of medicine.
“This work is fundamentally about patient safety and program integrity,” says Luu. “Our goal is to understand complex activity patterns more clearly so investigators can ask better questions and build defensible narratives.”
This document highlights the implementation and evolution of an Advanced Analytics platform supported by AI infrastructure with Neo4j, reflecting an industry‑leading approach to applying graph technology and artificial intelligence within pharmaceutical product security.
The Analytical Challenge
Pharmaceutical fraud and program misuse often involve multiple actors, data sources, and intermediaries operating across jurisdictions. Investigations require analysts and legal teams to review and correlate information spanning:
- Prescription and claims activity
- Pharmacy and prescriber records
- Program utilization data
- Geographic and temporal indicators
“These data sets weren’t built to talk to each other,” Luu explains. “You can learn something from each one independently, but the real insight comes from understanding how they relate.”
When analyzed in isolation, these datasets may obscure broader relationship patterns that are critical to understanding complex activity and supporting investigative decision‑making.
Limits of Traditional Review Methods
Spreadsheet‑based analysis and conventional relational queries remain appropriate for many compliance and operational tasks. As investigations increase in complexity, however, these approaches can become difficult to scale—particularly when analysts must reconcile identifiers across systems, track indirect relationships, or explain how activity evolves across entities and time.
“I was often the bottleneck,” Luu notes. “The analysis depended on deep familiarity with both the data and the business context, which made it hard to scale or transfer knowledge.”
Graph‑Based and AI‑Supported Analytics in Practice
To address these challenges, Gilead implemented a relationship‑oriented analytics approach that uses graph technology and AI‑supported analytics to assist human‑led investigations.
The platform enables investigators to:
- Make connections between entities explicit
- Visually explore complex networks
- Examine relationships without relying solely on predefined rules
“What excites us is giving investigators a clearer view of how complex activities connect,” says Luu. “Seeing those relationships directly makes it much easier to explain findings and build defensible cases.”
These capabilities are designed to support investigative reasoning, improve explainability, and complement existing controls, legal review processes, and compliance frameworks.
Cloud‑Based Operations
As part of broader cloud data initiatives, Gilead integrated this graph‑ and AI‑supported analytics platform into its existing cloud environment. The platform operates alongside established data pipelines and analytics tools, extending Gilead’s investigative capabilities without disrupting upstream systems.
This approach allows new data sources to be incorporated as investigative needs evolve, while maintaining alignment with enterprise security, governance, and compliance requirements.
Unifying Data Sources with Neo4j AuraDB
A core component of the platform is Neo4j AuraDB, which serves as the graph analytics layer used to unify and analyze highly interconnected data at scale.
Neo4j AuraDB enables Gilead to model entities and their relationships directly—treating relationships as first‑class elements rather than secondary joins. This allows investigators to explore how prescribers, pharmacies, programs, transactions, and locations connect across multiple datasets that were previously analyzed independently.
“Previously, we could only look at one data set at a time,” Luu explains. “The folks in operations processing rebates weren’t looking for fraud. But the fraud signals become clear when you can see sales declining while claims from the same pharmacy are increasing.”
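The signal Luu describes, sales falling while claims from the same pharmacy rise, can be illustrated with a small trend comparison. The following is a minimal sketch, not Gilead's method: the pharmacy names, monthly figures, and the choice of a least-squares slope as the trend measure are all assumptions made for illustration.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def diverging(sales, claims):
    """True when sales trend down while claims trend up."""
    return slope(sales) < 0 and slope(claims) > 0

# Hypothetical monthly unit counts per pharmacy (illustrative data only)
pharmacies = {
    "pharmacy_a": {"sales": [100, 90, 80, 70], "claims": [50, 60, 75, 90]},
    "pharmacy_b": {"sales": [100, 105, 110, 120], "claims": [50, 52, 55, 58]},
}

flagged = [
    name for name, series in pharmacies.items()
    if diverging(series["sales"], series["claims"])
]
print(flagged)  # ['pharmacy_a']
```

A check like this only becomes possible once sales and claims records for the same pharmacy are linked, which is the point of unifying the datasets.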
With Neo4j AuraDB, investigators can traverse networks, examine relationship patterns, and build contextual views of activity that would be difficult to surface using traditional row‑and‑column approaches alone. The flexible graph model supports the addition of new data sources and relationship types as investigative questions evolve.
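To make the idea of traversing a network concrete, here is a minimal in-memory sketch in plain Python. It does not use Neo4j or Cypher; the node identifiers, relationship names (WROTE, FILLED_AT, LOCATED_IN), and edges are hypothetical, and a breadth-first search stands in for whatever traversal the platform actually performs.

```python
from collections import deque

# Hypothetical edges: (source, relationship, target)
EDGES = [
    ("prescriber:1", "WROTE", "claim:10"),
    ("claim:10", "FILLED_AT", "pharmacy:A"),
    ("prescriber:2", "WROTE", "claim:11"),
    ("claim:11", "FILLED_AT", "pharmacy:A"),
    ("pharmacy:A", "LOCATED_IN", "location:CA"),
    ("prescriber:3", "WROTE", "claim:12"),
    ("claim:12", "FILLED_AT", "pharmacy:B"),
]

def build_adjacency(edges):
    """Index edges in both directions so traversal can follow either way."""
    adj = {}
    for source, rel, target in edges:
        adj.setdefault(source, []).append((rel, target))
        adj.setdefault(target, []).append((rel, source))
    return adj

def shortest_path(adj, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel, neighbor in adj.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

adj = build_adjacency(EDGES)
linked = shortest_path(adj, "prescriber:1", "prescriber:2")
unlinked = shortest_path(adj, "prescriber:1", "prescriber:3")
print(linked)
print(unlinked)  # None
```

In this toy data the two prescribers are connected only through claims filled at the same pharmacy, an indirect relationship that a row-by-row review of either claim would not show.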
Another breakthrough came through entity resolution: the ability to match records across systems despite inconsistent spellings and deliberately obscured identities. A prescriber might appear as “Dr. John Smith” in one system, “J. Smith, MD” in another, and “John A. Smith” in a third. Traditional databases would treat these as separate entities, yet Neo4j users can implement string similarity functions like Levenshtein distance, Jaro-Winkler, or Soundex to identify likely matches and link these entities together.
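As an illustration of the string-similarity idea, the sketch below normalizes the three example names and compares them with Levenshtein distance. It is plain Python rather than a Neo4j function, and the title list, the normalization rules, and the similarity formula are assumptions chosen for the example, not Gilead's matching logic.

```python
import re

# Assumed set of titles and suffixes to ignore when comparing names
TITLES = {"dr", "md"}

def normalize(name):
    """Lowercase, drop punctuation, and remove titles such as 'Dr.' or 'MD'."""
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(t for t in tokens if t not in TITLES)

def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Levenshtein distance scaled to the range 0..1 (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

names = ["Dr. John Smith", "J. Smith, MD", "John A. Smith"]
cleaned = [normalize(n) for n in names]
print(cleaned)  # ['john smith', 'j smith', 'john a smith']

for i in range(len(cleaned)):
    for j in range(i + 1, len(cleaned)):
        print(names[i], "|", names[j], "|", round(similarity(cleaned[i], cleaned[j]), 2))
```

Pairs scoring above a chosen threshold would be candidates for linking. In practice the threshold needs tuning, and a name match would usually be combined with other attributes, such as address or license number, before two records are treated as the same prescriber.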
The system now processes prescription patterns, pharmacy relationships, physician networks, and patient locations as interconnected entities rather than isolated data points. When a criminal network spans multiple states with dozens of participants, the graph can trace these relationships through the entire network. Geographic data becomes particularly powerful. The system can immediately flag when a physician’s patients all fill prescriptions at a single pharmacy, or when supposed local patients are scattered across multiple states for routine medications that should be prescribed locally.
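The two geographic checks described above can be sketched as simple aggregations over fill records. This is an illustrative example only: the record layout, the identifiers, and the thresholds (`min_fills`, `min_states`) are hypothetical and are not taken from Gilead's system.

```python
from collections import defaultdict

# Hypothetical fill records: (prescriber, patient_state, pharmacy)
FILLS = [
    ("dr_x", "CA", "ph_1"),
    ("dr_x", "TX", "ph_1"),
    ("dr_x", "FL", "ph_1"),
    ("dr_x", "NY", "ph_1"),
    ("dr_y", "CA", "ph_2"),
    ("dr_y", "CA", "ph_3"),
    ("dr_y", "CA", "ph_2"),
    ("dr_y", "CA", "ph_4"),
]

def geographic_flags(fills, min_fills=3, min_states=3):
    """Return {prescriber: [reasons]} for prescribers matching either pattern."""
    pharmacies = defaultdict(set)
    states = defaultdict(set)
    counts = defaultdict(int)
    for prescriber, state, pharmacy in fills:
        pharmacies[prescriber].add(pharmacy)
        states[prescriber].add(state)
        counts[prescriber] += 1

    flags = {}
    for prescriber, n in counts.items():
        reasons = []
        # Every fill goes to one pharmacy, with enough fills to be meaningful
        if n >= min_fills and len(pharmacies[prescriber]) == 1:
            reasons.append("single_pharmacy")
        # Patients are spread across many states
        if len(states[prescriber]) >= min_states:
            reasons.append("multi_state_patients")
        if reasons:
            flags[prescriber] = reasons
    return flags

flags = geographic_flags(FILLS)
print(flags)  # {'dr_x': ['single_pharmacy', 'multi_state_patients']}
```

A flag here is a prompt for investigator review rather than a finding, since legitimate specialty practices can show the same pattern.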
Within the broader Advanced Analytics platform, Neo4j AuraDB operates alongside existing cloud analytics and AI‑supported tooling, enhancing relationship‑oriented analysis while preserving established data architecture, security controls, and governance practices.
What the Platform Supports
Through graph‑based analytics and AI‑assisted insights, the platform supports investigators by:
- Visualizing clusters of related entities
- Adding geographic and temporal context to activity patterns
- Supporting clear, explainable investigative narratives for internal and legal review
“The value is in clarity,” Luu explains. “When you can clearly articulate how entities relate, it makes internal review and legal discussions much more straightforward.”
These capabilities inform investigative workflows but do not independently determine outcomes or replace established compliance processes.
Agentic Fraud Detection: Human‑Led, AI‑Augmented Investigations
Building on its graph analytics foundation, Gilead is advancing toward an agentic fraud detection model in which AI‑driven agents assist investigators throughout the lifecycle of an investigation while keeping humans firmly in control.
Rather than automating enforcement decisions, this approach focuses on using agent‑based workflows to:
- Continuously monitor emerging relationship patterns across the graph
- Surface entities, networks, or behaviors that warrant investigator review
- Assemble contextual views—including related entities, activity history, and network structure—to accelerate case development
In this model, graph analytics provide the structural backbone, while agentic AI components help prioritize signals, navigate complex networks, and guide analysts toward the most relevant areas of investigation. Investigators retain decision‑making authority, using AI‑generated context to support judgment, documentation, and defensible outcomes.
Data flows from multiple sources within Gilead’s AWS ecosystem through Starburst for processing before loading into Neo4j AuraDB. This hybrid approach lets Gilead keep its existing AWS data infrastructure while adding purpose-built graph capabilities. The Neo4j Connector for Apache Spark enables efficient data movement between AWS services and the graph database. Amazon Location Service provides geocoding capabilities that enhance fraud detection algorithms.
The architecture supports real-time analysis through Neo4j Bloom’s visualization interface and batch processing through Apache Spark integration. Databricks notebooks provide the development environment for data scientists to build and refine graph algorithms, while Unity Catalog ensures data governance across the entire pipeline.
Behind the scenes, Gilead deploys Graph Data Science algorithms including GraphSAGE, Louvain community detection, and PageRank centrality measures. When investigators need to trace a suspicious network, they’re working with algorithms designed specifically for relationship analysis rather than forcing relational databases to perform tasks they weren’t built for.
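To show what a centrality measure such as PageRank computes, here is a small power-iteration sketch in plain Python. It does not call the Neo4j Graph Data Science library; the edge list, the node names, and the parameter values (damping of 0.85, 50 iterations) are assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared evenly
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank

# Hypothetical referral-style edges: three prescribers all point to one pharmacy
EDGES = [
    ("prescriber_a", "pharmacy_hub"),
    ("prescriber_b", "pharmacy_hub"),
    ("prescriber_c", "pharmacy_hub"),
    ("pharmacy_hub", "prescriber_a"),
]

ranks = pagerank(EDGES)
most_central = max(ranks, key=ranks.get)
print(most_central)  # pharmacy_hub
```

The node that many others point to ends up with the highest score, which is why centrality can help investigators decide where in a large network to look first.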
By combining Neo4j‑based relationship intelligence with AI‑assisted agents, Gilead is laying the groundwork for a scalable investigative framework—one that adapts as fraud tactics evolve, while maintaining transparency, explainability, and alignment with legal and compliance requirements.
On Metrics, Performance, and Outcomes
Gilead does not publicly report specific performance metrics such as detection rate changes, false‑positive reductions, or labor equivalency gains. While such measures are often discussed in industry contexts, this document focuses on platform capabilities and investigative support rather than quantified outcomes.
Forward‑Looking Considerations
As the platform continues to mature, Gilead will assess additional ways that graph analytics and AI‑supported techniques can further enhance:
- Transparency of investigative logic
- Cross‑dataset understanding
- Analyst usability and explainability
The urgency of this evolution is reinforced by broader industry trends. Counterfeiting incidents have increased 35‑fold since 2002, and PwC research estimates that current anti‑counterfeiting measures are only about 50% effective, so traditional approaches are increasingly strained by the scale and sophistication of modern fraud networks.
As Luu observes, “All of the future AI is going to be based upon this model because it’s the only framework that will be fast enough to keep up.”
Key Takeaway
Gilead Sciences has implemented an industry‑leading, graph‑based and AI‑supported Advanced Analytics platform, with Neo4j AuraDB as a core component, to support patient safety, program integrity, and legally defensible investigations. Gilead’s fraud graph implementation represents just the beginning of a larger transformation. Luu envisions an agentic AI approach where investigators can simply ask, “What are the ten most likely fraudulent entities in the last three months?” and receive case packages including supporting documentation, transaction histories, and relationship maps.
“We’re democratizing the data,” Luu explains. “Investigators who have no idea how to work with data will be able to see connections clearly. The platform brings together complex, interconnected data in ways that strengthen investigative clarity while remaining aligned with enterprise governance and compliance requirements.”
