The Most Powerful Fraud Prevention Tool for Federal Agencies: Graph Technology

John Bender

RVP, Federal Sales, Neo4j

Fraud costs the United States government a staggering amount of money every year—between $233 and $521 billion, according to the U.S. Government Accountability Office (GAO), with tax, claims, and contract fraud among the most costly schemes. Public-facing agencies that handle direct payments and sensitive information—the Centers for Medicare & Medicaid Services (CMS) and the Internal Revenue Service (IRS), for example—are frequent targets.

Fraud also evolves at a dizzying pace. Fraud rings continue to grow larger and more global, with specialists in multiple countries contributing different skills to the same network. Experts worry about the rise of Fraud-as-a-Service—when cybercriminal organizations offer tools, services, and support to individual fraudsters or smaller fraud rings. Fraud schemes increasingly blend real, stolen, and fabricated information to create synthetic identities, then use AI systems to help launch presentation or other attacks aimed at gaining unauthorized access to sensitive networks.

But here’s the silver lining: As fraud rings evolve and expand, they leave an ever-larger digital footprint—a network of digital communication and exchange. Fundamentally, fraud is a network problem. It involves complex networks of hidden connections: people, accounts, devices, and transactions linked in subtle ways, often across continents and time zones. To protect themselves against fraud, federal agencies need to uncover these hidden connections and analyze the networks they form.  

Unfortunately, many fraud-detection systems rely on relational database technology, which struggles to effectively model and analyze complex relationships in large, dynamic datasets. Graph technology, on the other hand, excels at representing and querying these kinds of relationships, even in very large datasets and across many degrees of separation between related entities. That makes it well suited to exposing sophisticated and costly fraud schemes.

A Technology Built to Expose Modern Fraud Schemes

Relational databases don’t model relationships explicitly. They store data in tables, with keys indicating relationships between tables. When users query related data, the database must reconstruct relationships via computationally demanding JOIN operations. Fraud networks tend to be densely interconnected, and the more interconnected your data, the more JOIN operations you need. The resulting queries can take minutes or hours to complete—if they complete at all.

In graph databases, relationships between data points, or entities, are modeled and stored explicitly, as part of the database structure. This makes navigating the relationships in a query up to 1,000 times faster than reconstructing them through JOIN operations. If you want to surface every connection a known fraudster has to every suspected fraudster in your database, for example, you can do so easily using a graph query language like Cypher.
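To make the idea of following stored relationships concrete, here is a minimal sketch in plain Python rather than Cypher, so that it runs without a database. The entity names and the `LINKS` adjacency map are hypothetical, and the breadth-first search stands in for the traversal a graph database would perform; it is an illustration, not how Neo4j executes a query.

```python
from collections import deque

# Hypothetical toy graph: each key is an entity, each value lists the
# entities it is directly linked to (shared phone, shared account, ...).
LINKS = {
    "known_fraudster": ["phone_555", "acct_1"],
    "phone_555": ["known_fraudster", "suspect_a"],
    "acct_1": ["known_fraudster", "device_9"],
    "device_9": ["acct_1", "suspect_b"],
    "suspect_a": ["phone_555"],
    "suspect_b": ["device_9"],
    "suspect_c": [],
}


def shortest_path(links, start, goal):
    """Breadth-first search; returns the shortest chain of entities or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in links.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def connections_to_suspects(links, known, suspects):
    """Map each suspect to its shortest connection to the known fraudster."""
    return {s: shortest_path(links, known, s) for s in suspects}
```

Calling `connections_to_suspects(LINKS, "known_fraudster", ["suspect_a", "suspect_b", "suspect_c"])` returns one chain of intermediate entities per suspect, or `None` where no connection exists.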

One Tool Unmasks Many Schemes

Consider how graph database technology can help federal agencies like CMS and the U.S. Department of Veterans Affairs (VA) expose claims fraud. Fraudsters often create multiple fake accounts, each with its own ID number, and use them to file illegitimate claims. Sometimes they use the same device or IP address for all the accounts—a clear red flag. To see that flag, however, an agency would have to model the network of entities involved: names, addresses, phone numbers, ID numbers, devices, IP addresses, etc. In a relational database, this would require many slow JOIN operations—too slow, probably, to be practical. A graph database already contains these key relationships, and a quick Cypher query would immediately expose the connection between a single IP address or device ID and many new accounts.
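The shared-device red flag can be sketched in a few lines of Python. The records, field names, and the `min_accounts` threshold below are all hypothetical; the point is only to show the shape of the check, which a graph query would express as a pattern over account and IP address nodes.

```python
from collections import defaultdict

# Hypothetical account records; field names are illustrative only.
ACCOUNTS = [
    {"account_id": "A1", "ip": "203.0.113.7", "device": "dev-1"},
    {"account_id": "A2", "ip": "203.0.113.7", "device": "dev-1"},
    {"account_id": "A3", "ip": "203.0.113.7", "device": "dev-2"},
    {"account_id": "A4", "ip": "198.51.100.4", "device": "dev-3"},
]


def shared_identifiers(accounts, field, min_accounts=3):
    """Return values of `field` that link at least `min_accounts` accounts."""
    by_value = defaultdict(set)
    for record in accounts:
        by_value[record[field]].add(record["account_id"])
    return {
        value: sorted(ids)
        for value, ids in by_value.items()
        if len(ids) >= min_accounts
    }
```

With this sample data, `shared_identifiers(ACCOUNTS, "ip")` flags the one IP address used by three accounts, while the same call on `"device"` flags nothing at the default threshold.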

Another red flag is a single “medical pathway” across multiple claims—i.e., a set of claims that all involve the same medical practice, doctor, pharmacy, and so on. As in the example above, you can’t see this without modeling the many connections between the entities in the network, from claimants and claims to medical service providers. Capturing these connections is often prohibitively difficult in a relational database, but graph databases are designed to make the process simple and efficient—exposing the kinds of patterns and relationships that reveal claims fraud.
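As a rough illustration of the "medical pathway" pattern, the sketch below groups claims by the combination of practice, doctor, and pharmacy and flags any combination shared by several distinct claimants. The claim records, field names, and the `min_claimants` threshold are assumptions made for the example, not a description of any agency's data.

```python
from collections import defaultdict

# Hypothetical claims; every field name here is an assumption.
CLAIMS = [
    {"claim_id": "C1", "claimant": "P1", "practice": "Clinic X", "doctor": "Dr. A", "pharmacy": "Rx 1"},
    {"claim_id": "C2", "claimant": "P2", "practice": "Clinic X", "doctor": "Dr. A", "pharmacy": "Rx 1"},
    {"claim_id": "C3", "claimant": "P3", "practice": "Clinic X", "doctor": "Dr. A", "pharmacy": "Rx 1"},
    {"claim_id": "C4", "claimant": "P4", "practice": "Clinic Y", "doctor": "Dr. B", "pharmacy": "Rx 2"},
    {"claim_id": "C5", "claimant": "P1", "practice": "Clinic X", "doctor": "Dr. A", "pharmacy": "Rx 2"},
]


def repeated_pathways(claims, min_claimants=3):
    """Flag pathways (practice, doctor, pharmacy) shared by many claimants."""
    by_pathway = defaultdict(list)
    for claim in claims:
        key = (claim["practice"], claim["doctor"], claim["pharmacy"])
        by_pathway[key].append(claim)

    flagged = {}
    for key, group in by_pathway.items():
        claimants = {claim["claimant"] for claim in group}
        if len(claimants) >= min_claimants:
            flagged[key] = sorted(claim["claim_id"] for claim in group)
    return flagged
```

Here only the pathway through Clinic X, Dr. A, and Rx 1 is flagged, because three different claimants share it.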

Federal agencies use graph technology to fight fraud in many ways, from tax investigators finding connections between companies within one suspect LLC to IT analysts modeling a network to detect cyberattacks in real time.

How Graph Algorithms and ML Accelerate Fraud Detection

Advanced graph algorithms offer even more analytical power and flexibility than Cypher queries do, dramatically accelerating fraud detection efforts for federal agencies. The more subtle and complex the fraud, the more valuable graph algorithms become—and because they replace time-consuming custom queries, they free up valuable data science resources, which can be particularly helpful for public organizations operating within fixed budgets.

Federal agencies can use advanced algorithms for the following fraud detection and prevention use cases, among others:

  • Fraud ring identification: Sophisticated fraudsters often work together, operating methodically, sharing information, and collaborating to commit fraud. Clustering algorithms help identify strongly connected groups likely to be fraud rings.
  • Network analysis: Organizations can use graph algorithms to analyze the topology of transaction networks, identifying unusual behaviors, detecting anomalies, and revealing relationships that point to fraudulent activities. 
  • Entity resolution: In large datasets, the same entity may appear with variations in names, addresses, or other attributes. Graph algorithms can reconcile these variations and create a unified view of the entity by linking related records (e.g., accounts, transactions, and individuals).
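The entity resolution case above can be sketched with a union-find structure that merges records sharing an identifier. This is a simplified stand-in: the records are hypothetical, and linking only on exact phone or email matches is an assumption made for brevity, whereas production entity resolution typically also uses fuzzy matching on names and addresses.

```python
# Hypothetical records that may describe the same person.
RECORDS = [
    {"id": "r1", "name": "Jon Smith", "phone": "555-0101", "email": "jon@example.com"},
    {"id": "r2", "name": "John Smith", "phone": "555-0101", "email": "jsmith@example.com"},
    {"id": "r3", "name": "J. Smith", "phone": "555-0199", "email": "jsmith@example.com"},
    {"id": "r4", "name": "Ana Lopez", "phone": "555-0142", "email": "ana@example.com"},
]


def resolve_entities(records, keys=("phone", "email")):
    """Group record ids that are linked, directly or transitively, by a shared key."""
    parent = {record["id"]: record["id"] for record in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (key, value) -> first record id that used it
    for record in records:
        for key in keys:
            value = (key, record[key])
            if value in first_seen:
                union(first_seen[value], record["id"])
            else:
                first_seen[value] = record["id"]

    groups = {}
    for record in records:
        groups.setdefault(find(record["id"]), []).append(record["id"])
    return sorted(sorted(group) for group in groups.values())
```

In the sample data, `r1` and `r2` share a phone number and `r2` and `r3` share an email address, so all three resolve to one entity even though `r1` and `r3` have nothing directly in common.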

Using Graph Algorithms to Expose Contract Fraud

Federal agencies tasked with assessing bids for government contracts often encounter contract fraud. One common example involves multiple entities making fraudulent bids—bids designed to fail while creating the impression of real competition. Those entities may then receive a kickback from the entity that wins the bid. 

When we graph this kind of bidding process—with, for example, multiple contracts and three bidders—we can use an algorithm called Weakly Connected Components to show how one company can win many contracts, while two other bidders, despite submitting numerous bids, win nothing. In the diagram below, the orange relationships (arrows) represent awarded contracts, and the gray relationships represent submitted bids.

Such an uneven distribution of contracts signals that the bidding process may be fraudulent and warrants further investigation.
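A small Python sketch can illustrate the bidding scenario. It computes weakly connected components over a hypothetical bidder-to-contract graph (treating every edge as undirected) and then tallies bids and wins per bidder within each component. The company names, contract IDs, and the `win_summary` helper are invented for the example, and a graph data science library would normally supply the component algorithm.

```python
from collections import defaultdict

# Hypothetical bids: (bidder, contract, won)
BIDS = [
    ("Acme", "K1", True), ("Bolt", "K1", False), ("Crest", "K1", False),
    ("Acme", "K2", True), ("Bolt", "K2", False), ("Crest", "K2", False),
    ("Acme", "K3", True), ("Bolt", "K3", False), ("Crest", "K3", False),
    ("Delta", "K4", True), ("Echo", "K4", False),
]


def weakly_connected_components(edges):
    """Return sets of nodes connected when edge direction is ignored."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)

    seen, components = set(), []
    for node in adjacency:
        if node in seen:
            continue
        stack, component = [node], set()
        seen.add(node)
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def win_summary(bids):
    """Per component, map each bidder to (bids submitted, contracts won)."""
    edges = [(bidder, contract) for bidder, contract, _ in bids]
    summaries = []
    for component in weakly_connected_components(edges):
        stats = {}
        for bidder, _, won in bids:
            if bidder in component:
                submitted, wins = stats.get(bidder, (0, 0))
                stats[bidder] = (submitted + 1, wins + int(won))
        summaries.append(stats)
    return summaries
```

In this toy data, one component contains a bidder that wins all three of its bids alongside two bidders that win none of theirs, which is the kind of lopsided pattern an analyst would want to review.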

ML and the Power of Fraud Prediction

Federal agencies that use machine learning (ML) for fraud detection can dramatically improve its effectiveness by pairing it with a graph database. Graph technology allows agencies to rapidly explore and analyze suspicious entities and fraud patterns that other data models obscure—and once they’ve identified common fraud patterns, they can train ML models to recognize them and predict fraud accordingly. This enables:

  • Proactive detection: Federal agencies can train a model to flag likely fraudulent actors ahead of time and to spot suspicious new communities that aren’t connected to known fraudsters.
  • Measurable performance: Supervised learning models produce clear performance metrics that enable agencies to evaluate and adjust as needed.
  • Automation: Supervised ML automates fraud risk prediction for suspicious accounts or claims.

Key Anti-Fraud Benefits for Federal Agencies

As we’ve seen, graph technology offers a host of capabilities that can transform the fraud-prevention efforts of federal agencies. Stepping back a bit, we can identify a few key ways the technology helps agencies address the fraud risks they face.

Create a 360-Degree View of Connections

Graph structure allows agencies to unify disparate data—e.g., personal information, transactions, communications, geolocation—to create a complete picture of individuals and organizations and the relationships between them. Agencies can use graph analytics to uncover relationships across systems and departments, for example, or visualize behavior over time and across communication channels.

Uncover Hidden Networks

Graph technology helps agencies discover subtle, oblique connections between entities, including:

  • Interconnected accounts, members of fraud rings, and the organizations behind shell companies
  • Entities that form meaningful clusters, identified with community detection algorithms
  • Suspicious entities with two or more degrees of separation between them

Adapt to New Patterns Fast

Graphs have flexible schemas, so when agencies encounter new fraud methods, they can easily add data to their graph—no need to modify the data model. In a rapidly changing fraud landscape, when analysts frequently need to add new device ID numbers or social media handles, this flexibility is incredibly important. Agencies can also easily define new fraud patterns using Cypher queries.

Generate Predictive Insights

By combining graph technology with ML, agencies can detect anomalies in behavior or transaction paths, use graph-based features in ML models to predict fraud likelihood, and flag suspicious entities for deeper investigation—before damage is done.
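To show what "graph-based features" might look like in practice, here is a minimal sketch that derives three per-account features from a hypothetical account-to-identifier graph. The feature names, the sample data, and the `KNOWN_FRAUD` set are assumptions; a real pipeline would feed features like these into whatever ML model the agency already uses.

```python
from collections import defaultdict

# Hypothetical links between accounts and the identifiers they use.
ACCOUNT_IDENTIFIERS = {
    "A1": {"ip:203.0.113.7", "phone:555-0101"},
    "A2": {"ip:203.0.113.7", "phone:555-0102"},
    "A3": {"ip:198.51.100.4", "phone:555-0103"},
}
KNOWN_FRAUD = {"A1"}  # accounts already confirmed as fraudulent (assumed)


def graph_features(account_identifiers, known_fraud):
    """Compute simple neighborhood features for each account."""
    # Invert the mapping: identifier -> accounts that use it.
    users = defaultdict(set)
    for account, identifiers in account_identifiers.items():
        for identifier in identifiers:
            users[identifier].add(account)

    features = {}
    for account, identifiers in account_identifiers.items():
        neighbors = set()
        for identifier in identifiers:
            neighbors |= users[identifier]
        neighbors.discard(account)  # an account is not its own neighbor
        features[account] = {
            "identifier_count": len(identifiers),
            "linked_accounts": len(neighbors),
            "linked_known_fraud": len(neighbors & known_fraud),
        }
    return features
```

Account `A2` shares an IP address with the known fraudulent account `A1`, so its `linked_known_fraud` feature is 1, while `A3` has no links and scores 0 on both neighborhood features.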

Improve Interagency Collaboration

Federal agencies often operate in silos, compromising their ability to generate a comprehensive picture of fraudulent activities, but graph databases make it easy to share contextual intelligence. If the same phone number is used to commit both immigration and tax fraud, for example, graph technology can highlight the shared data. Graphs also enable faster case resolution through shared visual insights.

Getting Started With Graph-Based Fraud Prevention

If you’re ready to explore graph-based fraud prevention, you can try Neo4j Aura for free and immediately start identifying fraud patterns in your organizational data.

For more technical guidance, check out this blog series, which shows you how to apply graph technology to a fraud detection workflow, including fraud pattern analysis and prediction.
