Financial Fraud Detection with Graph Data Science: Identifying First-Party Fraud

Financial fraud is a growing and costly problem, estimated at 6% of global gross domestic product, or more than $5 trillion in 2019.

Despite using increasingly sophisticated fraud detection tools – often tapping into AI and machine learning – businesses lose more and more money to fraudulent schemes every year. Graph data science helps turn this pattern around.

By augmenting existing analytics and machine learning pipelines, a graph data science approach increases the accuracy and viability of existing fraud detection methods. The end result: Fewer fraudulent transactions and safer revenue streams.

In this blog series, we are taking a closer look at how your data science and fraud investigation teams can tap into the power of graph technology for detecting first-party fraud as well as sophisticated fraud rings.

In this third blog of our four-part series, we discuss how to use graph feature engineering to improve fraud detection, and outline a methodology for finding first-party fraud using graph technology.

Next week, in the final blog of this series, we will wrap up with an overview for identifying fraud rings with graph technology.

Improving Fraud Detection with Graph Feature Engineering

There are two examples of improving fraud detection using graph feature engineering: one for finding first-party and synthetic fraud and another for identifying fraud rings. This post covers the first; the final post in the series covers fraud rings.

Example: Identifying First-Party Fraud

In first-party fraud, an individual (or group of people) misrepresents their identity or gives false information when applying for a financial product or service.

According to McKinsey, the fastest-growing type of first-party fraud is synthetic identity fraud. In synthetic identity fraud, the fraudster usually combines fake and real information to establish a credit record under a new, synthetic identity. This type of fraud results in major losses for financial institutions; an estimated 80% of all credit card fraud losses stem from synthetic identity fraud.

Organizations looking for fraud usually have plenty of data to support their investigations. However, because the relevant data is dispersed across relational database tables, data lakes and object storage, following the breadcrumbs across all of it is arduous and time-consuming.

5 Steps for Finding First-Party Fraud Using Graph Technology

The steps below are just one example. Your approach will vary depending on your goals and the data itself.

  1. Create a graph that connects all available information about individuals: account IDs, user names, account numbers, names, IP addresses, social media accounts, email addresses, identification numbers, mailing addresses, dates of birth and so on.
  2. Consult with a fraud investigator to define what to look for. For example, consider:
    • Common attributes (same email address or phone number, for example)
    • Multiple parties using the same account
    • Short paths between transactions (a rapid return of a purchase with no support call or reason given, for example)
  3. Run graph queries on these attributes, or use similarity algorithms such as Common Neighbors, to investigate. Then run community detection algorithms: Weakly Connected Components (also called Union Find) quickly finds disconnected islands of activity, and Louvain Modularity finds groups that interact more with each other than with the rest of the graph or network. Write the results back to your graph and notify investigators.
  4. Investigators then use Neo4j Bloom to visually explore results and verify first-party fraud. Then they collect information to support a coordinated, rapid shutdown of anyone involved.
  5. A more advanced approach is to convert graph algorithm scores into features to add to your machine learning model so that you identify more fraud faster and shut it down sooner.
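
Steps 1, 3 and 5 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the Neo4j Graph Data Science API: the sample records, the function names (build_graph, common_neighbors, connected_components, account_features) and the two features are all hypothetical and chosen only for this example. In practice these algorithms would typically run inside the graph database itself.

```python
from collections import defaultdict

# Hypothetical sample records: (account_id, attribute_type, attribute_value).
RECORDS = [
    ("acct-1", "email", "a@example.com"),
    ("acct-1", "phone", "555-0100"),
    ("acct-2", "email", "b@example.com"),
    ("acct-2", "phone", "555-0100"),       # shares a phone with acct-1
    ("acct-3", "email", "b@example.com"),  # shares an email with acct-2
    ("acct-4", "email", "d@example.com"),
    ("acct-4", "phone", "555-0199"),
]


def build_graph(records):
    """Step 1: link each account to the identifier nodes it uses."""
    neighbors = defaultdict(set)
    for account, kind, value in records:
        neighbors[account].add((kind, value))
    return dict(neighbors)


def common_neighbors(neighbors, a, b):
    """Step 3: Common Neighbors score = identifiers shared by two accounts."""
    return len(neighbors[a] & neighbors[b])


def connected_components(neighbors):
    """Step 3: Weakly Connected Components computed with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    # Accounts that share any identifier end up in the same component.
    for account, identifiers in neighbors.items():
        find(account)
        for identifier in identifiers:
            union(account, identifier)

    groups = defaultdict(set)
    for account in neighbors:
        groups[find(account)].add(account)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


def account_features(neighbors):
    """Step 5: turn graph results into per-account features for an ML model."""
    features = {}
    for component in connected_components(neighbors):
        for account in component:
            shared = sum(
                common_neighbors(neighbors, account, other)
                for other in component
                if other != account
            )
            features[account] = {
                "component_size": len(component),
                "shared_identifiers": shared,
            }
    return features


if __name__ == "__main__":
    graph = build_graph(RECORDS)
    for account, row in sorted(account_features(graph).items()):
        print(account, row)
```

In this toy data, three accounts are chained together through a shared phone number and a shared email address, while the fourth stands alone. Features such as component size and shared-identifier count are the kind of values that could be appended to an existing model's feature table; which features actually help depends on your data and on what your fraud investigators consider suspicious.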


As we have shown in this third blog in our four-part series on fraud detection with graph data science, there are numerous steps your company can take to weed out first-party fraud. Graph technology is proving to be the most effective tool to identify and defend against fraud.

Next week, in the final blog of this four-part series, we will show eight steps to identifying fraud rings with graph technology.

Discover how organizations are adding graph data science to their machine learning pipelines to find more fraud in the white paper Financial Fraud Detection with Graph Data Science.