The Key Challenges in Fraud Detection
Between the enormous amounts of data available for analysis and today’s experienced fraud rings (and solo fraudsters), fraud detection professionals are beset with challenges. Here are some of the biggest:
- Complex link analysis to discover fraud patterns: Uncovering fraud rings requires you to traverse data relationships with high computational complexity, a problem that gets worse as a fraud ring grows.
- Detecting and preventing fraud as it happens: To stop a fraud ring, you need real-time link analysis on an interconnected dataset, from the time a false account is created to the moment a fraudulent transaction occurs.
- Evolving and dynamic fraud rings: Fraud rings continuously change in shape and size, and your application needs to detect their patterns in this highly dynamic environment.
Overcoming Fraud Detection Challenges with Graph Databases
While no fraud prevention measures are perfect, significant improvements occur when you look beyond individual data points to the connections that link them. Understanding the connections between data, and deriving meaning from these links, doesn’t necessarily mean gathering new data. You can draw significant insights from your existing data simply by reframing the problem in a new way: as a graph. Unlike most other ways of looking at data, graphs are designed to express relatedness. Graph databases uncover patterns that are difficult to detect using traditional representations such as tables. An increasing number of companies use graph databases to solve a variety of connected data problems, including fraud detection.
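As a minimal sketch of what "reframing as a graph" can mean, the Python below turns a few flat transaction rows into an adjacency map and then asks a relationship question directly. The field names and sample values are hypothetical, chosen only for illustration; a graph database would store and index these relationships natively rather than rebuilding them in application code.

```python
from collections import defaultdict

# Hypothetical transaction rows, as they might sit in a relational table.
rows = [
    {"user": "u1", "ip": "ip1", "card": "c1"},
    {"user": "u2", "ip": "ip1", "card": "c2"},
    {"user": "u3", "ip": "ip2", "card": "c2"},
]

def build_graph(rows):
    """Link every identifier in a row to the other identifiers in that row."""
    graph = defaultdict(set)
    for row in rows:
        # Each node is a (kind, value) pair, e.g. ("ip", "ip1").
        nodes = list(row.items())
        for a in nodes:
            for b in nodes:
                if a != b:
                    graph[a].add(b)
    return graph

graph = build_graph(rows)

# Which users are connected to card c2? No join is needed, only a neighbor lookup.
users_on_c2 = sorted(value for kind, value in graph[("card", "c2")] if kind == "user")
```

The same rows are present in both representations; the graph form simply makes "who shares what with whom" a direct lookup.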
Example: E-commerce Fraud
As our lives become increasingly digital, a growing number of financial transactions are conducted online. Fraudsters have adapted quickly to this trend and have devised clever ways to defraud online payment systems. While this type of activity can and does involve criminal rings, even a single well-informed fraudster can create a large number of synthetic identities to carry out sizeable schemes. Consider an online transaction with the following identifiers: user ID, IP address, geolocation, a tracking cookie and a credit card number. Typically, the relationships between these identifiers should be (almost) one-to-one. Some variation is natural and accounts for shared machines, families sharing a single credit card number, individuals using multiple computers and the like. However, as soon as the relationships between these identifiers exceed a reasonable number, fraud should be considered a strong possibility. The more interconnections that exist among identifiers, the greater the cause for concern. Large and tightly knit graphs are very strong indicators that fraud is taking place. See the graphic below for an example:
A graph of a series of transactions from different IP addresses with a likely fraud event occurring from IP1, which has carried out multiple transactions with five different credit cards.
By putting checks into place and associating them with the appropriate event triggers, such schemes can be uncovered before they are able to inflict significant damage. Triggers can include events such as logging in, placing an order or registering a credit card, any of which can cause the transaction to be evaluated against the fraud graph. Simple fan-out, such as one person using a couple of computers, might be ignored, while more complex graphs can be flagged as possible instances of fraud.
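The trigger-based check described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a production design: the `FraudGraph` class, the `evaluate` method, the reason labels and both thresholds are hypothetical, and a real deployment would run the equivalent query inside a graph database rather than in an in-memory dictionary.

```python
from collections import defaultdict, deque

# Assumed thresholds; sensible values depend on the business and its data.
MAX_CARDS_PER_IP = 3
MAX_COMPONENT_SIZE = 8

class FraudGraph:
    def __init__(self):
        self.adj = defaultdict(set)  # node -> set of neighboring nodes

    def add_transaction(self, user, ip, card):
        """Connect the identifiers seen together in one transaction."""
        nodes = [("user", user), ("ip", ip), ("card", card)]
        for a in nodes:
            for b in nodes:
                if a != b:
                    self.adj[a].add(b)

    def cards_for_ip(self, ip):
        """Fan-out check: distinct credit cards seen from one IP address."""
        return {value for kind, value in self.adj[("ip", ip)] if kind == "card"}

    def component_size(self, start):
        """Breadth-first search: count identifiers reachable from `start`."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in self.adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return len(seen)

    def evaluate(self, user, ip, card):
        """Called from an event trigger, e.g. when an order is placed."""
        self.add_transaction(user, ip, card)
        reasons = []
        if len(self.cards_for_ip(ip)) > MAX_CARDS_PER_IP:
            reasons.append("ip_fan_out")
        if self.component_size(("ip", ip)) > MAX_COMPONENT_SIZE:
            reasons.append("large_component")
        return reasons

# Mirror the figure: five different cards used from the same IP address.
g = FraudGraph()
results = [g.evaluate(f"user{i}", "ip1", f"card{i}") for i in range(1, 6)]
unrelated = g.evaluate("user9", "ip2", "card9")
```

With these assumed thresholds, the first three transactions from `ip1` pass without a flag, the fourth and fifth are flagged for both reasons, and the unrelated transaction from `ip2` is not flagged. Because the check runs at the moment of the triggering event, it fits the real-time requirement described earlier.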
Conclusion
Effective fraud detection means augmenting your existing detection capability with link analysis. Two points are clear:
- As business processes become faster and more automated, the time margins for detecting fraud are narrowing, increasing the need for a real-time solution.
- Traditional technologies are not designed to detect elaborate fraud rings. Graph databases add value through analysis of connected data points.
About the Author
Jim Webber & Ian Robinson, Chief Scientist & Senior Engineer
Jim Webber is Chief Scientist at Neo Technology working on next-generation solutions for massively scaling graph data. Prior to joining Neo Technology, Jim was a Professional Services Director with ThoughtWorks where he worked on large-scale computing systems in finance and telecoms. Jim has a Ph.D. in Computing Science from Newcastle University, UK.
Ian Robinson is a Senior Engineer at Neo Technology. He is a co-author of ‘REST in Practice’ (O’Reilly) and a contributor to the forthcoming books ‘REST: From Research to Practice’ (Springer) and ‘Service Design Patterns’ (Addison-Wesley). He presents at conferences worldwide on the big Web graph of REST, and the awesome graph capabilities of Neo4j.