ComputerWeekly guest blogpost by Emil Eifrem
Unlike most other ways of looking at data, graph databases are designed to exploit relationships in data, which means they can uncover patterns that are difficult to detect using traditional representations such as tables. The technology was first developed in-house by the big social web giants (Google, for instance, used a graph of the connections between Web documents to rank search results, the basis of the ‘Google algorithm’). Technologies that took many engineer-hours to construct are now available to the wider market. Forrester says over a quarter of enterprises will be using such databases by 2017, while Gartner believes that over 70% of leading companies will be piloting a graph database by 2018.
As a result, an increasing number of enterprises, from banks to ecommerce firms, are using graphs to solve a variety of complicated data problems in real time, including the speedy detection of fraudulent activity.
Varieties of online hoodwinking
There are various types of fraud, including first-party, insurance and e-commerce fraud, but what they all have in common is layers of deceit. Traditional technologies, while still suitable for certain types of prevention, are simply not designed to detect these layers, which only really become visible when you look for patterns in data and relationships. Graph databases, in contrast, use connected analysis to uncover a variety of important fraud patterns, and can do so in real time.
First-party fraud is a good example of how graph technology can make a difference, as the complexity of the relationships is what makes these schemes so damaging. Banks lose tens of billions of pounds annually to this form of deception; experts suggest as much as 20% of unsecured bad debt at leading US and European banks is due to it.
However, it is the network of relationships behind the scheme that makes a fraud ring vulnerable to graph-based methods of detection. First-party fraud involves fraudsters opening bank accounts and taking out loans, credit cards and overdrafts. They initially behave like legitimate customers, until the moment they clean out all their accounts and disappear. Collections processes kick in, but by then the fraudsters are long gone, repeating the process elsewhere.
A fraud ring like this usually involves two or more people sharing a subset of legitimate contact information to create a series of false identities. Two individuals sharing only a phone number and an address (two pieces of data) can create four false identities with fake names, each with four to five accounts, for a total of around 18 accounts. Assuming an average of £4,000 in credit exposure per account, the bank’s loss could be £72,000, perhaps more. The potential loss from a ten-person ring is £1.5m, assuming 100 false identities and three financial instruments per identity, each with a £5,000 credit limit.
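The arithmetic behind those two figures can be checked with a few lines of Python. This is only a sketch: the function name `ring_exposure` is made up for illustration, and the 4.5 accounts per identity is an assumption, taken as the midpoint of "four to five" so that four identities give the 18 accounts quoted above.

```python
def ring_exposure(identities, instruments_per_identity, credit_per_instrument):
    """Potential loss in pounds if every instrument is drawn to its limit."""
    return identities * instruments_per_identity * credit_per_instrument


# Two-person ring: 4 false identities, 4.5 accounts each (assumed midpoint
# of "four to five"), £4,000 average exposure per account.
two_person = ring_exposure(4, 4.5, 4_000)

# Ten-person ring: 100 false identities, 3 instruments each, £5,000 limit.
ten_person = ring_exposure(100, 3, 5_000)

print(f"Two-person ring: £{two_person:,.0f}")
print(f"Ten-person ring: £{ten_person:,.0f}")
```

The exposure grows with the number of false identities, not the number of people, which is why a modest increase in ring size produces a much larger potential loss.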
To meet the challenge, Gartner has proposed a layered model for fraud prevention that starts with simple discrete methods and progresses to more elaborate types of analysis, specifically Entity Link Analysis, which leverages connected data. This is another way of saying: look at the relationship patterns, which is by definition a form of analysis graph databases excel at.
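To make the idea of looking at relationship patterns concrete, here is a minimal in-memory sketch in Python. It is not how a graph database or any particular product implements link analysis; it simply groups account holders who share a phone number or an address and reports the larger groups as candidate rings. The records, the `find_rings` function and the `min_size` threshold are all hypothetical, invented for illustration.

```python
from collections import defaultdict

# Hypothetical account-holder records; every value is invented.
holders = [
    {"id": "A1", "name": "John Smith", "phone": "555-0101", "address": "1 High St"},
    {"id": "A2", "name": "Jane Doe", "phone": "555-0101", "address": "9 Mill Rd"},
    {"id": "A3", "name": "Sam Brown", "phone": "555-0202", "address": "9 Mill Rd"},
    {"id": "A4", "name": "Ann Lee", "phone": "555-0202", "address": "1 High St"},
    {"id": "A5", "name": "Tom Gray", "phone": "555-0303", "address": "4 Oak Ave"},
]


def find_rings(records, keys=("phone", "address"), min_size=3):
    """Group records linked by any shared contact detail (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Walk up to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # (field, value) -> id of the first holder using it
    for r in records:
        for key in keys:
            attr = (key, r[key])
            if attr in first_seen:
                union(r["id"], first_seen[attr])
            else:
                first_seen[attr] = r["id"]

    groups = defaultdict(set)
    for r in records:
        groups[find(r["id"])].add(r["id"])
    # Only groups of at least min_size holders count as candidate rings.
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


rings = find_rings(holders)
print(rings)
```

In this made-up data, four holders with different names are tied together by just two phone numbers and two addresses, while the fifth shares nothing and is left alone. No single record looks suspicious on its own; the pattern only appears once the links are followed, which is the kind of query a graph database is built to answer directly.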