GraphGists

This interactive Neo4j graph tutorial shows how ecommerce websites can use their data to identify reshipping scams.



Introduction to Problem

If you have spent any time on the internet over the past 10 years, you may have come across a job ad for reshipping. Fraudsters use reshipping to launder the proceeds of stolen credit cards.

It works like this:

  • the criminals steal credit card information;

  • they buy goods on ecommerce websites;

  • the goods are sent to a third party;

  • the third party receives the goods and reships them to the criminals;

  • the criminals sell the goods and receive cash.

The third party, recruited via a job ad promising generous compensation, acts as a mule.

Money-laundering is the last stage in credit card fraud…and the last opportunity to act before it is too late. We are going to see how ecommerce websites can identify reshipping scams and save money.


Our data model for fraud detection

A typical ecommerce website can model its order data as follows:

A graph data model to detect reshipping scams.

There are a couple of things we can do with that data to identify fraud. A first step is to compare the billing and shipping addresses: a difference between the two might indicate a reshipping scam. We can also look at the IP address used to place the order. If the geolocation of the IP address matches neither the billing address nor the shipping address, the situation is highly suspicious.

We are going to see how to perform these security checks with a graph database.
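
To make the model concrete, here is a minimal sketch of how a single order could be stored. The `Transaction`, `Address` and `City` labels, the `IS_BILLING_ADDRESS` and `IS_SHIPPING_ADDRESS` relationship types, and the `date`, `items` and `amount` properties are the ones used by the queries in this tutorial. Everything else (the `IP` label, the `USED_IP` and `IS_LOCATED_IN` relationship types, the other property names and all the values) is an assumption made for illustration and may differ from the actual dataset.

```cypher
// Hypothetical order: billing address, shipping address and IP address
// are each located in a different city.
// Assumed names (not confirmed by the dataset): IP, USED_IP, IS_LOCATED_IN,
// the street, address and name properties, and every value below.
CREATE (cityA:City {name: 'City A'}),
       (cityB:City {name: 'City B'}),
       (cityC:City {name: 'City C'}),
       (billing:Address {street: '1 Billing Street'}),
       (shipping:Address {street: '2 Shipping Street'}),
       (ip:IP {address: '203.0.113.7'}),
       (tx:Transaction {date: '2014-01-01', items: 'laptop', amount: 1000}),
       (billing)-[:IS_BILLING_ADDRESS]->(tx),
       (shipping)-[:IS_SHIPPING_ADDRESS]->(tx),
       (billing)-[:IS_LOCATED_IN]->(cityA),
       (shipping)-[:IS_LOCATED_IN]->(cityB),
       (tx)-[:USED_IP]->(ip),
       (ip)-[:IS_LOCATED_IN]->(cityC)
```

With this layout, each check described above becomes a pattern to match: two different `Address` nodes attached to the same `Transaction`, or an IP address whose city differs from the cities of both addresses.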


Sample Data Set

I have prepared a small dataset with a few (fake) ecommerce orders. It includes regular transactions and fraudulent transactions.


You can download the complete dataset here.

See the list of transactions

Let’s start by looking at the transactions recorded on our website.

MATCH (orders:Transaction)
RETURN DISTINCT orders.date as date, orders.items as items, orders.amount as amount
ORDER BY amount DESC

See the transactions where the billing and shipping addresses are different

If the shipping address and the billing address are different, we may be looking at a reshipping scam. We want to identify these transactions for analysis.

MATCH (address1:Address)-[:IS_SHIPPING_ADDRESS]->(suspiciousorder:Transaction)<-[:IS_BILLING_ADDRESS]-(address2:Address)
WHERE address1 <> address2
RETURN DISTINCT suspiciousorder
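
When reviewing flagged orders, it can help to see both addresses next to each transaction. The variant below is a sketch of that idea. It relies only on labels, relationship types and properties that already appear in this tutorial; the variable names and column aliases are arbitrary.

```cypher
// Flag orders whose shipping and billing addresses are different nodes,
// and return both addresses so a reviewer can compare them.
MATCH (shipping:Address)-[:IS_SHIPPING_ADDRESS]->(tx:Transaction)<-[:IS_BILLING_ADDRESS]-(billing:Address)
WHERE shipping <> billing
RETURN tx.date AS date, tx.amount AS amount, billing, shipping
ORDER BY amount DESC
```

Sorting by amount puts the most expensive orders first, which is one possible way to prioritize a manual review.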

Are there any suspicious IPs?

Even more suspicious are the transactions where the IP address comes from a location different from both the billing and shipping addresses. Here is how to identify this pattern:

MATCH (a:Transaction)-[r*2..3]-(b:City)
WITH a, COUNT(DISTINCT b) AS group_size, COLLECT(DISTINCT b) AS cities
WHERE group_size > 2
RETURN a, cities
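
The query above walks any path of two or three hops between a transaction and a city, then keeps the transactions linked to more than two distinct cities. If the schema is known, the same check can be spelled out explicitly. The sketch below assumes a hypothetical `IP` label and hypothetical `USED_IP` and `IS_LOCATED_IN` relationship types, as well as a `name` property on `City`; none of these names are confirmed by the dataset, so adapt them before running it.

```cypher
// Assumed schema: IP, USED_IP, IS_LOCATED_IN and City.name are hypothetical.
// Separate MATCH clauses are used so that an order whose billing and
// shipping address are the same node is still considered.
MATCH (billing:Address)-[:IS_BILLING_ADDRESS]->(tx:Transaction)<-[:IS_SHIPPING_ADDRESS]-(shipping:Address)
MATCH (billing)-[:IS_LOCATED_IN]->(billingCity:City)
MATCH (shipping)-[:IS_LOCATED_IN]->(shippingCity:City)
MATCH (tx)-[:USED_IP]->(:IP)-[:IS_LOCATED_IN]->(ipCity:City)
// Keep only orders whose IP city matches neither address city.
WHERE ipCity <> billingCity AND ipCity <> shippingCity
RETURN tx,
       billingCity.name AS billing_city,
       shippingCity.name AS shipping_city,
       ipCity.name AS ip_city
```

Unlike the count-based query, this version also flags an order whose billing and shipping addresses are in the same city while the IP address is located elsewhere.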

Conclusion

Of course, the data we have used here is fake. Furthermore, fraudsters could use more advanced techniques (a simple proxy, for example) to avoid detection. Nevertheless, the approach of identifying fraudulent patterns and searching for them in the data can be used successfully to fight reshipping and ecommerce fraud.

For more graph-related use cases, make sure to check the blog of Linkurious.