Transaction Fraud Ring - Relationship Version
1. Modelling
This section will show examples of cypher queries on an example graph. The intention is to illustrate what the queries look like and provide a guide on how to structure your data in a real setting. We will do this on a small transaction network graph of several nodes connected in a ring structure.
The example graph will be based on the data model below:
1.1. Data Model
1.1.1 Required Fields
Below are the fields required to get started:
Account
Node:
-
accountNumber
: Contains the account name of an account. This could be changed for any other identifier you use for anAccount
.
TRANSACTION
Relationship:
-
amount
: Contains the amount of money transferred between accounts. -
currency
: Contains the currency of the transaction. -
date
: Contains the date the transaction occurred.
1.2. Demo Data
The following Cypher statement will create the example graph in the Neo4j database:
// Create all accounts
CREATE (a1:Account {accountNumber: 1})
CREATE (a2:Account {accountNumber: 2})
CREATE (a3:Account {accountNumber: 3})
CREATE (a4:Account {accountNumber: 4})
// Create relationships between accounts
CREATE (a1)-[:TRANSACTION {amount: 1000, currency: "gbp", date: datetime()-duration({days: 3})}]->(a2)
CREATE (a2)-[:TRANSACTION {amount: 900, currency: "gbp", date: datetime()-duration({days: 2})}]->(a3)
CREATE (a3)-[:TRANSACTION {amount: 810, currency: "gbp", date: datetime()-duration({days: 1})}]->(a4)
CREATE (a4)-[:TRANSACTION {amount: 729, currency: "gbp", date: datetime()}]->(a1)
1.3. Neo4j Schema
// Show neo4j scheme
CALL db.schema.visualization()
It will provide the following response:
After ingesting the data, Neo4j interprets the graph schema. In the schema, we can observe an Account
node connected to another Account
node via a TRANSACTION
relationship. Despite having the same label, the Account
nodes possess different properties, such as accountNumber. However, the schema only considers the labels on nodes and relationships.
2. Cypher Queries
2.1. Simple transaction ring
In this query, we will identify a ring with the following requirements:
-
Account
nodes should be connected via aTRANSACTION
relationship. -
Ensure the direction of the transaction is followed (not a bidirectional query).
-
Find a ring greater than three transactions long but less than 7.
// Identify simple transaction ring
MATCH path=(a:Account)-[:TRANSACTION*3..6]->(a)
RETURN path
2.2. Transaction ring with no duplicate accounts
In this query, we will identify a ring with the following requirements:
-
Account
nodes should be connected via aTRANSACTION
relationship. -
Ensure the direction of the transaction is followed (not a bidirectional query).
-
Find a ring greater than three transactions long but less than 7.
-
Ensure that the ring is made up of unique accounts.
// Identify transaction ring with no duplicate accounts
MATCH path=(a:Account)-[:TRANSACTION*3..6]->(a)
// Here we ensure that one path has unique people involved in the chain
WHERE size(apoc.coll.toSet(nodes(path))) = size(nodes(path)) - 1
// Return all paths
RETURN path
2.3. Transaction ring with chronological transactions
In this query, we will identify a ring with the following requirements:
-
Account
nodes should be connected via aTRANSACTION
relationship. -
Ensure the direction of the transaction is followed (not a bidirectional query).
-
Find a ring greater than three transactions long but less than 7.
-
Ensure that the ring is made up of unique accounts
-
Make sure that the
TRANSACTION
relationships are in chronological order
// Identify transaction ring where dates are in chronological order
MATCH path=(a:Account)-[rel:TRANSACTION*3..6]->(a)
// Here we ensure that one path has unique people involved in the chain
WHERE size(apoc.coll.toSet(nodes(path))) = size(nodes(path)) - 1
// Relationship validation
AND ALL(idx in range(0, size(rel)-2)
// Ensures the dates are in chronological order
WHERE (rel[idx]).date < (rel[idx+1]).date
)
// Return all paths
RETURN path
2.4. Transaction ring with 20% amount deduction
When money is passed through a fraud ring, the amount that moves between accounts is often reduced by a fee of up to 20%. To account for this, our query will allow for a reduction of up to 20% at each transaction.
In this query, we will identify a ring with the following requirements:
-
Account
nodes should be connected via aTRANSACTION
relationship. -
Ensure the direction of the transaction is followed (not a bidirectional query).
-
Find a ring greater than three transactions long but less than 7.
-
Ensure that the ring is made up of unique accounts
-
Make sure that the
TRANSACTION
relationships are in chronological order -
Check that the
TRANSACTION
amount is within 20% of the previous TRANSACTION.
// Identify transaction ring where amounts are within 20% of each other
MATCH path=(a:Account)-[rel:TRANSACTION*3..6]->(a)
// Here we ensure that one path has unique people involved in the chain
WHERE size(apoc.coll.toSet(nodes(path))) = size(nodes(path)) - 1
// Relationship validation
AND ALL(idx in range(0, size(rel)-2)
// Ensures the dates are in chronological order
WHERE (rel[idx]).date < (rel[idx+1]).date
// Checks that there is less than a 20% difference from the last `TRANSACTION` amount to the next
AND (rel[idx+1].amount / rel[idx].amount) * 100 <= 20
)
// Return all paths
RETURN path
2.4.1. What is the query doing?
The given Cypher query is designed to identify suspicious transaction rings in a graph database where accounts are connected by transactions. The query looks for cycles of transactions that fit certain criteria and then returns those cycles. Let’s break down the query step-by-step.
1 - Finding Cyclic Paths
MATCH path=(a:Account)-[rel:TRANSACTION*3..6]→(a)
This line initiates the match clause and looks for paths where an account (a:Account)
is connected to itself through 3 to 6 TRANSACTION relationships (rel:TRANSACTION*3..6)
. These paths form cycles, representing a "ring" of transactions.
2 - Ensuring Unique Accounts
WHERE size(apoc.coll.toSet(nodes(path))) = size(nodes(path)) - 1
This function converts the list of nodes in the path to a set, effectively removing any duplicates.
size(nodes(path)) - 1
This calculates the size of the list of nodes in the path, subtracting 1 to account for the start and end node being the same in a cycle.
The WHERE clause ensures that all accounts in the cycle are unique.
3 - Relationship Validation
AND ALL(idx in range(0, size(rel)-2) WHERE (rel[idx]).date < (rel[idx+1]).date AND (rel[idx+1].amount / rel[idx].amount) * 100 ⇐ 20 )
ALL(idx in range(0, size(rel)-2))
This iterates through each relationship in the path using an index from 0 to size(rel) - 2
.
(rel[idx]).date < (rel[idx+1]).date
Checks that the dates of the transactions are in chronological order.
(rel[idx+1].amount / rel[idx].amount) * 100 ⇐ 20
Checks that the amount of each subsequent transaction is within 20% of the previous transaction’s amount.
4 - Returning the Paths
RETURN path
This line returns the paths that satisfy all the above conditions.
Summary The query identifies transaction rings consisting of 3 to 6 transactions between unique accounts. It further validates the rings by ensuring that the transaction amounts vary by no more than 20% and that the transactions are in chronological order.