Account Takeover Fraud

1. Introduction

Account Takeover (ATO) fraud is a form of identity theft in which cybercriminals gain unauthorised access to legitimate user accounts. This growing threat affects many account types, from financial services to social media platforms, and has a significant impact on both individuals and organisations. According to recent studies, 22% of U.S. adults have fallen victim to ATO fraud, with average individual losses reaching $12,000. The fraud typically begins with credential theft through phishing, data breaches, or social engineering, followed by exploitation of the account for unauthorised transactions or further fraudulent activity. As digital services expand, robust detection and prevention strategies become increasingly important for protecting against this evolving threat.

2. Scenarios

Types of Account Takeover

  • Financial account fraud: Unauthorised transfers, fraudulent purchases, and credit card applications

  • Email account compromise: Access to personal information, password resets, and further account takeovers

  • Social media hijacking: Identity impersonation, scam distribution, and social engineering

Extent of the Problem

  • Widespread impact: 22% of U.S. adults have been victims of account takeover

  • Financial losses: Average individual losses amount to $12,000 per incident

  • Business impact: Significant reputational damage and potential legal consequences

  • Rising sophistication: Increasing use of automated tools and AI for large-scale attacks

Challenges

  • Complex attack vectors: Multiple entry points through phishing, malware, and social engineering

  • Credential stuffing: Automated attacks using stolen username/password combinations

  • Device spoofing: Fraudsters using advanced techniques to bypass device fingerprinting

  • Authentication bypass: Sophisticated methods to circumvent multi-factor authentication

3. Solution

Graph databases provide a powerful approach to detecting and preventing Account Takeover Fraud. By modelling the complex web of user behaviours, device interactions, and account activities as a connected network, graph technology can identify suspicious patterns that traditional systems might miss. This approach is particularly effective for ATO fraud, where multiple data points and relationships must be analysed simultaneously.

3.1. How Graph Databases Can Help

  1. Device Fingerprinting: Neo4j can track relationships between user accounts, devices, and IP addresses to identify suspicious login patterns and potential credential-stuffing attacks.

  2. Behavioural Analysis: Graph databases excel at modelling normal user behaviour patterns and detecting anomalies, such as:

    • Unusual login times or locations

    • Suspicious changes in transaction patterns

    • Unexpected account setting modifications

    • Abnormal navigation patterns within applications

  3. Identity Verification Networks: Create comprehensive identity graphs that connect:

    • User accounts and associated email addresses

    • Phone numbers and authentication methods

    • Device fingerprints and login locations

    • Transaction patterns and beneficiary relationships

  4. Real-time Detection: Neo4j enables:

    • Instant validation of login attempts against known patterns

    • Real-time analysis of transaction sequences

    • Immediate identification of suspicious IP addresses or devices

    • Dynamic risk scoring based on graph patterns

  5. Network Analysis: Uncover sophisticated fraud rings by:

    • Identifying shared attributes between compromised accounts

    • Detecting clusters of suspicious activity

    • Tracing the spread of credential stuffing attacks

    • Mapping relationships between known fraudulent entities
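
One way to read "dynamic risk scoring based on graph patterns" is as a weighted combination of counts returned by graph queries. The following Python sketch illustrates the idea only. The signal names, weights, and saturation caps are hypothetical and do not come from Neo4j or from this guide; a real deployment would tune them against labelled fraud cases.

```python
# Hypothetical weights for three graph-derived signals (illustrative only).
WEIGHTS = {
    "shared_device_accounts": 0.4,  # accounts reached from one device
    "failed_logins": 0.3,           # failed sessions for the customer
    "distinct_failed_ips": 0.3,     # distinct IPs behind those failures
}

# Hypothetical caps: the count at which a signal contributes its full weight.
CAPS = {
    "shared_device_accounts": 3,
    "failed_logins": 5,
    "distinct_failed_ips": 3,
}


def risk_score(signals: dict) -> float:
    """Combine graph-derived counts into a score between 0 and 1."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        count = max(0, signals.get(name, 0))
        # Each signal saturates at its cap so one extreme value cannot dominate.
        score += weight * min(count, CAPS[name]) / CAPS[name]
    return round(score, 3)


if __name__ == "__main__":
    # Counts for CUS002 in the demo data: three failed sessions from three IPs.
    print(risk_score({"failed_logins": 3, "distinct_failed_ips": 3}))
```

The counts fed into such a function would come from queries like those in section 5, recomputed as new sessions arrive.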

4. Modelling

This section provides example Cypher statements that show how to structure your data for detecting Account Takeover Fraud. The example graph includes nodes for customers, accounts, devices, sessions, IP addresses, ISPs, and locations, with relationships showing how these entities interact during normal and suspicious account access patterns.

4.1. Data Model

4.1.1. Required Fields

Customer Node:

  • customerId: Unique identifier for the customer

Device Node:

  • deviceId: Unique identifier for the device

  • deviceType: Type of device (mobile, desktop, tablet)

  • userAgent: Browser/app user agent string

  • createdAt: Timestamp when device was first recorded

IP Node:

  • ipAddress: IP address

  • createdAt: Timestamp when IP was first observed

ISP Node:

  • name: Internet Service Provider name

  • createdAt: Timestamp when ISP was first recorded

Location Node:

  • city: City name

  • postCode: Postal code (optional)

  • country: Country code

  • latitude: Geographic latitude (optional)

  • longitude: Geographic longitude (optional)

  • createdAt: Timestamp when location was first recorded

Session Node:

  • sessionId: Unique session identifier

  • status: Session status (success, failed, suspicious)

  • createdAt: Timestamp when session was initiated

Account Node:

  • accountNumber: Unique account number

Relationships:

  • USED_BY: Device used by customer

  • HAS_SESSION: Customer has session

  • USES_IP: Session uses IP address

  • SESSION_USES_DEVICE: Session uses device

  • HAS_ACCOUNT: Customer has account

  • IS_ALLOCATED_TO: IP is allocated to ISP

  • LOCATED_IN: IP is located in location

4.2. Demo Data

The following Cypher statement will create an example graph demonstrating typical account access patterns:

//
// Create Customer nodes (minimal info for fraud detection)
//
CREATE (c1:Customer {customerId: "CUS001"})
CREATE (c2:Customer {customerId: "CUS002"})
CREATE (c3:Customer {customerId: "CUS003"})

//
// Create Device nodes
//
CREATE (d1:Device {deviceId: "DEV001", deviceType: "desktop", userAgent: "Mozilla/5.0 Chrome/91.0", createdAt: datetime("2024-03-01T09:00:00")})
CREATE (d2:Device {deviceId: "DEV002", deviceType: "mobile", userAgent: "Mozilla/5.0 Mobile Safari/537.36", createdAt: datetime("2024-03-01T09:30:00")})
CREATE (d3:Device {deviceId: "SUSPICIOUS001", deviceType: "desktop", userAgent: "Mozilla/5.0 Firefox/89.0", createdAt: datetime("2024-03-01T10:00:00")})

//
// Create IP nodes
//
CREATE (ip1:IP {ipAddress: "192.168.1.1", createdAt: datetime("2024-03-01T09:00:00")})
CREATE (ip2:IP {ipAddress: "10.0.0.1", createdAt: datetime("2024-03-01T10:00:00")})
CREATE (ip3:IP {ipAddress: "203.0.113.1", createdAt: datetime("2024-03-01T10:05:00")})
CREATE (ip4:IP {ipAddress: "198.51.100.1", createdAt: datetime("2024-03-01T11:00:00")})
CREATE (ip5:IP {ipAddress: "172.16.0.1", createdAt: datetime("2024-03-01T11:05:00")})

//
// Create ISP nodes
//
CREATE (isp1:ISP {name: "BT", createdAt: datetime("2024-01-01T00:00:00")})
CREATE (isp2:ISP {name: "Orange", createdAt: datetime("2024-01-01T00:00:00")})
CREATE (isp3:ISP {name: "Verizon", createdAt: datetime("2024-01-01T00:00:00")})
CREATE (isp4:ISP {name: "China Telecom", createdAt: datetime("2024-01-01T00:00:00")})

//
// Create Location nodes
//
CREATE (l1:Location {city: "London", country: "UK", latitude: 51.5074, longitude: -0.1278, createdAt: datetime("2024-01-01T00:00:00")})
CREATE (l2:Location {city: "Paris", country: "France", latitude: 48.8566, longitude: 2.3522, createdAt: datetime("2024-01-01T00:00:00")})
CREATE (l3:Location {city: "Beijing", country: "China", latitude: 39.9042, longitude: 116.4074, createdAt: datetime("2024-01-01T00:00:00")})
CREATE (l4:Location {city: "Lagos", country: "Nigeria", latitude: 6.5244, longitude: 3.3792, createdAt: datetime("2024-01-01T00:00:00")})
CREATE (l5:Location {city: "New York", country: "USA", latitude: 40.7128, longitude: -74.0060, createdAt: datetime("2024-01-01T00:00:00")})

//
// Create Session nodes (incorporating event data)
//
CREATE (s1:Session {sessionId: "SESS001", status: "success", createdAt: datetime("2024-03-01T10:00:00")})
CREATE (s2:Session {sessionId: "SESS002", status: "success", createdAt: datetime("2024-03-01T10:05:00")})
CREATE (s3:Session {sessionId: "SESS003", status: "failed", createdAt: datetime("2024-03-01T11:00:00")})
CREATE (s4:Session {sessionId: "SESS004", status: "failed", createdAt: datetime("2024-03-01T11:05:00")})
CREATE (s5:Session {sessionId: "SESS005", status: "failed", createdAt: datetime("2024-03-01T11:10:00")})


//
// Create Account nodes
//
CREATE (a1:Account {accountNumber: "ACC001"})
CREATE (a2:Account {accountNumber: "ACC002"})
CREATE (a3:Account {accountNumber: "ACC003"})

//
// Create Relationships
//

// Pattern 1: Single device logging into multiple accounts (credential stuffing)
CREATE (d3)-[:USED_BY {lastUsed: datetime("2024-03-01T10:00:00")}]->(c1)
CREATE (d3)-[:USED_BY {lastUsed: datetime("2024-03-01T10:02:00")}]->(c2)
CREATE (d3)-[:USED_BY {lastUsed: datetime("2024-03-01T10:04:00")}]->(c3)

// Pattern 2: Impossible travel - UK to China in 5 minutes
CREATE (s1)-[:USES_IP]->(ip1)
CREATE (ip1)-[:LOCATED_IN {createdAt: datetime("2024-03-01T10:00:00")}]->(l1)
CREATE (s2)-[:USES_IP]->(ip3)
CREATE (ip3)-[:LOCATED_IN {createdAt: datetime("2024-03-01T10:05:00")}]->(l3)

// Pattern 3: Multiple failed login attempts from different IPs
CREATE (s3)-[:USES_IP]->(ip2)
CREATE (ip2)-[:LOCATED_IN {createdAt: datetime("2024-03-01T11:00:00")}]->(l2)
CREATE (s4)-[:USES_IP]->(ip4)
CREATE (ip4)-[:LOCATED_IN {createdAt: datetime("2024-03-01T11:05:00")}]->(l4)
CREATE (s5)-[:USES_IP]->(ip5)
CREATE (ip5)-[:LOCATED_IN {createdAt: datetime("2024-03-01T11:10:00")}]->(l5)

// ISP relationships
CREATE (ip1)-[:IS_ALLOCATED_TO {createdAt: datetime("2024-01-01T00:00:00")}]->(isp1)
CREATE (ip2)-[:IS_ALLOCATED_TO {createdAt: datetime("2024-01-01T00:00:00")}]->(isp2)
CREATE (ip3)-[:IS_ALLOCATED_TO {createdAt: datetime("2024-01-01T00:00:00")}]->(isp4)
CREATE (ip4)-[:IS_ALLOCATED_TO {createdAt: datetime("2024-01-01T00:00:00")}]->(isp3)
CREATE (ip5)-[:IS_ALLOCATED_TO {createdAt: datetime("2024-01-01T00:00:00")}]->(isp3)

// Session device relationships
CREATE (s1)-[:SESSION_USES_DEVICE]->(d1)
CREATE (s2)-[:SESSION_USES_DEVICE]->(d3)
CREATE (s3)-[:SESSION_USES_DEVICE]->(d2)
CREATE (s4)-[:SESSION_USES_DEVICE]->(d2)
CREATE (s5)-[:SESSION_USES_DEVICE]->(d2)

// Customer account relationships
CREATE (c1)-[:HAS_ACCOUNT {role: "owner", since: datetime("2024-01-01T00:00:00")}]->(a1)
CREATE (c2)-[:HAS_ACCOUNT {role: "owner", since: datetime("2024-01-01T00:00:00")}]->(a2)
CREATE (c3)-[:HAS_ACCOUNT {role: "owner", since: datetime("2024-01-01T00:00:00")}]->(a3)

// Connect sessions to customers (successful and failed login attempts)
CREATE (c1)-[:HAS_SESSION]->(s1)
CREATE (c1)-[:HAS_SESSION]->(s2)
CREATE (c2)-[:HAS_SESSION]->(s3)
CREATE (c2)-[:HAS_SESSION]->(s4)
CREATE (c2)-[:HAS_SESSION]->(s5)

4.3. Neo4j Schema

If you call:

// Show neo4j schema
CALL db.schema.visualization()

You will see the following response:

[Figure: Account Takeover Fraud schema]

5. Cypher Queries

5.1. Single device logging into multiple different accounts

In this query, we will identify devices that have been used to access multiple different user accounts, which is a common pattern in credential stuffing attacks and account takeover attempts.

View Graph:

// Show the relationships between suspicious devices and multiple accounts
// Step 1: find devices linked to more than one distinct account
MATCH (d:Device)-[:USED_BY]->(:Customer)-[:HAS_ACCOUNT]->(a:Account)
WITH d, count(DISTINCT a) as accountCount
WHERE accountCount > 1
// Step 2: return the full paths for those devices
MATCH path=(d)-[:USED_BY]->(c:Customer)-[:HAS_ACCOUNT]->(a:Account)
RETURN path

View Statistics:

// Get detailed statistics about devices accessing multiple accounts
MATCH (d:Device)-[:USED_BY]->(c:Customer)-[:HAS_ACCOUNT]->(a:Account)
WITH d,
     count(DISTINCT a) as uniqueAccounts,
     collect(DISTINCT c.customerId) as compromisedCustomers
WHERE uniqueAccounts > 1
RETURN d.deviceId as DeviceID,
       d.deviceType as DeviceType,
       d.userAgent as UserAgent,
       uniqueAccounts as NumberOfAccounts,
       compromisedCustomers as CompromisedCustomerIDs
ORDER BY NumberOfAccounts DESC

What It Does:

  • First query visualises the network of suspicious devices and their connections to multiple accounts

  • Second query provides detailed statistics about each suspicious device, including:

    • Number of unique accounts accessed

    • Device type and user agent information

    • List of potentially compromised customer IDs

Risk Indicators:

  • Devices accessing more than 2 different accounts within 24 hours

  • Failed login attempts across multiple accounts

  • Suspicious user agent strings or device characteristics

  • Rapid succession of login attempts indicating automated attacks
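
The grouping logic behind the queries above can be sketched outside the database. This Python example is an illustration, not part of the Neo4j solution: it takes (deviceId, customerId) pairs mirroring the USED_BY relationships in the demo data and flags devices linked to at least min_customers distinct customers. The function name and the min_customers parameter are assumptions made for this sketch.

```python
from collections import defaultdict

# (deviceId, customerId) pairs mirroring the USED_BY relationships in the demo data.
USED_BY = [
    ("SUSPICIOUS001", "CUS001"),
    ("SUSPICIOUS001", "CUS002"),
    ("SUSPICIOUS001", "CUS003"),
]


def devices_shared_across_customers(edges, min_customers=2):
    """Return {deviceId: sorted customerIds} for devices seen with many customers."""
    customers_by_device = defaultdict(set)
    for device_id, customer_id in edges:
        # A set removes duplicates, like count(DISTINCT ...) in Cypher.
        customers_by_device[device_id].add(customer_id)
    return {
        device_id: sorted(customers)
        for device_id, customers in customers_by_device.items()
        if len(customers) >= min_customers
    }


if __name__ == "__main__":
    print(devices_shared_across_customers(USED_BY))
```

Counting distinct customers rather than raw relationships matters: a device that logs into the same customer repeatedly should not be flagged.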

5.2. Suspicious Session Patterns

In these queries, we analyse session patterns to identify potential account takeover attempts through unusual session behaviours, failed login attempts, and suspicious location changes within sessions.

View Failed Login Attempts:

// Show customers with three or more failed login attempts, in chronological order
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)
WHERE s.status = 'failed'
WITH c, s
ORDER BY s.createdAt
WITH c,
     collect({
         sessionId: s.sessionId,
         sessionTime: s.createdAt,
         status: s.status
     }) as attempts
WHERE size(attempts) >= 3
RETURN c.customerId as CustomerID,
       attempts,
       size(attempts) as FailedAttempts
ORDER BY FailedAttempts DESC

View Location Changes:

// Find candidates for impossible travel: customers whose sessions
// resolve to more than one distinct location, listed in time order
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)-[:USES_IP]->(ip:IP)-[:LOCATED_IN]->(l:Location)
WITH c, s, l
ORDER BY s.createdAt
WITH c,
     count(DISTINCT l) as distinctLocations,
     collect({
         location: l.city + ', ' + l.country,
         sessionTime: s.createdAt
     }) as locations
WHERE distinctLocations > 1
RETURN c.customerId as CustomerID,
       locations,
       distinctLocations as DistinctLocations
ORDER BY DistinctLocations DESC

View Session Timeline:

// Analyse session patterns over time
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)-[:SESSION_USES_DEVICE]->(d:Device)
WITH c, d, s
RETURN c.customerId as CustomerID,
       d.deviceId as DeviceID,
       d.deviceType as DeviceType,
       s.createdAt as SessionTime,
       s.status as SessionStatus
ORDER BY s.createdAt

What It Does:

  • First query identifies clusters of failed login attempts:

    • Groups failed attempts by customer

    • Shows the sequence and timing of failures

    • Helps identify brute force attacks

  • Second query detects suspicious location changes:

    • Lists each customer's session locations in chronological order

    • Surfaces candidates for physically impossible travel

    • Helps spot location spoofing or compromised accounts

  • Third query analyses session patterns:

    • Shows the complete timeline of session events

    • Shows which device was used for each session

    • Shows the status of each session, so failures can be read in context

Risk Indicators:

  • Multiple failed login attempts within a short time window

  • Rapid changes in login location

  • Unusual session duration or activity patterns

  • Multiple devices used within single session

  • Mismatched device types or user agents

  • Sessions outside normal user patterns
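
The location query lists where and when sessions occurred, but it does not itself decide whether the travel between them is physically possible. One common way to make that decision is to compute the great-circle distance between consecutive sessions and compare the implied speed with a threshold. The Python sketch below uses the haversine formula with the London and Beijing coordinates and timestamps from the demo data. The 900 km/h threshold and the function names are assumptions chosen for illustration.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 900.0  # assumed threshold, roughly airliner cruising speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(first, second, max_speed_kmh=MAX_SPEED_KMH):
    """Each argument is a (timestamp, latitude, longitude) tuple."""
    # Sort so the earlier session comes first regardless of argument order.
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted([first, second])
    hours = (t2 - t1).total_seconds() / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        # Simultaneous sessions are impossible unless they share a location.
        return distance > 0
    return distance / hours > max_speed_kmh


if __name__ == "__main__":
    sess001 = (datetime(2024, 3, 1, 10, 0), 51.5074, -0.1278)   # London
    sess002 = (datetime(2024, 3, 1, 10, 5), 39.9042, 116.4074)  # Beijing
    print(is_impossible_travel(sess001, sess002))
```

IP geolocation is approximate and VPNs can move a legitimate user across continents in seconds, so a check like this is better treated as one risk signal than as proof of compromise.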

5.3. Multiple Failed Login Attempts from Different IPs

In these queries, we analyse patterns of failed login attempts from different IP addresses targeting the same account, which is a common indicator of brute force attacks.

View Failed Login Pattern:

// Show accounts with multiple failed login attempts from different IPs
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)-[:USES_IP]->(ip:IP)
WHERE s.status = 'failed'
WITH c, count(DISTINCT ip) as uniqueIPs, collect(DISTINCT ip.ipAddress) as ipAddresses,
     count(s) as totalFailedAttempts
WHERE uniqueIPs >= 2
RETURN c.customerId as TargetCustomer,
       totalFailedAttempts as FailedAttempts,
       uniqueIPs as NumberOfUniqueIPs,
       ipAddresses as IPAddresses
ORDER BY totalFailedAttempts DESC

View Detailed Timeline:

// Show detailed timeline of failed attempts with location context
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)-[:USES_IP]->(ip:IP),
      (ip)-[:LOCATED_IN]->(l:Location),
      (ip)-[:IS_ALLOCATED_TO]->(isp:ISP)
WHERE s.status = 'failed'
WITH c, count(DISTINCT ip) as uniqueIPs
WHERE uniqueIPs >= 2
MATCH (c)-[:HAS_SESSION]->(s:Session)-[:USES_IP]->(ip:IP),
      (ip)-[:LOCATED_IN]->(l:Location),
      (ip)-[:IS_ALLOCATED_TO]->(isp:ISP)
WHERE s.status = 'failed'
RETURN c.customerId as TargetCustomer,
       s.createdAt as AttemptTime,
       ip.ipAddress as IPAddress,
       l.city + ', ' + l.country as Location,
       isp.name as ISP
ORDER BY c.customerId, s.createdAt

View Geographic Distribution:

// Show geographic distribution of failed attempts
MATCH (c:Customer)-[:HAS_SESSION]->(s:Session)-[:USES_IP]->(ip:IP)-[:LOCATED_IN]->(l:Location)
WHERE s.status = 'failed'
WITH c, l, count(s) as attemptsFromLocation
WITH c,
     count(DISTINCT l) as uniqueLocations,
     collect(DISTINCT {
         location: l.city + ', ' + l.country,
         attempts: attemptsFromLocation
     }) as locationBreakdown
WHERE uniqueLocations >= 2
RETURN c.customerId as TargetCustomer,
       uniqueLocations as NumberOfLocations,
       locationBreakdown as LocationBreakdown
ORDER BY uniqueLocations DESC

What It Does:

  • First query provides an overview of accounts under attack:

    • Counts total failed attempts per account

    • Shows number of unique IPs used

    • Lists all IP addresses involved

  • Second query shows the detailed timeline:

    • Chronological sequence of failed attempts

    • Geographic location of each attempt

    • ISP information for each IP

    • Helps identify attack patterns and timing

  • Third query analyses geographic distribution:

    • Shows number of unique locations

    • Provides breakdown of attempts per location

    • Helps identify geographically dispersed attacks

Risk Indicators:

  • Multiple failed attempts from different IPs within a short timeframe

  • Geographically impossible location changes between attempts

  • Failed attempts from known high-risk ISPs or locations

  • Systematic pattern in timing of attempts suggesting automation

  • Large number of unique IPs targeting single account
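
The queries in this section count failed attempts over all time, whereas the first risk indicator refers to a short timeframe. A sliding window is one way to add that constraint. The Python sketch below takes the failed sessions for CUS002 from the demo data and reports the largest number of distinct IPs seen in any window. The 15-minute default and the function name are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

# Failed sessions for CUS002 in the demo data: (createdAt, ipAddress).
FAILED = [
    (datetime(2024, 3, 1, 11, 0), "10.0.0.1"),
    (datetime(2024, 3, 1, 11, 5), "198.51.100.1"),
    (datetime(2024, 3, 1, 11, 10), "172.16.0.1"),
]


def max_distinct_ips_in_window(attempts, window=timedelta(minutes=15)):
    """Largest number of distinct IPs in any time window of the given length."""
    ordered = sorted(attempts)
    best = 0
    start = 0
    for end in range(len(ordered)):
        # Advance the left edge until the window spans at most `window`.
        while ordered[end][0] - ordered[start][0] > window:
            start += 1
        distinct_ips = {ip for _, ip in ordered[start:end + 1]}
        best = max(best, len(distinct_ips))
    return best


if __name__ == "__main__":
    print(max_distinct_ips_in_window(FAILED))
```

A customer could then be flagged when the returned value reaches a chosen threshold, for example two or more distinct IPs, matching the uniqueIPs >= 2 condition used in the Cypher queries.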