Deposit Analysis

1. Introduction

The retail banking sector faces significant challenges in managing deposits due to shifting interest rates, evolving customer behaviours, and increasing technological advancements. Deposit growth is expected to remain sluggish through 2025, whilst the cost of deposits has increased, putting pressure on banks' net interest income. There’s a competitive war for deposits, driven by the need for liquidity and depositors' reluctance to accept lower rates. Furthermore, macroeconomic headwinds and political uncertainty impact consumer behaviour and corporate borrowing. Banks are focusing on new customer acquisition and retention, personalised pricing strategies, and superior digital experiences to navigate these challenges.

2. Scenario

Graph databases excel at analysing the complexities of retail banking deposits. By leveraging the relationships between customers, accounts, and transactions, they enable banks to model and visualise customer behaviour. This empowers banks to optimise pricing strategies, identify potential risks, and personalise offers based on individual needs and market trends.

Compared to traditional systems, graph databases provide real-time analysis, faster response times, and superior data modelling capabilities. These advantages translate to improved customer experiences, enhanced risk management, and increased profitability for retail banks.

3. Solution

Graph databases offer a unique approach to analysing banking deposits, going beyond the limitations of traditional relational databases. Graph databases model complex relationships and patterns between customers, accounts, and transactions, which is critical for understanding deposit behaviours. This is very similar to how graph databases are used in fraud detection, as described in the Claims Fraud document. Using graph theory, graph databases can accurately represent entities and their connections, thus helping banks better manage deposits by providing a clear view of the relationships between data points. They allow for more sophisticated analysis to be done on connections within a dataset, going beyond the simple data within the dataset itself. This approach facilitates real-time analysis, allowing for quicker responses to changes in the market or customer activity, something traditional systems struggle with.

3.1. How Graph Databases Can Help?

  • Link Analysis: Just as Neo4j can explore connections to uncover complex patterns, it can explore the connections between customer demographics, account activities, and transaction histories to uncover complex deposit patterns. This helps banks visualise how different factors interact and influence deposit trends.

  • Pattern Detection: Similar to how graph databases excel at identifying fraud patterns without a starting point, they can analyse deposit data to reveal hidden relationships and identify emerging trends. This can include identifying groups of customers with similar deposit behaviours or accounts related to unusual fund flows.

  • Real-time Analysis: Graph databases can monitor and analyse deposit activities in real-time. This allows banks to quickly identify unusual changes in deposit behaviour, analyse the impact of dynamic pricing strategies, and respond swiftly to market changes.

Graph databases offer distinct advantages over traditional systems, particularly when dealing with the complex and interconnected nature of deposit data. By adopting this technology, banks can enhance their understanding of customer behaviour and improve their deposit strategies. This approach mirrors the effective use of graph databases in fraud detection.

4. Modelling

This section will show examples of cypher queries on an example graph. The intention is to illustrate what the queries look like and provide a guide on how to structure your data in a real setting. We will do this on a small graph of several nodes. The example graph will be based on the data model below:

4.1. Data Model

fs deposit analysis date model

4.1.1 Required Fields

Minimal fields required for deposit analysis:

Customer Node:

  • customerId: Unique identifier for the customer

Account Node:

  • accountId: Unique identifier for the account

  • accountNumber: Account number

Transaction Node:

  • transactionId: Unique identifier for the transaction

  • amount: The amount of the transaction

  • timestamp: The timestamp of the transaction

  • type: The transaction type (e.g., "CASH_DEPOSIT")

Relationships:

  • HAS_ACCOUNT: Connects a Customer node to an Account node

  • PERFORMS: Connects an Account node to a Transaction node (outgoing)

  • BENEFITS_TO: Connects a Transaction node to an Account node (incoming)

4.2. Demo Data

// Create customers
CREATE (c1:Customer {customerId: "CUST_001"})
CREATE (c2:Customer {customerId: "CUST_002"})
CREATE (c3:Customer {customerId: "CUST_003"})

// Create accounts
CREATE (a1:Account {accountId: "ACC_001", accountNumber: "1000001"})
CREATE (a2:Account {accountId: "ACC_002", accountNumber: "1000002"})
CREATE (a3:Account {accountId: "ACC_003", accountNumber: "1000003"})

// Create relationships - some accounts are shared between customers
CREATE (c1)-[:HAS_ACCOUNT]->(a1)
CREATE (c1)-[:HAS_ACCOUNT]->(a2)
CREATE (c2)-[:HAS_ACCOUNT]->(a2)
CREATE (c2)-[:HAS_ACCOUNT]->(a3)
CREATE (c3)-[:HAS_ACCOUNT]->(a3)

// Create transactions (cash deposits)
CREATE (t1:Transaction {transactionId: "TXN_001", amount: 3000.0, timestamp: datetime()-duration('P2M'), type: "CASH_DEPOSIT"})
CREATE (t2:Transaction {transactionId: "TXN_002", amount: 5000.0, timestamp: datetime()-duration('P1M'), type: "CASH_DEPOSIT"})
CREATE (t3:Transaction {transactionId: "TXN_003", amount: 1000.0, timestamp: datetime(), type: "CASH_DEPOSIT"})
CREATE (t4:Transaction {transactionId: "TXN_004", amount: 4000.0, timestamp: datetime(), type: "CASH_DEPOSIT"})
CREATE (t5:Transaction {transactionId: "TXN_005", amount: 2000.0, timestamp: datetime(), type: "CASH_DEPOSIT"})

// Create transaction relationships
CREATE (a1)-[:PERFORMS]->(t1)-[:BENEFITS_TO]->(a1)
CREATE (a1)-[:PERFORMS]->(t2)-[:BENEFITS_TO]->(a1)
CREATE (a1)-[:PERFORMS]->(t3)-[:BENEFITS_TO]->(a1)
CREATE (a2)-[:PERFORMS]->(t4)-[:BENEFITS_TO]->(a2)
CREATE (a3)-[:PERFORMS]->(t5)-[:BENEFITS_TO]->(a3)

4.3. Neo4j Scheme

If you call:

// Show neo4j scheme
CALL db.schema.visualization()

You will see the following response showing Customer, Account, and Transaction nodes with HAS_ACCOUNT, PERFORMS, and BENEFITS_TO relationships:

fs deposit analysis schema

5. Cypher Queries

These queries demonstrate how graph databases support key deposit management objectives: identifying consistent depositors for retention, analysing customer relationships for household banking opportunities, and understanding deposit flow patterns for pricing strategies.

5.1. Recent Deposit Activity Analysis

Identify all recent deposit activity to understand current deposit flows and customer engagement patterns. This supports liquidity planning and helps identify active customers who may be responsive to competitive rate offers.

// Find all deposits in the last month with customer and account context
MATCH path = (c:Customer)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORMS]->(t:Transaction)-[:BENEFITS_TO]->(a)
WHERE t.timestamp > datetime()-duration('P1M')
AND t.type = "CASH_DEPOSIT"

RETURN path

This query visualizes the deposit flow patterns, showing how customers connect to accounts and their recent transaction activities. For analytical results, use:

// Same query returning tabular data for analysis
MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORMS]->(t:Transaction)-[:BENEFITS_TO]->(a)
WHERE t.timestamp > datetime()-duration('P1M')
AND t.type = "CASH_DEPOSIT"

RETURN
    c.customerId,
    a.accountNumber,
    t.amount,
    t.timestamp
ORDER BY t.timestamp DESC

Business Value: Provides real-time view of deposit activity for liquidity management and identifies recently active customers for targeted retention campaigns.

5.2. Consistent High-Value Depositor Identification

Identify customers who demonstrate consistent deposit behaviour - these are high-value customers for retention and premium pricing strategies. The query finds customers with sustained deposit activity and consistent deposit levels.

Business Objective:

These customers are prime candidates for:

  • Premium deposit rates and personalised pricing

  • Retention campaigns during competitive periods

  • Relationship banking and additional product offerings

  • Priority customer service and dedicated relationship management

// Find consistent high-value depositors for retention and premium pricing
MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORMS]->(t:Transaction)-[:BENEFITS_TO]->(a)
WHERE t.type = "CASH_DEPOSIT"
AND t.timestamp > datetime()-duration('P3M')

WITH c, a, t,
  CASE
    WHEN t.timestamp > datetime()-duration('P1M') THEN 'current'
    ELSE 'prior_two_months'
  END AS period

WITH c, period, SUM(t.amount) AS total_deposits, COUNT(t) AS transaction_count
WHERE total_deposits > 0

WITH c,
  COALESCE(SUM(CASE WHEN period = 'current' THEN total_deposits END), 0) AS current_month,
  COALESCE(SUM(CASE WHEN period = 'prior_two_months' THEN total_deposits END), 0) AS prior_two_months,
  COALESCE(SUM(CASE WHEN period = 'current' THEN transaction_count END), 0) AS current_txn_count,
  COALESCE(SUM(CASE WHEN period = 'prior_two_months' THEN transaction_count END), 0) AS prior_txn_count

// Filter for consistent depositors (maintained at least 50% of deposit level)
WHERE current_month > 0
AND prior_two_months > 0
AND (current_month / prior_two_months) >= 0.5

// Return customer analysis with retention priority scoring
MATCH (c)-[:HAS_ACCOUNT]->(a:Account)
RETURN
    c.customerId,
    current_month,
    prior_two_months,
    current_month + prior_two_months AS total_deposits_3m,
    current_txn_count + prior_txn_count AS total_transactions,
    (current_month + prior_two_months) / (current_txn_count + prior_txn_count) AS avg_deposit_size,
    current_month / prior_two_months AS consistency_ratio,
    count(DISTINCT a) AS account_count,
    // Priority score for retention campaigns
    (current_month + prior_two_months) * (current_month / prior_two_months) AS retention_priority_score
ORDER BY retention_priority_score DESC

Business Value: Identifies high-value, consistent depositors who should receive premium pricing, dedicated relationship management, and priority retention efforts during competitive periods.

5.3. Household and Relationship Banking Analysis

Identify customers who share accounts to understand household relationships and joint financial management. This enables household-level deposit strategies, family banking packages, and relationship-based pricing.

Business Objective:

Understanding account sharing patterns helps with:

  • Household Banking Strategies: Offer family account packages and joint pricing

  • Relationship Retention: Losing one family member could impact multiple accounts

  • Cross-selling Opportunities: Identify family units for comprehensive financial services

  • Risk Assessment: Understand deposit concentration within family units

// Visualize household banking relationships and shared account networks
MATCH path = (c1:Customer)-[:HAS_ACCOUNT]->(a:Account)<-[:HAS_ACCOUNT]-(c2:Customer)
WHERE c1.customerId < c2.customerId  // Avoid duplicate pairs

// Include transaction paths for shared accounts
OPTIONAL MATCH txPath = (a)-[:PERFORMS]->(t:Transaction)-[:BENEFITS_TO]->(a)
WHERE t.timestamp > datetime()-duration('P3M')
AND t.type = "CASH_DEPOSIT"

// Return all paths showing household connections
WITH path, collect(txPath) as transactionPaths
LIMIT 25

RETURN path, transactionPaths

This visualization reveals household banking networks by showing customers connected through shared accounts and their transaction flows. For household-level analytics:

// Analyze household/relationship banking patterns through shared accounts
MATCH (c1:Customer)-[:HAS_ACCOUNT]->(a:Account)<-[:HAS_ACCOUNT]-(c2:Customer)
WHERE c1.customerId < c2.customerId  // Avoid duplicate pairs
WITH c1, c2, a

// Get deposit activity for shared accounts over last 3 months
OPTIONAL MATCH (a)-[:PERFORMS]->(t:Transaction)-[:BENEFITS_TO]->(a)
WHERE t.timestamp > datetime()-duration('P3M')
AND t.type = "CASH_DEPOSIT"

// Calculate household deposit metrics
WITH c1, c2, a,
     count(t) as depositCount,
     coalesce(sum(t.amount), 0) as totalDeposits,
     coalesce(avg(t.amount), 0) as avgDepositSize

// Aggregate at household level
WITH c1, c2,
     collect(a.accountNumber) as sharedAccounts,
     count(a) as numberOfSharedAccounts,
     sum(depositCount) as totalTransactions,
     sum(totalDeposits) as householdDepositValue,
     avg(avgDepositSize) as avgHouseholdDepositSize

// Calculate household banking potential
RETURN
    c1.customerId as primaryCustomer,
    c2.customerId as relationshipCustomer,
    numberOfSharedAccounts,
    householdDepositValue,
    totalTransactions,
    avgHouseholdDepositSize,
    // Household value score for relationship banking offers
    householdDepositValue * numberOfSharedAccounts as householdValue
ORDER BY householdValue DESC

Business Value: Enables household-level deposit strategies, relationship banking packages, and helps identify high-value family units for premium services and retention efforts.

6. Graph Data Science (GDS)

Graph Data Science algorithms help banks optimise deposit strategies by analysing customer behaviour patterns, identifying high-value customer segments, and understanding deposit flow networks. These insights support personalised pricing strategies, customer retention efforts, and competitive positioning in the deposit market.

6.1. Customer Segmentation for Personalised Pricing

Community detection helps identify customer segments with similar deposit behaviours, enabling banks to develop targeted pricing strategies and personalised offers. This directly supports the business goal of implementing personalised pricing strategies to compete for deposits.

Customer-Account Bipartite Projection

First, create a focused projection that captures customer-account relationships:

// Create customer-account bipartite graph for segmentation analysis
CALL gds.graph.project(
    'customerAccountNetwork',
    ['Customer', 'Account'],
    'HAS_ACCOUNT'
)

Community Detection for Customer Segments

// Run Louvain algorithm to identify customer segments
CALL gds.louvain.stream('customerAccountNetwork')
YIELD nodeId, communityId
WITH gds.util.asNode(nodeId) AS node, communityId
WHERE 'Customer' IN labels(node)

// Analyze segment characteristics
MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORMS]->(t:Transaction)
WHERE c.customerId = node.customerId
AND t.type = 'CASH_DEPOSIT'
AND t.timestamp > datetime()-duration('P3M')

WITH communityId, c,
     count(t) as depositFrequency,
     sum(t.amount) as totalDeposits,
     avg(t.amount) as avgDepositSize

RETURN
    communityId,
    count(c) as customersInSegment,
    avg(depositFrequency) as avgDepositFrequency,
    avg(totalDeposits) as avgTotalDeposits,
    avg(avgDepositSize) as avgDepositSize
ORDER BY avgTotalDeposits DESC

Business Value: - Identifies customer segments with similar deposit patterns - Enables targeted pricing strategies for different segments - Supports personalised retention offers based on segment behaviour - Helps optimise marketing campaigns for specific customer groups

6.2. High-Value Customer Identification

Centrality analysis identifies customers who are most valuable to the bank’s deposit strategy, helping prioritize retention efforts and competitive responses.

Customer-to-Customer Network via Shared Accounts

// Create customer network based on shared accounts and deposit volumes
CALL gds.graph.project.cypher(
    'customerDepositNetwork',
    'MATCH (c:Customer) RETURN id(c) AS id, labels(c) AS labels',
    'MATCH (c1:Customer)-[:HAS_ACCOUNT]->(a:Account)<-[:HAS_ACCOUNT]-(c2:Customer)
     WITH c1, c2, a
     MATCH (a)-[:PERFORMS]->(t:Transaction)
     WHERE t.type = "CASH_DEPOSIT" AND t.timestamp > datetime()-duration("P3M")
     RETURN id(c1) AS source, id(c2) AS target, sum(t.amount) AS weight'
)

PageRank for Customer Importance

// Identify high-value customers based on network importance
CALL gds.pageRank.stream('customerDepositNetwork', {
    relationshipWeightProperty: 'weight'
})
YIELD nodeId, score

// Get customer details and deposit behaviour
WITH gds.util.asNode(nodeId) AS customer, score
MATCH (customer)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORMS]->(t:Transaction)
WHERE t.type = 'CASH_DEPOSIT'
AND t.timestamp > datetime()-duration('P3M')

WITH customer, score,
     count(DISTINCT a) as accountCount,
     sum(t.amount) as totalDeposits,
     count(t) as transactionCount

RETURN
    customer.customerId,
    score as networkImportance,
    accountCount,
    totalDeposits,
    transactionCount,
    totalDeposits/transactionCount as avgTransactionSize
ORDER BY score DESC
LIMIT 10

Business Value: - Identifies customers with highest network value for retention priority - Supports competitive response strategies for high-value customers - Enables proactive customer management and personalised service - Helps optimise resource allocation for customer acquisition and retention