GraphGists

Introduction

A CDR contains the phone numbers originating the call and receiving the call, time of the call, call duration and many more attributes. CDR data can reach big data proportions for some Enterprise Businesses.

Neo4j has been used in telecom industry: churn reduction (https://neo4j.com/news/0227-reducing-telecom-churn/) and event log data( https://vimeo.com/79390660). These applications are at the telecom provider level.

churn2

Here I use Neo4j at the business (corporations) level that consume the services provided by the telecom provider.

business2

Solution

Incoming calls: Answered or went to voicemail.

Incoming calls: Forwarded, answered or went to voicemail.

Outgoing calls (Extension to Extension): Answered or went to voicemail.

Outgoing calls (External): Called

Direct to voicemail: Voicemail

This graph model stores call detail records (CDR) in nodes, relationships and properties. Call date is modelled as integer following the article: https://www.menome.com/wp/neo4j-modelling-dates/

Data Model

DataModel

Metadata

Metadata

Setup

Here is a sample of data in the .csv file:

TME: Epoch time, CLNG: Calling number, CLED: Called number, FCLED: Call forwarded number (80000 is voicemail), TYP: Call type, ANS: Answered or Voicemail, FWD: Forwarded number, CPNY:Company name

TME,CLNG,CLED,FCLED,DUR,TYP,ANS,FWD,CPNY 1449158574,4045551212,4041212555,80000,70,Incoming,VM,FWD,AmeyaSoft 1449159132,9195551212,8191212555,8191212555,458,Outgoing,Ans,Orig,AmeyaSoft 1449159135,8045551212,5401212555,80000,304,Incoming,VM,FWD,AmeyaSoft 1449159146,2705551212,8151212555,8151212555,451,Incoming,Ans,FWD,AmeyaSoft

A Cypher query to setup the database (used only 50 call records)

cdranalytic

Incoming Calls

MATCH (c:Company)-[]->(n0:CallDate)-[]->(n:Connect)-[r:INCOMING]->(n1)-[]->(n2)
RETURN c, n0, n, r, n1, n2;
Incoming

Voicemail

MATCH (c:Company)-[]->(n0:CallDate)-[]->(n:Connect)-[r:VOICEMAIL]->(n1)-[]->(n2)
RETURN c, n0, n, r, n1, n2;
Voicemail

Call Distributions

Total calls by call type:
MATCH (n:Connect)
WITH count(n) as Total
MATCH (n:Connect)-[r:INCOMING]->(n1)
WITH Total, COUNT(n) as Incoming
MATCH (n:Connect)-[r:OUTGOING]->(n1)
WITH Total, Incoming, COUNT(n) as Outgoing
MATCH (n:Connect)-[r:VOICEMAIL]->(n1)
WITH Total, Incoming, Outgoing, COUNT(n) as Voicemail
RETURN Total, Incoming, Outgoing, Voicemail;
Totalcalls
Phone numbers and total calls by call type:
//Total calls by phone number
MATCH (n:Connect)-[]->(n1)-[]->(n2)
RETURN n2.orig as PhoneNbr, COUNT(n) as Count ORDER BY Count DESC;
//
MATCH (n:Connect)-[r:INCOMING]->(n1)-[]->(n2)
RETURN n2.orig as Incoming, COUNT(n) as Count ORDER BY Count  DESC;
MATCH (n:Connect)-[r:OUTGOING]->(n1)-[]->(n2)
RETURN n2.orig as Outgoing, COUNT(n) as Count ORDER BY Count  DESC;
MATCH (n:Connect)-[r:VOICEMAIL]->(n1)-[]->(n2)
RETURN n2.orig as Voicemail, COUNT(n) as Count ORDER BY Count  DESC;

Count PhoneInCallsOtcallsVM

10 out of 11 calls to 4041212555 went to voicemail.

More Details…​

A graph showing multiple calls both incoming and outgoing for a given time (with a bigger dataset).

MATCH (n:Connect)-[r:VOICEMAIL {call: 'Incoming'}]-(n1:VoiceCall)
WITH DISTINCT n.id as inc
MATCH (n1:Connect)-[r1:VOICEMAIL {call: 'Outgoing'}]-(n2:VoiceCall)
WHERE n1.id = inc
WITH DISTINCT n1.id as out
MATCH (x:Company)-[]->(n0:CallDate)-[]->(n2:Connect)-[r2:INCOMING|VOICEMAIL|OUTGOING]->(n11)-[]->(n21)
WHERE n2.id = out
RETURN x, n0, n2, r2, n11, n21;
clusters

Resources