
Cyber security and attack analysis

This interactive Neo4j graph tutorial shows how to run an attack analysis and improve IT security.


Introduction to Problem

A zero-day exploit takes advantage of a previously undiscovered security flaw in a piece of software. From the moment the flaw is discovered until the software's users have applied a patch, hackers can use it to compromise systems. The flaw can be used in a phishing attack, where a criminal masquerades as a trustworthy entity to obtain sensitive information. We are going to study an example of a phishing attack that uses a zero-day exploit.

In 2014 an Internet Explorer zero-day exploit (CVE-2014-1776) became public. Following the announcement of the IE security flaw, a group of hackers sent emails to victims, asking them to log in to a website where their identification information was captured.

We are going to use graphs to analyse the origin of the attack and block potential new emails.


Our data model for attack analysis

Like any other web domain, the domains used in the phishing attack are linked to several entities:

  • an IP address: a numerical label assigned to each device (e.g., computer, printer) participating in a computer network that uses the Internet Protocol for communication;

  • a name server: computer hardware or a software server that implements a network service for answering queries against a directory service (it turns a domain name into an IP address);

  • a registrar: an organization or commercial entity that manages the reservation of Internet domain names.

IP addresses are unique, but name servers and registrars are often shared by many domains and can therefore link one domain name to others.

A graph model is well suited to representing these entities and their connections:

A graph data model for attack analysis made by Cisco

Here the domains, IP addresses, and DNS information are nodes, linked by relationships. For example, we can see that domains A and B are connected through a shared name server and MX record despite being hosted on different servers. Domain C is linked to domain B through a shared host, but has no direct association with domain A.
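As a rough illustration, the three-domain example described above could be created in Cypher as follows. Only the Domain_name label and its name property come from the queries in this GraphGist. The other labels (Name_server, MX_record, IP_address), the relationship types, and all sample values are assumptions made up for this sketch and may not match the actual dataset.

```cypher
// Hypothetical sketch of the model: label and relationship names other than
// Domain_name are assumptions, and all values are placeholder data.
CREATE (a:Domain_name {name: 'domain-a.example'}),
       (b:Domain_name {name: 'domain-b.example'}),
       (c:Domain_name {name: 'domain-c.example'}),
       (ns:Name_server {name: 'ns1.example'}),
       (mx:MX_record {name: 'mx1.example'}),
       (ip1:IP_address {address: '192.0.2.1'}),
       (ip2:IP_address {address: '192.0.2.2'}),
       // A and B share a name server and an MX record
       (a)-[:USES_NAME_SERVER]->(ns),
       (b)-[:USES_NAME_SERVER]->(ns),
       (a)-[:USES_MX]->(mx),
       (b)-[:USES_MX]->(mx),
       // A and B are hosted on different servers; C shares a host with B
       (a)-[:HOSTED_ON]->(ip1),
       (b)-[:HOSTED_ON]->(ip2),
       (c)-[:HOSTED_ON]->(ip2)
```

In this layout, two domains that share an entity are always exactly two relationships apart (domain, shared entity, domain), which is the pattern the investigation relies on.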


Sample Data Set

For this GraphGist, we are going to use data from Cisco’s blog. It mixes proprietary information from Cisco’s data collection program and open-source information like DNS registries.


You can download the complete dataset here: https://www.dropbox.com/s/7vburpnl4yik8z1/Attack%20Analysis.zip

Which known domains are controlled by hackers?

We look for the domains that have a very negative reputation.

MATCH (baddomain:Domain_name)
WHERE baddomain.reputation = 'Very negative reputation'
RETURN baddomain.name as domain_name

What other domains are they connected to?

We want to see what other domains the rogue domains share connections with.

The idea here is that these other domains might also be controlled by hackers. If that is the case, we want to blacklist them and prevent them from being used in other attacks.

MATCH (baddomain:Domain_name)-[r*2]-(suspiciousdomains:Domain_name)
WHERE baddomain.reputation = 'Very negative reputation'
RETURN suspiciousdomains

These domains could be controlled by the same hackers who sent the first emails. We should monitor them and block any emails that include links to these domains.

Graph visualization allows us to quickly understand how the domains are connected and interpret the result of the investigation. In the picture below, captured with Linkurious, we see the known rogue domains in black and the newly identified domains in pink. The visualization shows exactly how they are connected.
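To turn this result into a watch list, one possible variant of the query leaves out the domains we already know to be rogue and ranks the remaining ones by how many rogue domains they are connected to. This is only a sketch: it assumes that the other domains either carry a different reputation value or none at all, and the linked_bad_domains alias is a name chosen for this example.

```cypher
// Sketch: suspicious domains two hops away from a known rogue domain,
// excluding domains already flagged as rogue.
MATCH (baddomain:Domain_name)-[*2]-(suspicious:Domain_name)
WHERE baddomain.reputation = 'Very negative reputation'
  // a missing reputation is null, so it has to be tested explicitly
  AND (suspicious.reputation IS NULL
       OR suspicious.reputation <> 'Very negative reputation')
RETURN suspicious.name AS domain_name,
       count(DISTINCT baddomain) AS linked_bad_domains
ORDER BY linked_bad_domains DESC
```

A domain that shares infrastructure with several rogue domains is arguably a stronger candidate for blocking than one with a single link, although a shared registrar or name server alone does not prove that a domain is malicious.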

Visualizing the attackers
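To explore the same connections without a dedicated visualization tool, the query could return whole paths instead of only the end nodes. The following is a minimal sketch that reuses the label and property from the queries above; a graph client such as the Neo4j Browser can draw the returned paths, including the intermediate entity that links each pair of domains.

```cypher
// Sketch: return the full two-hop paths so that the shared entity
// in the middle of each connection is visible.
MATCH path = (baddomain:Domain_name)-[*2]-(suspicious:Domain_name)
WHERE baddomain.reputation = 'Very negative reputation'
RETURN path
```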

Conclusion

After the initial emails were sent, we collected the domains used in the phishing attack. We then used open-source information and graph analysis to identify potential links to other domains. Based on this analysis, we have identified potential threats before they become active. Cyber security is a huge challenge. Emerging graph technologies offer new ways to work with security data and use it to prevent attacks or react faster.

For more graph-related use cases, make sure to check the Linkurious blog: https://linkurio.us/blog