BloodHound: How graphs changed the way hackers attack

Adversary Resilience Lead at SpecterOps

April 17, 2019

Editor’s Note: This presentation was given by Andy Robbins at GraphConnect New York in October 2017.

Presentation summary

SpecterOps provides adversary simulation, adversary detection and adversary resilience to companies looking to assess their current cybersecurity measures. Their experts have worked in defending government agencies as well as worldwide enterprises in the financial services, healthcare, technology, media and communications industries.

In this presentation, Adversary Resilience Lead Andy Robbins dives into how graphs have changed the way hackers attack. He acknowledges some relevant prior work, such as Active Directory ACL Scanner by Robin Granberg and an important French project from ANSSI, and details how hackers attack corporations in four simple phases:

Recon Phase
Initial Access
Post–Exploitation Phase
Exfiltration

After a few final thoughts on the post-exploitation phase, Andy explores identity snowball attacks, the creation of BloodHound and SharpHound, as well as attack path automation. He also discusses the production of two main projects: ANGRYPUPPY by Calvin Hedler and Vincent Yiu and GoFetch by Tal Maor and Itai Grady. These two projects, and attack path automation in general, have the potential to completely change the face of cybersecurity.

Full presentation: How graphs have changed the way hackers attack

Today, Andy Robbins is going to be talking about how attackers get into large corporations, access the information they are looking for and get out remarkably quickly.

My name is Andy Robbins, and I am the adversary resilience lead at SpecterOps. In short, I help organizations figure out how attackers can move through a network. Then I help companies figure out how to shut down those attack paths as effectively and cheaply as possible. All of this can be mapped out with an attack graph.

Full disclosure, you should probably know that I do not have any kind of formal mathematics or computer science education. However, I’ve been a professional penetration tester and red teamer for the past five years. I got paid to break into organizations, steal their data, then write up a report and leave.

Previously, I didn’t have to help them fix their issues – now I do – so it’s not quite as fun as it used to be.

I’ve spoken at the Black Hat Conference, DEF CON, ISSA International, (ISC)², ekoparty down in Buenos Aires and the Paranoia Conference in Oslo. I give training at Black Hat USA, as well as Black Hat Europe. This was the first year we were invited to teach in Singapore at Black Hat Asia.

We teach red team tactics: how adversaries actually break into networks and how to carry out those tactics yourself. We also offer private training.

Notable prior work

A prior work that I would like to acknowledge is a white paper from Microsoft called Heat-ray: Combating Identity Snowball Attacks Using Machine Learning, Combinatorial Optimization and Attack Graphs. This paper was released in 2008 by Alice Zheng, John Dunagan and Daniel Simon.

Unfortunately, the project never saw the light of day. Microsoft just shut it down. They couldn’t figure out how to make money off of it. It is more than likely in a crate in a basement somewhere, never to be seen or heard of again.

Another work I’d like to acknowledge is Active Directory Control Paths, a French project from ANSSI, which is part of the French government. This is an important work from which we learned a little more about graph theory itself, how vertices and edges relate to one another, and how to design a graph effectively.

I’d also like to acknowledge Active Directory ACL Scanner by Robin Granberg at Microsoft, which helped us understand in more depth how ACLs work in Active Directory.

No talk about attack graphs, in my opinion, is complete without this quote. This is by John Lambert who’s a general manager of Threat Intelligence at Microsoft.

John said, “Defenders think in lists. Attackers think in graphs. As long as this is true, attackers will win.” This is the title of a blog post that John put out two years ago.

The idea here is that, as defenders, you have a list of assets, a list of users. You do vulnerability scans, you get a list of vulnerabilities. But as we know, disconnected data is not nearly as valuable as connected data.

So what do those vulnerabilities actually mean? What systems are they on? Who uses those systems? What data is on those systems?

A list-based mentality doesn’t answer those questions whereas obviously a graph does.

How hackers attack

Let’s dive into some specifics: How do hackers actually attack corporations?

There’s a new story almost every week from Experian, Target, you name it. There are Fortune 50 companies that get breached all the time. The common phrase that you hear is, “This was a sophisticated attack, and there was nothing that we could do to stop it.”

In our world, what that actually translates to is, “We were too lazy to do anything that our auditor told us to do, but we hope that you’re going to believe us when we say this was like some nation-state level attacker or something like that.”

I’m going to show you how these attacks actually work, so you see how little sophistication is actually required.

Step 1: Recon phase

Any corporate attack will follow a four-step methodology. You’re going to have a recon phase, an initial access phase, a post-exploitation phase and an exfiltration phase.

Recon means I know that I want to hack a certain organization but how do I go about it? Let’s say we want to hack Neo4j, and I want to get the source code for the next version of Neo4j before it comes out, or I want to be able to cut myself enterprise licenses for free.

To do that, I need to have access to a certain system at Neo4j. I don’t know what that system is nor do I know where it is. It’s probably in their LAN somewhere, and there are probably people who have access to this.

So I need to figure out who those people are. Who works there? What do they do? How long have they worked there? What’s their job history? What can I find out about the individuals who work there to put together some kind of convincing phishing attack later on?

LinkedIn is a social engineer’s best friend or a hacker’s best friend. On LinkedIn, as you can see below, I’m connected to Michael Hunger as a first-degree connection.

An example of hacking via LinkedIn connections.

I can see the full name, full profile, job history, education and endorsements of any second-degree connection. Even though I only have one connection at Neo4j, Michael, he’s connected with almost everybody at the company, and I can therefore see all of their names.

I can put that together to form an email list. I can find out what their email address schema is from their website, their ICANN registration or a contact us page. I can put together a full listing of email addresses to be used for a phishing attack.

Let’s say I want to get really specific. What if I want to phish one individual person? To Max, I’m a second-degree connection, but I can still see everything in his profile. I’m going to look up what information I can Google about Max. I can see his education. I can see he went to Florida Institute of Technology and that he got a B.S. in Computer Science.

What’s interesting is that he was in a fraternity called Tau Kappa Epsilon. So I’ll do a little research about that. I can see that at Tau Kappa Epsilon, Christopher Niles is the Alumni Engagement Director. The whole point of this is that I want to find out as much information as I can about the organization.

I’ve been focusing on people, but I can also look at technologies. What are their public IP ranges? What systems do they use? Do they use Macs? Do they use Windows? Do they use Active Directory? I can find out all of this information in my recon phase.

Step 2: Initial access phase

We have decided not to target Max, but instead we’re going to put together some kind of scheme involving this guy and this fraternity. What we want is initial access. We assume the data we want or the system we want exists within the Neo4j LAN.

Here’s what we know so far: We know that in the image below our target is on the right, the attacker is on the left and Neo4j owns this target asset.

Because of the miracle of firewalls, we don’t have immediate logical access to that target system. So we need to get some kind of initial access into the LAN.

User workstations also exist within a LAN. In the vast majority of organizations, the internal network topology is completely flat, meaning that once you’re on the network, you can touch other systems regardless of where they are or what risk level they present to the organization.

I may not have the ability to authenticate to that system, but I can ping it. I can hit it on a certain port; I can SSH to it and try to guess a password or something like that. I have logical access to the system if I can get access to that workstation.

I’m going to put together an email that looks something like this:

A sample email a hacker would send to a LinkedIn contact.

I purposefully make the due date today so that Max doesn’t have time to vet this guy or call him. He’s got to complete this document right now if he wants to be a Distinguished Alumni.

As you can probably guess, the document attached to this email is going to be a little malicious. And there are a lot of options as an attacker for what we can do. We could have it literally be a binary – an executable or something like that – or a Python script.

If we want to be a little sneakier – and a little more effective and bypass spam controls and antimalware controls – we could roll it into some kind of other document that can then execute that script for us. The bottom line is that we want Max to run a script for us, and we want him to be encouraged to do it and believe that what he is doing is good for him.

That script is going to run on his workstation, and it’s going to make an HTTPS connection back to me as the attacker. This is going to look very similar to a web request. There are tons of different transports I could use as my command and control channel.

Now I have access into the LAN, assuming that Max double-clicks on the attachment and that the anti-malware doesn’t catch it, etc. The easiest phase of an attack is getting access to the network.

Step 3: Post-exploitation phase

Now that we have access, we’re in the post-exploitation phase.

Post-exploitation is what separates good pentesters from great red teamers. Anybody can get access to a network; it’s very, very simple. However, having the ability to go from low privilege to very, very high privilege, getting access to your target – that’s a little more difficult and more skill-based.

When we land in a network we don’t know anything about, what we need is situational awareness. We need to know what computers exist, what users exist, what file shares exist, where users are logged on, etc.

The beauty and magic of Active Directory from Microsoft is that any authenticated user in Active Directory can figure out almost all of that information. PowerView is a solution for this.

PowerView is a post-exploitation situational awareness framework by Will Schroeder. It’s written in PowerShell. It’s PowerShell v2 compliant which means that if we land on any system that has PowerShell running, we’re going to be able to use this.

Next we are going to use a tool called Mimikatz. If you are a local admin on a Windows workstation and your process integrity is what’s called high integrity, and if somebody else is logged on to the same system, you can get their password out of memory using Mimikatz.

Using Mimikatz, you can extract a user’s password out of memory in plain text no matter how long it is, no matter how complex it is, no matter what the parameters of the password are. Mimikatz is written by Benjamin Delpy, with other contributions by Vincent Le Toux.

You can see an example of plain text password pulled from memory below:

Grabbing a password from memory in plain text.

If I can get local admin rights on a system where an interesting user is logged on, I can get their password, become that user and continue along my attack path. The software that actually runs on the computer, accepting taskings and sending information back to my control server, is called an agent or an implant. Three good ones are Beacon, Empire and Meterpreter.

Here is a little about Mimikatz in action.

Step 4: Exfiltration phase

After I go through the escalation process, I can then authenticate to my target.

I get access to it, and then I’m ready for exfiltration. The easiest way to do that is to ride my existing transport back out of the network. So get access to the Neo4j 4.0 source code or the Enterprise Edition license and then shoot that back to my system over my existing transport.

A few more thoughts on post-exploitation

Post-exploitation can be very easy, or it can be incredibly difficult depending on the maturity of your organization’s security posture.

For example, a lot of organizations just give everybody local admin rights, and they have the same password for the local admin user on every workstation. In that instance, if an attacker gets in and they become a local admin, that’s it. It’s over. They’re going to have access to everything.

However, a lot of organizations are not set up that way. Users don’t have local admin rights. The local admin password is different on every system. In that situation, what we have found is that throwing exploits is very risky for two reasons.

One, we might knock systems over, which can be bad. As a consultant, that’s bad client management. It can also trip an antivirus program. This can trip other alerts, and it could bring attention to ourselves, which we don’t want to do.

Secondly, situational awareness is very time-consuming and very tedious. Figuring out where I’m an admin is very easy because I can just try every system. Figuring that out for other users, one user at a time, is nearly impossible.

What we found was that there was a very common and very reliable pattern that emerged. As we did these penetration tests and attacks over and over, we found patterns. You can see these in breach reports for Target, Fannie Mae and other organizations where they’ve been made public. This is called a derivative local admin or an identity snowball attack.

Identity snowball attack

Here is a brief walkthrough of what an identity snowball attack might look like:
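As a rough sketch of the idea in code (not the walkthrough from the talk itself), the snowball can be modeled as repeating two steps until nothing new turns up: use admin rights to reach a computer, then take over the identities of the users logged on there. All user names, computer names and data below are hypothetical.

```python
# Hypothetical toy data: which users are local admins on which computers,
# and which users currently have a logon session on each computer.
ADMIN_TO = {
    "alice": {"WS01"},
    "bob": {"SRV01"},
    "carol": {"DC01"},
}
SESSIONS = {
    "WS01": {"alice", "bob"},
    "SRV01": {"carol"},
    "DC01": set(),
}


def snowball(start_user):
    """Return (users, computers) reachable by alternating two steps:
    admin rights on a computer, then credential theft from its sessions."""
    users = {start_user}
    computers = set()
    frontier = [start_user]
    while frontier:
        user = frontier.pop()
        for computer in ADMIN_TO.get(user, set()):
            if computer in computers:
                continue  # already compromised, nothing new to gain
            computers.add(computer)
            for other in SESSIONS.get(computer, set()):
                if other not in users:
                    users.add(other)
                    frontier.append(other)
    return users, computers


users, computers = snowball("alice")
print(sorted(users))      # ['alice', 'bob', 'carol']
print(sorted(computers))  # ['DC01', 'SRV01', 'WS01']
```

Starting from a single low-privilege user, each stolen identity unlocks more computers, which is why the set of compromised accounts grows like a snowball.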

Introducing BloodHound

A friend introduced me to graph theory. I learned about nodes and relationships, Dijkstra’s algorithm and the A-star algorithm. I came up with a proof of concept to automate this workflow, and it worked.

So instead of going through this process manually, we could use an attack graph, and we could graph all this stuff out. The result was this project called BloodHound.
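To make the path-finding idea concrete, here is a minimal sketch that is not BloodHound’s actual implementation. It uses a plain breadth-first search, which finds the same shortest path Dijkstra’s algorithm would when every edge has equal weight. The edge list and all names in it are hypothetical.

```python
from collections import deque

# Hypothetical directed edges: (source, relationship, target).
EDGES = [
    ("alice", "MemberOf", "HELPDESK"),
    ("HELPDESK", "AdminTo", "WS02"),
    ("WS02", "HasSession", "dave"),
    ("dave", "MemberOf", "DOMAIN ADMINS"),
    ("alice", "AdminTo", "WS01"),
    ("WS01", "HasSession", "erin"),
]


def shortest_path(edges, source, target):
    """Breadth-first search; returns the nodes on one shortest path, or None."""
    neighbors = {}
    for src, _rel, dst in edges:
        neighbors.setdefault(src, []).append(dst)

    previous = {source: None}  # node -> node we reached it from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


print(shortest_path(EDGES, "alice", "DOMAIN ADMINS"))
# ['alice', 'HELPDESK', 'WS02', 'dave', 'DOMAIN ADMINS']
```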

The first step with any graph, in my opinion, is the design of the graph. It needs to represent the system effectively, it needs to be efficient and it needs to scale. It needs to be somewhat future proof.

The first consideration we had when designing the graph was security group delegation. This means that if I add myself to a group and that group is a local admin on a computer, I’m a local admin on that computer, and that extends indefinitely. You can have groups added to groups, added to groups, added to groups. And whatever privilege the group at the tail end has rolls back all the way.

Second is user credential theft. If we can get to a computer, we can steal user credentials, so we also had to take user credential availability into consideration, i.e., where users are logged on. I wanted to keep it simple, and I wanted to keep it flexible enough for future additions to the schema. The result was very simple as you can see below:

The user on the very far left can be a member of a group or it can be an admin to a computer. A group can be a member of a group, or it can be an admin to a computer. A computer can have a session for a user.

A path emerges from the user on the bottom left to the user on the bottom right along the AdminTo relationship and then the HasSession relationship. The simplicity of this data model allowed us to write very simple queries.
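The schema described above can be written down as a small table of allowed relationships. This is only an illustrative sketch: the three node labels and three relationship types come from the description, while the helper function and the example names are hypothetical.

```python
# Allowed (source label, relationship, target label) triples from the description above.
SCHEMA = {
    ("User", "MemberOf", "Group"),
    ("User", "AdminTo", "Computer"),
    ("Group", "MemberOf", "Group"),
    ("Group", "AdminTo", "Computer"),
    ("Computer", "HasSession", "User"),
}

# Hypothetical example nodes and their labels.
LABELS = {"alice": "User", "bob": "User", "HELPDESK": "Group", "WS01": "Computer"}


def is_valid_edge(labels, source, relationship, target):
    """True if the edge fits the schema for the given node labels."""
    return (labels[source], relationship, labels[target]) in SCHEMA


# The user-to-user path from the text: AdminTo, then HasSession.
PATH = [("alice", "AdminTo", "WS01"), ("WS01", "HasSession", "bob")]
print(all(is_valid_edge(LABELS, *edge) for edge in PATH))  # True
```

Keeping the set of relationship types this small is what lets a single variable-length traversal express most of the interesting questions.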

So here are two Cypher query examples, and I’ll show what this actually looks like in the interface as well.

The first query returns effective group memberships for a user: I do an unbounded search from an origin user to an arbitrary group-labeled node. Then, for effective local admin rights for a user, I traverse MemberOf and AdminTo relationships, following the security group delegations indefinitely until I get to an AdminTo relationship.

More often than not, the last relationship that this follows is AdminTo. So very simple. Very, very easy. Effective admins for a computer as well, very easy.
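The two traversals described above can be mimicked in a few lines over an in-memory graph. This is a hedged sketch rather than the queries shown in the talk: the data is hypothetical, and the Cypher patterns in the comments are my own approximation of what such queries might look like.

```python
# Hypothetical data: direct memberships and direct local admin rights.
MEMBER_OF = {
    "alice": {"HELPDESK"},
    "HELPDESK": {"WORKSTATION ADMINS"},
    "WORKSTATION ADMINS": {"IT"},
}
ADMIN_TO = {
    "WORKSTATION ADMINS": {"WS01", "WS02"},
    "alice": {"LAPTOP01"},
}


def effective_groups(principal):
    """Every group reachable through one or more MemberOf hops.
    Approximate Cypher (an assumption, not the original slide):
      MATCH (u:User {name: $name})-[:MemberOf*1..]->(g:Group) RETURN g
    """
    found = set()
    stack = [principal]
    while stack:
        for group in MEMBER_OF.get(stack.pop(), set()):
            if group not in found:  # also guards against membership cycles
                found.add(group)
                stack.append(group)
    return found


def effective_admin_rights(user):
    """Computers the user administers directly or through nested groups.
    Approximate Cypher (an assumption, not the original slide):
      MATCH (u:User {name: $name})-[:MemberOf*0..]->()-[:AdminTo]->(c:Computer) RETURN c
    """
    computers = set(ADMIN_TO.get(user, set()))
    for group in effective_groups(user):
        computers |= ADMIN_TO.get(group, set())
    return computers


print(sorted(effective_groups("alice")))        # ['HELPDESK', 'IT', 'WORKSTATION ADMINS']
print(sorted(effective_admin_rights("alice")))  # ['LAPTOP01', 'WS01', 'WS02']
```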

I am always interested to know where users that belong to a certain group are logged on. If there is a group called SQL Admins, and they operate on a SQL server, then we want access to that server. I want to know exactly where the users for that group are logged on. In this query, we designate the group we care about at the very end.

We see who the effective members of that group are and the computers that have sessions for those users.
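The same question can be sketched in code: resolve the effective members of a group, then list the computers holding sessions for them. The group names, users and sessions below are hypothetical, and this is an illustration of the logic rather than the query used in BloodHound.

```python
# Hypothetical data: direct memberships, known users, and sessions per computer.
MEMBER_OF = {
    "carol": {"DBA TEAM"},
    "dave": {"SQL ADMINS"},
    "DBA TEAM": {"SQL ADMINS"},
    "erin": {"HELPDESK"},
}
USERS = {"carol", "dave", "erin"}
HAS_SESSION = {
    "SQL01": {"carol"},
    "WS07": {"dave", "erin"},
    "WS09": {"erin"},
}


def effective_members(group):
    """Users that reach `group` through one or more MemberOf hops."""
    members = set()
    for user in USERS:
        seen, stack = set(), [user]
        while stack:
            for parent in MEMBER_OF.get(stack.pop(), set()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        if group in seen:
            members.add(user)
    return members


def sessions_for_group(group):
    """Map each computer to the group members currently logged on there."""
    members = effective_members(group)
    return {
        computer: sorted(users & members)
        for computer, users in HAS_SESSION.items()
        if users & members
    }


print(sessions_for_group("SQL ADMINS"))
# {'SQL01': ['carol'], 'WS07': ['dave']}
```

Note that carol only belongs to SQL ADMINS through a nested group, which is exactly the kind of membership a flat list would miss.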

Data collection: SharpHound

To get this data we have a tool called SharpHound, written by Rohan Vazarkar.

It’s based on the original PowerShell collector by Will Schroeder (mentioned above). It’s a C# compiled binary which you can also run from PowerShell using .NET reflection.

Using SharpHound, as an attacker, we can stay completely memory resident. This is interesting for us because most antivirus won’t analyze the contents of memory. They’ll only analyze what gets written to disk. If we stay off disk, we can evade most antivirus programs.

SharpHound has the ability to collect info from domain controllers via LDAP or SMB and from domain-joined workstations via SMB as well. The data that we’re collecting is local admin group memberships across the enterprise, where the users are logged on across the enterprise and security group memberships.

Like I’ve said a couple of times, any user in the domain can collect this information. You don’t have to have any kind of existing privilege to collect this.

The BloodHound UI is built on top of Linkurious, and it’s compiled into an Electron app, so it gives us cross-platform ability very simply.

We interface with Neo4j using the JavaScript driver from Neo4j. And then we do chunked CSV ingestion through the UI. So we have a bunch of user-friendly options because we don’t want people to have to live in the Neo4j console.

Because they’re pentesters, they don’t care about the graph at all. They just care about the path that they get. So we ingest the CSV for them through the UI in chunks. And then we also have a modal add-in, so that people can run their own raw Cypher queries and have the interface draw the resulting graph or nodes.
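Loading CSV data in chunks, rather than all at once, can be sketched as follows. This is a generic illustration and not BloodHound’s own import code, which runs in the JavaScript UI. The column names and the chunk size are assumptions made for the example.

```python
import csv
import io
from itertools import islice


def chunked_rows(csv_file, chunk_size):
    """Yield lists of at most `chunk_size` rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(csv_file)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical session data; real column names may differ.
SAMPLE = "UserName,ComputerName\nalice,WS01\nbob,WS01\ncarol,SRV01\n"

for batch in chunked_rows(io.StringIO(SAMPLE), 2):
    # In a real importer, each batch would be sent to the database in one request.
    print(len(batch), batch[0]["UserName"])
```

Batching keeps each database request small, so a large collection can be imported without the interface stalling on one enormous transaction.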

Here is an example of what the interface looks like in action:

Attack path automation

Attack path identification revolutionized things to the point that people actually started to automate attack paths, which for corporate defenders is a terrifying concept.

There are two main projects that stand out in terms of attack path automation. There’s one called GoFetch by Tal Maor and Itai Grady from the ATA team at Microsoft. And there’s another project called ANGRYPUPPY from Calvin Hedler and Vincent Yiu, formerly from MDSec. Tal Be’ery and Tal Maor put on a notable talk at Black Hat Europe this year called The Industrial Revolution of Lateral Movement.

They touch on the fact that attack path identification and automation have the potential to completely change the face of cybersecurity. Before GoFetch or ANGRYPUPPY were released, attackers would get into a network, and they would go through that painstaking process.

They would have a lot of opportunities to make themselves known to defenders, and defenders would rely on attackers moving very slowly. Now, a process that used to take us two-and-a-half weeks takes us an hour, thanks to the attack graph.

Now, for defenders, there’s almost no hope because attackers can identify the attack path, execute it and have persistence laid into the network in such a way that the defenders can never root us out. We can do it in a morning, whereas it used to take weeks or months to accomplish.

For defenders, the speed at which graphs enable attackers to hack now is terrifying.