Neo4j and the Offshore Leaks: the Case of Azerbaijan

The ICIJ Offshore Leaks Dataset

A consequence of the Firepower scandal of 2013, the Offshore Leaks dataset released by the International Consortium of Investigative Journalists (ICIJ) was a rarity in the compliance and due diligence world, akin to Cablegate or the release of the Pentagon Papers. In contrast to the relatively organized and easy-to-parse government leaks, the Offshore Leaks dataset is the result of the often manual extraction of data from a detailed collection of leaked e-mails and other documents. The dataset is a treasure trove of information about offshore financial centers and tax havens. ICIJ’s investigations brought to the surface many interesting patterns, including the potentially illegal activities of the President of Azerbaijan.

This graph gist was built to make sense of the complex data in the Offshore Leaks dataset.

ICIJ’s fundamental lesson from the Offshore Project data has been patience and perseverance. Many members started by feeding in lists of names of politicians, tycoons, suspected or convicted fraudsters and the like, hoping that bank accounts and scam plots would just pop out. It was a frustrating road to follow. The data was not like that.

But persistently following leads through incomplete data and documents yielded some great rewards: not just occasional and unexpected top names, but also many more nuanced and complex schemes for hiding wealth. Some of the schemes spotted, although well known in the offshore trade, have not been described publicly before. Patience was rewarded when this data opened new windows on the offshore world.

— Duncan Campbell
How ICIJ’s Project Team Analyzed the Offshore Files

The Goal

We want to explore how the President of Azerbaijan (for example) is connected to offshore accounts. Why does this matter? Azerbaijani law forbids state officials involved in overseeing business from being involved in business themselves, including being shareholders in companies. In order to understand his dealings, we need to focus on the network he uses to control his assets stored in offshore entities. This network includes family members, companies, addresses, and a complex set of intermediaries and partners.

Why the presidential family established these companies is unclear. What is clear is that the family took steps that obscured its involvement in the companies, using various agents to register the companies and direct them, at least on paper.

— Stefan Candea
Offshore companies provide link between corporate mogul and Azerbaijan’s president

The Graph


  • Person: Persons are the individuals building and using the asset network. Although some people are quite visible, others work behind the scenes.

  • Company: Companies include offshore entities, banking services providers, and businesses.

  • Address: Addresses are the locations registered to people and companies. As they have legal implications (a company registered in a financial haven pays lower or no taxes), addresses can provide interesting insights.


  • (Person)-[:USES_ADDRESS]→(Address) and (Company)-[:USES_ADDRESS]→(Address): People and companies are connected to addresses.

  • (Person)-[:FAMILY]→(Person): People can be connected via family ties. In this model, family relationships are very simple: either two people are family or they are not.

  • (Person)-[:IS_LINKED_TO {role:'', date:''}]→(Company): Links between People and Companies have the properties "role" (for example, Director or Shareholder) and "date", which record how the person is related to the company and the date of the connection.

  • (Company)-[:IS_LINKED_TO {role:'', date:''}]→(Company): The :IS_LINKED_TO relationship between two Companies similarly has the properties "role" (for example, Master Client or Records and Registers) and "date", which record how the first company is related to the second company and the date of the connection.

  • (Company)-[:IS_OFFSHORE_PROVIDER_OF]→(Company): Offshore providers are usually selling their know-how, contacts, and favorable tax situations to the people who want to take advantage of the offshore system.

Example Schema

Figure 1. A graph data model of the ICIJ Offshore Leaks dataset

John and Sam are married and have stored assets in a company they control, Treasure Ltd (both are shareholders and John is a Director). John and Sam used an address in Dubai and established Treasure in the Bahamas, making the assets controlled by Treasure private and tax-free. In addition, Treasure was set up with the help of two companies: Good Advice Inc and Hide and Seek. Oleg, a business partner of John with an address in Russia, is also a Director of Treasure Ltd.

This schema is one of the many ways we can model the ICIJ dataset. Although this exact example is not present in the original dataset, it has been included to highlight the importance of interpersonal relationships in the realm of shell corporations and tax havens.
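
The fictional example above could be loaded into an empty Neo4j database with a single CREATE statement. This is a minimal sketch rather than the gist's actual setup script: the property names first_name, name, and role follow the queries and schema in this document, while the Address property location is an assumption made for illustration. The "date" property is left out because the example gives no dates, and John and Oleg's business partnership is not modeled because the schema defines no relationship type for it.

```cypher
// Illustrative only: the fictional John/Sam/Oleg example, not real ICIJ data.
// The Address property `location` is a hypothetical name.
CREATE (john:Person {first_name: 'John'}),
       (sam:Person {first_name: 'Sam'}),
       (oleg:Person {first_name: 'Oleg'}),
       (treasure:Company {name: 'Treasure Ltd'}),
       (advice:Company {name: 'Good Advice Inc'}),
       (hide:Company {name: 'Hide and Seek'}),
       (dubai:Address {location: 'Dubai'}),
       (bahamas:Address {location: 'Bahamas'}),
       (russia:Address {location: 'Russia'}),
       // Family tie and registered addresses
       (john)-[:FAMILY]->(sam),
       (john)-[:USES_ADDRESS]->(dubai),
       (sam)-[:USES_ADDRESS]->(dubai),
       (oleg)-[:USES_ADDRESS]->(russia),
       (treasure)-[:USES_ADDRESS]->(bahamas),
       // One relationship per role a person holds in the company
       (john)-[:IS_LINKED_TO {role: 'Shareholder'}]->(treasure),
       (john)-[:IS_LINKED_TO {role: 'Director'}]->(treasure),
       (sam)-[:IS_LINKED_TO {role: 'Shareholder'}]->(treasure),
       (oleg)-[:IS_LINKED_TO {role: 'Director'}]->(treasure),
       // The two companies that helped set up Treasure Ltd
       (advice)-[:IS_OFFSHORE_PROVIDER_OF]->(treasure),
       (hide)-[:IS_OFFSHORE_PROVIDER_OF]->(treasure)
```

Storing each role as its own relationship keeps the model uniform: a person who is both Director and Shareholder simply has two :IS_LINKED_TO relationships to the same company.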

Sample Data Set

What Assets Belong to What Person

President Ilham Aliyev’s Direct Assets

We start by looking for direct links between the President and offshore companies.

MATCH (president:Person {first_name:'Ilham'})-[r]->(account:Company) // Find a Person with first name 'Ilham' that is one hop away from a Company
RETURN account.name as Company, account.form as Form, account.incorporation as Incorporation, account.status as Status, r.date as Date, r.role as Role

The first line of the query searches for all instances of a Person named Ilham one hop away from a Company.

The second line returns basic information about the company and the characteristics of the relationship between the Person and the Company. In this particular case, we see that Ilham served as Director and Shareholder of Rosamund International Ltd, a Standard International Company incorporated in 2002.
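
Because Azerbaijani law specifically concerns officials holding shares, it can be useful to narrow the direct links to a single role. The following is a sketch under stated assumptions: it assumes that the role is stored as the exact string 'Shareholder' and that companies carry a name property, neither of which is guaranteed by the text above.

```cypher
// Assumption: role values are stored exactly as 'Shareholder'.
MATCH (president:Person {first_name: 'Ilham'})-[r:IS_LINKED_TO]->(account:Company)
WHERE r.role = 'Shareholder'
RETURN account.name AS Company, r.date AS Date
```

If the roles in the data vary in spelling or case, a looser predicate such as toLower(r.role) CONTAINS 'shareholder' may be a safer starting point.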

President Ilham Aliyev’s Indirect Assets

People who are trying to hide money tend to use proxies they can hide behind. That means that we must enlarge our search and look for indirect connections.

With a Neo4j database for example, finding all the foreign assets Ilham Aliyev controls directly or indirectly is as simple as adding a * to our first query. The search will return all the paths in the data between Ilham Aliyev and offshore accounts.

MATCH (president:Person {first_name:'Ilham'})-[r*]->(account:Company)
RETURN DISTINCT account.name as Company, account.form as Form, account.incorporation as Incorporation, account.status as Status
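
An unbounded * is convenient on a small graph, but on larger data it can be expensive, and it does not show how far away each company is. A possible variation, offered as a sketch, bounds the path length and reports the shortest distance to each company. The upper bound of 4 hops is an arbitrary choice for illustration, and the name property on Company is assumed.

```cypher
// Assumption: 4 hops is enough to cover the network; adjust as needed.
MATCH path = (president:Person {first_name: 'Ilham'})-[*1..4]->(account:Company)
RETURN account.name AS Company,
       min(length(path)) AS Hops   // shortest path found to this company
ORDER BY Hops
```

Companies with Hops = 1 are the direct assets found earlier; higher values are reached only through family members or other companies.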

The Role of Middlemen

Some middlemen might be particularly well connected or important to President Ilham Aliyev. We can use Cypher to find every company in the president’s extended network. We can then find all companies associated with these companies and quantify how tightly connected these middleman companies are to the in-network companies.

MATCH (president:Person {first_name:'Ilham'})-[r*]->(account:Company)
WITH account
MATCH (account)-[t]-(middlemen:Company)
RETURN middlemen.name as name, count(DISTINCT t) as mentions, type(t) as type, t.role as role
ORDER BY mentions DESC
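
The query above counts relationships per middleman, relationship type, and role. A complementary view, sketched here as one possible variation rather than part of the original analysis, counts how many distinct in-network companies each middleman touches, regardless of role. It assumes the name property on Company and considers only the two company-to-company relationship types defined in the schema.

```cypher
MATCH (president:Person {first_name: 'Ilham'})-[*]->(account:Company)
WITH DISTINCT account               // each in-network company once
MATCH (account)-[:IS_LINKED_TO|IS_OFFSHORE_PROVIDER_OF]-(middleman:Company)
RETURN middleman.name AS name,
       count(DISTINCT account) AS companies_served
ORDER BY companies_served DESC
```

A middleman that appears with a high companies_served value is connected to many of the network's companies, which may make it a useful lead for further investigation.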