How Graphs Power OpenSanctions: The 5-Minute Interview with Friedrich Lindenberg

You’ve probably heard of the Panama Papers and the Pandora Papers, but you may not be familiar with OpenSanctions… yet.

OpenSanctions is the brainchild of Friedrich Lindenberg. It uses graph technology to analyze different datasets to identify significant individuals around the globe – like politicians, their associates, and other persons of interest – giving investigators, businesses, journalists, and technologists an essential open source tool to optimize their work.

The positive implications of OpenSanctions are far-reaching and visionary. We were delighted to have the opportunity to talk to Friedrich about this project and what he sees for the future of graph technology. Enjoy diving into our conversation below.

Please introduce yourself and tell us about your work.

Friedrich Lindenberg: My name is Friedrich Lindenberg. I’m a freelance software developer, and I’ve recently started a project called OpenSanctions, which brings together data on persons of interest from all around the world. I’m based in Berlin, Germany.

OpenSanctions basically consists of three different types of data sources. One part of it is sanctions data, as the name implies. So these are the sanctions lists that are being released by governments all around the world: the U.S., the EU, the Swiss government, the Ukrainian government – anywhere in the world really.

The second part of it is we’re trying to build a database of politically exposed people. That’s like a fancy term for anyone who is a politician or linked to a politician – basically anyone who can spend public money.

And the third part of it is what I would call “persons of interest data.” So there are a lot of people who are neither on a sanctions list, nor are they politicians, but who are still interesting for an analyst or researcher working on journalism or activism or financial investigations. For example, key oligarchs in different countries, power brokers, consiglieri, all these kinds of people. That’s the data we’re trying to bring together: who are those people in every country in the world?

Can you tell me about the origins of this project?

Friedrich Lindenberg: OpenSanctions is actually quite old. It started in 2015, and the idea, back then, was that we started seeing the first big leaks of information. So I’ve been working as a developer with investigative journalists for the past couple of years. And they would start seeing, for example, big email leaks being published or big leaks of offshore databases. So you’d have all the company owners in a particular country, and then you want to find out, what are actually the data points that I need to look at? What are the ones that are interesting from a journalistic point of view?

Very quickly, I started realizing I actually need to know which people are interesting. And then, I can kind of cross-reference that with those datasets, with an offshore company registry or with an email dataset, and see that these are the companies that I need to look at. These are the emails I actually need to read in that dataset.

From there, I’ve been working on building out databases of persons of interest, basically as a contrast agent for investigations. So if you think about visiting a doctor, maybe they’ll put you in a scanner and they’ll inject you with this kind of contrast agent that makes doctors able to see what’s going on in your body. And I like to think of persons of interest data like that. It’s kind of the material that you inject into a dataset to see where the interesting bits are.
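The cross-referencing Lindenberg describes can be illustrated with a minimal Python sketch. It is not OpenSanctions code: the function names, the `owner` field, and all sample records are hypothetical, and exact matching on normalized names is a deliberate simplification of real entity matching.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def flag_records(persons_of_interest, records, name_field="owner"):
    """Return the records whose name field matches a person of interest after normalization."""
    watchlist = {normalize_name(name) for name in persons_of_interest}
    return [rec for rec in records if normalize_name(rec[name_field]) in watchlist]


# Entirely made-up sample data (hypothetical names and companies).
persons = ["Ána Exemplar", "John Placeholder"]
registry = [
    {"company": "Alpha Holdings Ltd", "owner": "ana  exemplar"},
    {"company": "Beta Trading SA", "owner": "Someone Else"},
]

# The "contrast agent" step: only records linked to a person of interest remain.
flagged = flag_records(persons, registry)
for rec in flagged:
    print(rec["company"], "is linked to", rec["owner"])
```

Exact matching will miss transliterations and typos, which is one reason real pipelines tend to score candidate matches rather than accept or reject them outright.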

Why did you choose to use Neo4j in your project?

Friedrich Lindenberg: I think graphs are really interesting as a validation step in what we are doing. So we’re trying to build this database of persons of interest, but really, it won’t have any value if we don’t combine it with other data. So we want to combine it with databases of beneficial ownership, with databases of banking data, of procurement data, where government makes contracts with the private sector. All these kinds of datasets that are activity related or that help us to expand outwards, in terms of connectivity. Then, when you combine it with those datasets, you can start looking at actual kinds of journalistic or analytical questions.

For example, a lot of questions are sort of prototypical patterns that represent journalistic stories. Can I find any links between a politician in my country and an offshore company that they might own, where they might have stashed away some illicit wealth? Or can I find any links between a sanctions list from a particular country and a government contract where the government is giving out tax money to a company that’s a subsidiary of one that is sanctioned internationally? These kinds of patterns, then, become answerable once you kind of start combining the raw persons of interest data with data from other sources. Then you integrate it in order to find those connections and match these patterns.
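A pattern such as "is this politician linked to an offshore company?" amounts to a path search over the combined data. In a graph database this would normally be written as a path query; the sketch below shows the same idea as a plain breadth-first search in Python. The `find_path` helper, the `max_hops` limit, and every entity in the edge list are hypothetical illustrations.

```python
from collections import deque


def find_path(edges, start, goal, max_hops=4):
    """Breadth-first search for a shortest undirected path of at most max_hops edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths that already use max_hops edges
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection within the hop limit


# Made-up relationships combining a persons-of-interest list with a company registry.
edges = [
    ("Politician A", "Spouse B"),        # family link
    ("Spouse B", "Holding C"),           # ownership link
    ("Holding C", "Offshore Co D"),      # subsidiary link
    ("Politician E", "Local Firm F"),
]

print(find_path(edges, "Politician A", "Offshore Co D"))
print(find_path(edges, "Politician E", "Offshore Co D"))
```

The hop limit matters in practice: in a large enough graph almost everything is connected eventually, so only short paths are likely to represent a lead worth reading.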

What do you hope your project ultimately achieves?

Friedrich Lindenberg: One thing I’m really interested in is just supporting people who do really interesting and analytical work. So some of that is investigative journalists who, for example, are working in countries where democracy is being undermined by corruption, by grand theft. For them to be able to find out what the connectivities are and how business is really being done in their country, is, I think, super interesting.

But it also allows them to hold people to account and therefore, helps them fight for democracy in their own countries. And I think we’ve all seen that corruption can really be a threat to democratic institutions and eventually, to peace. So yeah, supporting these people who are really at the front lines of making sure that power is held accountable is really satisfying.

In a bigger sense, there’s also, I think, a kind of globalized system of illicit finance that is underpinning many of these individual cases. Budget money is being stolen in many countries, especially countries that are poorer and have weaker democratic institutions. Then it’s being stashed away and hidden in wealthier countries or in offshore jurisdictions. Being able to put that together and allow journalists and investigators from one country visibility into what’s going on on the other side of that system, to me, is a really fascinating way of empowering those people.

Who are the main users and contributors?

Friedrich Lindenberg: I think our main users basically fall into three categories. One is investigative journalists who are using this, often in terms of mining stories or getting leaks. Adjacent to them, there’s a broader industry or sector of open source intelligence analysts: people who try to kind of do case investigative work and need this data as a way to find connectivity in different scenarios.

The third group we’re seeing more and more of is people who want to screen their own customers and see if any of them are known actors in these systems. Are any of them politicians? Or even on a sanctions list? And that’s people from a broad variety of business backgrounds, whether it’s in financial services or real estate or lawyers. All of these people, I think, are keen to find out what their risk exposure is.

What is some of the most rewarding work you’ve done as a civic technologist?

Friedrich Lindenberg: What’s an interesting answer to that? I think working with investigative journalists has always been really, really fascinating, working on those cases where you have vast international schemes of dark money flowing around. Also, you get to understand the consequences of that on either side, whether it’s here in Berlin, where the real estate prices are going through the roof, because foreign capital is arriving and needs to be stored somewhere. Or whether it’s in countries where the money is being stolen, frankly, where there’s a lack of healthcare, education, infrastructure budget, because money is being taken away.

One particular set of projects I was really interested in working with was around the murder of journalists. There have been a couple of cases in Malta, in Slovakia, where journalists have been killed. Then there’s this rallying of journalists who basically pick up their stories and make sure that when such a thing happens – which would be an absolute no go, obviously – that the stories don’t die with the reporters. Rather there’s going to be an alliance of all of the world’s best journalists coming after you and finishing what could not be finished previously. And really making sure that killing a journalist is always a recipe for disaster and a recipe for accountability.

How does looking at the data from different angles help investigators?

Friedrich Lindenberg: What I found really interesting is that I’ve always looked at sanctions from a kind of network point of view, from a graph point of view. So trying to understand the connectivities between political actors, sanctioned individuals, sanctioned companies, etc. But in the last couple of months, there’s also been this massive emphasis on understanding how this evolves over time, understanding when different countries impose sanctions on different actors.

This has almost become a day-to-day business now, trying to understand, well, the U.S. is kind of taking the lead on sanctioning the board of Gazprom. Then it’s going to take a week for the Europeans to catch up and maybe two weeks for the Swiss to follow along.

That kind of time axis has been really interesting in understanding this. There’s also a mechanism that’s happening more and more where bodies within civil society, but also, for example, the Anti-Corruption Agency of Ukraine, are essentially making a draft list of entities that they think should be sanctioned, and they submit them to the governments in the U.S. and in Europe and the U.K. It’s interesting to see that cycle between what civil society and the Ukrainian government are suggesting and how it gets implemented by governments in the West.

What are some of the most alarming things that you’ve uncovered with graphs?

Friedrich Lindenberg: I think what’s really alarming is how much stuff you can find just by taking this data, throwing it together with virtually any other registry in the world, and running a little bit of data integration. I always feel this should be harder, but you can do that. You throw it together, you run a few path queries, and you find more cases of interest than relevant journalists in the country are probably going to be able to dig through in the foreseeable future. So it’s actually somewhere between technically super satisfying, and as a citizen, super concerning how many of these cases there are.

Sometimes they’re sitting there stuck with a really good lead and it’s like, “All right, do you want it?” “Oh, no, I’m busy. Can you give it to someone else?” “Do you want it?” “Yeah, that sounds really interesting. I just don’t have time right now. I’m stuck on this project.” I was like, “All right. So there’s a good story, but nobody to do it.” It’s really weird.

How are current geopolitical events affecting your work?

Friedrich Lindenberg: I got into sanctions, because I thought it was a cute little dataset to do well. I didn’t expect this to become the center of global policy making. I didn’t expect it to become a thing where there would be 300 people sanctioned on a Tuesday each week. So it’s just been an amazing boom of what’s happening there, both in terms of what entities are being sanctioned and also in terms of more and more people actually trying to follow through on it.

Whether it’s international task forces trying to really develop good maps of Russian assets in different countries, whether it’s law enforcement or civil society groups that have really jumped onto this and are now trying to make good on what has been the law for a long time now – which is that assets linked to these individuals need to be seized. It’s interesting.

What are some recent challenges that have impacted your work?

Friedrich Lindenberg: By far the most challenging thing we are actually trying to do is to build a database of all politicians in the world. That’s kind of fundamentally an insane project. The way we’re trying to do it is to work with the Wikidata community, who built this massive graph of structured information in parallel with Wikipedia and the other Wikipedias.

One of the big things we’ve been trying to do is basically build a warning network where we have little scripts that monitor hundreds and hundreds of government websites every night and see if one of these websites changes and announces, for example, a new cabinet member in the country, or a new parliament having been elected in another country. And then we try to sync that up and make sure the data in Wikidata is up to date.
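The nightly monitoring described here can be approximated by fingerprinting each page and comparing it with the previous night's fingerprint. The following Python sketch assumes the page text has already been fetched; the function names, the `example.org` URL, and the sample page contents are hypothetical and do not come from the OpenSanctions scripts.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hash the page text after collapsing whitespace, so layout-only changes are ignored."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous, snapshots):
    """Compare tonight's page contents with the stored fingerprints.

    previous:  dict mapping URL -> fingerprint from the last run
    snapshots: dict mapping URL -> page text fetched tonight
    Returns (sorted list of changed or newly seen URLs, updated fingerprint dict).
    """
    changed = []
    updated = dict(previous)
    for url, content in snapshots.items():
        digest = fingerprint(content)
        if previous.get(url) != digest:
            changed.append(url)  # a URL seen for the first time also counts as changed
        updated[url] = digest
    return sorted(changed), updated


# Hypothetical page text for three consecutive nights.
url = "https://example.org/cabinet"
_, state = detect_changes({}, {url: "Minister of Finance: A. Example"})
same, state = detect_changes(state, {url: "Minister of Finance:   A. Example\n"})
diff, state = detect_changes(state, {url: "Minister of Finance: B. Sample"})

print(same)  # whitespace-only difference, nothing flagged
print(diff)  # the listed minister changed, so the URL is flagged for review
```

A flagged URL only says that something changed; a person, or a site-specific parser, still has to decide whether the change is a new cabinet member or a redesigned footer.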

That’s really interesting also, in terms of seeing it as a collaborative project, where it’s not just a single set of scrapers running, but rather a kind of community-based process that is much more about inviting people to participate, to take stewardship of the data for their country, to track, for example, who’s in charge of this cabinet in my country, or who is in the parliament, who’s the military leadership, or even in charge of the central bank? All of these different data points need to be collected. We’re trying to see if we can build a community worldwide that is interested in doing this, in terms of political accountability, but also just in terms of providing high quality information to people who want to do further analysis.

What do you see for the future of graph technology?

Friedrich Lindenberg: I think there are a lot of interesting things that graphs need to explore. One of them is being really good at doing probabilistic stuff. Whenever we do these links between different datasets, they’re always combined with how likely this person is to be this member of parliament from our dataset.
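One simple way to picture a link that carries a likelihood is to score each candidate match instead of accepting or rejecting it. The Python sketch below uses the standard library's `difflib.SequenceMatcher` for name similarity. The weights (0.7, 0.2, 0.1), the field names, and the sample records are arbitrary assumptions for illustration, and the result is a heuristic score rather than a calibrated probability or the method OpenSanctions uses.

```python
from difflib import SequenceMatcher


def match_score(candidate, reference):
    """Heuristic score in [0, 1]: name similarity plus bonuses for agreeing attributes."""
    name_similarity = SequenceMatcher(
        None, candidate["name"].lower(), reference["name"].lower()
    ).ratio()
    score = 0.7 * name_similarity  # assumed weight for the name
    if candidate.get("birth_year") and candidate.get("birth_year") == reference.get("birth_year"):
        score += 0.2  # assumed bonus for a matching birth year
    if candidate.get("country") and candidate.get("country") == reference.get("country"):
        score += 0.1  # assumed bonus for a matching country
    return round(score, 3)


def rank_candidates(candidates, reference):
    """Return (score, candidate) pairs, best match first."""
    scored = [(match_score(c, reference), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up records: one member of parliament and three possible matches in another dataset.
reference = {"name": "Maria Example", "birth_year": 1970, "country": "xx"}
candidates = [
    {"name": "Unrelated Person", "birth_year": 1985, "country": "yy"},
    {"name": "Mario Exampel", "birth_year": None, "country": "xx"},
    {"name": "Maria Example", "birth_year": 1970, "country": "xx"},
]

for score, candidate in rank_candidates(candidates, reference):
    print(score, candidate["name"])
```

Storing the score on the link, rather than discarding it after a threshold, lets a later query ask for only the connections above a chosen confidence.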

I think that’s a really interesting thing to play with. Another is allowing non-technical users, like journalists and analysts, to compose more complicated queries: to provide narrative patterns and then have them run as queries on the data. Basically, “Hey, I want to see all the politicians who are linked to an offshore company in Luxembourg. Can I please have that?” – being able to make that translation easier.

But I think also a lot of it is with regards to data volumes growing. Open data, and even the amount of data that’s just publicly available on the internet, is absolutely exploding. And I think graphs are going to be instrumental in putting it all together. But it’s also a massive challenge, in terms of the scale of the data that needs to be incorporated.

Take a closer look into the powerhouse behind the analysis of real-world networks: graph algorithms. Read this white paper – Optimized Graph Algorithms in Neo4j – and learn how to harness graph algorithms to tackle your toughest connected data challenge.
