By Caroline Scharf & Uli Foessmeier, Tom Sawyer Software | October 12, 2016
Tom Sawyer Software is a Silver sponsor of GraphConnect San Francisco. Meet their team on October 13-14th at the Hyatt Regency SF.
The Offshore Leaks Database Challenge
The Panama Papers investigation and resulting Offshore Leaks database present an interesting challenge for investigators.
If you’re not familiar with this investigation, it was led by the ICIJ – The International Consortium of Investigative Journalists – to expose the people behind companies and trusts incorporated in tax havens. While some offshore entities and trusts are legitimate, their anonymous nature more easily facilitates money laundering, tax evasion, fraud and other crimes. For more information about the Offshore Leaks database, visit offshoreleaks.icij.org.
The Offshore Leaks database contains more than 320,000 entities, and duplicate entries are common. Navigating this volume of information, visualizing it in a form that can be digested and understood, and knowing which clues to look for are all challenges for anyone using the database.
Tom Sawyer Software specializes in helping businesses rapidly build sophisticated enterprise graph and data visualization applications to help make sense of and analyze their Big Data, such as the volume of information in the Offshore Leaks database.
In this first of two articles, we walk you through our Panama Papers example application, built with our flagship product Tom Sawyer Perspectives. We discuss two scenarios that can help you make sense of the Offshore Leaks data, so you can focus your investigation on suspicious people and companies, spot areas of potential fraud and make connections.
Using Tom Sawyer Perspectives to Focus Your Investigation
When you begin an investigation, you may already know the person or network of people you want to investigate, such as a well-known political figure or celebrity. You may instead know several individuals whom you suspect are connected, or only the name or address of a company.
Using the 2015 FIFA corruption scandal as a backdrop, we want to find out whether there are any connections between those charged in that case and any other prominent FIFA-connected individuals.
In searching the database for all the individuals indicted in the case, three names show results: Eugenio Figueredo, Hugo Jinkis and Mariano Jinkis. In the case of Figueredo, a number of results come back with the name Figueredo.
Data integrity issues are common in this database, so our example application includes a feature that automatically merges nodes with identical names, as well as the ability to merge nodes manually. We merge the two identical nodes and delete the nodes with different first names that we know are not relevant. We are not sure which of the remaining three “Eugenio Figueredo” nodes are valid, so we decide not to merge them until we are more certain.
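The automatic merge can be pictured with a short sketch. This is not how Tom Sawyer Perspectives implements the feature, which the article does not describe; it only assumes that "identical" means equal after lowercasing and collapsing whitespace. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(name.lower().split())


def merge_identical_names(nodes, edges):
    """Merge nodes whose normalized names match.

    nodes: dict mapping node id -> display name
    edges: list of (source id, target id) pairs
    Returns (merged_nodes, merged_edges, mapping of old id -> surviving id).
    """
    groups = defaultdict(list)
    for node_id, name in nodes.items():
        groups[normalize(name)].append(node_id)

    survivor = {}
    for ids in groups.values():
        keep = min(ids)  # deterministic choice of the surviving node
        for node_id in ids:
            survivor[node_id] = keep

    merged_nodes = {nid: nodes[nid] for nid in set(survivor.values())}
    merged_edges = set()
    for source, target in edges:
        a, b = survivor[source], survivor[target]
        if a != b:  # drop self-loops created by the merge
            merged_edges.add((a, b))
    return merged_nodes, sorted(merged_edges), survivor


# Hypothetical records, not taken from the Offshore Leaks data.
nodes = {1: "Jane Doe", 2: "JANE  DOE", 3: "John Doe", 4: "Acme Holdings Ltd."}
edges = [(1, 4), (2, 4), (3, 4)]
merged_nodes, merged_edges, survivor = merge_identical_names(nodes, edges)
```

Nodes that share a surname but differ in first name, such as the two "Doe" records here, stay separate, which mirrors our decision to leave uncertain matches for manual review.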
Our example application shows a number for each of the nodes, which indicates the number of connections between each person and other entries in the database. We start to load these individuals’ connections to build out the network.
We decide not to load connections of intermediaries for now, because they typically have many connections and can clutter our diagram. It also seems doubtful that intermediaries and their connections would lead to any factual connections between two companies simply because both were created by the same intermediary. So we continue focusing on connections between people, companies and addresses.
After expanding the network several times, we see that two of the Figueredo results are indeed connected, and one is not. So we remove the unconnected group from our drawing and merge the other two. However, after expanding the connections as far as we can, we still do not see any connection between Figueredo and either Hugo or Mariano Jinkis.
Undeterred, we decide to expand the intermediary PGA Consultores, which is in the Jinkis network. Loading its 34 entities still shows no connection to Figueredo, so we decide to expand those entities one by one.
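The expansion step, including the choice to leave intermediaries out, amounts to a filtered breadth-first traversal. The sketch below is illustrative only: the adjacency dictionary, the `node_types` lookup, and all node names are hypothetical stand-ins, not the application's data model.

```python
from collections import deque


def expand(adjacency, node_types, seed, skip_types=("intermediary",)):
    """Return every node reachable from seed without passing through skipped types."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in seen or node_types.get(neighbor) in skip_types:
                continue
            seen.add(neighbor)
            queue.append(neighbor)
    return seen


# Hypothetical toy graph: three candidate records for the same person.
adjacency = {
    "candidate_1": ["company_a"],
    "candidate_2": ["address_x"],
    "candidate_3": ["company_c"],
    "company_a": ["candidate_1", "address_x", "agent_z"],
    "address_x": ["candidate_2", "company_a"],
    "company_c": ["candidate_3", "agent_z"],
    "agent_z": ["company_a", "company_c"],
}
node_types = {"agent_z": "intermediary"}

reach_1 = expand(adjacency, node_types, "candidate_1")
linked = "candidate_2" in reach_1      # shares an address with company_a
unlinked = "candidate_3" in reach_1    # only reachable through the intermediary
```

Calling `expand` with `skip_types=()` lets the traversal pass through intermediaries as well, which corresponds to revisiting that decision later in the investigation.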
Bingo! The entity LEONIDAS PROPERTIES S.A. shows a clear connection between the Figueredo and Jinkis networks. Knowing we are on the right track now, and given that there are only 33 more entities, each with only a few connections, we continue to expand and grow the network.
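Once two networks touch, the chain of nodes that links them can be recovered with a breadth-first search. The following is a minimal sketch on a hypothetical graph; the node names are placeholders rather than database records, and the article does not say which search the application runs.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the node list of one shortest path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for neighbor in adjacency.get(current, ()):
            if neighbor not in parents:
                parents[neighbor] = current
                queue.append(neighbor)
    return None


# Hypothetical graph: one entity bridges two otherwise separate networks.
adjacency = {
    "person_a": ["company_1"],
    "company_1": ["person_a", "bridge_entity"],
    "bridge_entity": ["company_1", "company_2"],
    "company_2": ["bridge_entity", "person_b"],
    "person_b": ["company_2"],
    "person_c": [],
}
path = shortest_path(adjacency, "person_a", "person_b")
no_path = shortest_path(adjacency, "person_a", "person_c")
```

The interior nodes of the returned path are the ones worth inspecting first, since removing any of them would separate the two networks again in this toy graph.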
As we do so, our powerful graph layout engine automatically lays out the connections in a readable format, and we easily spot data integrity issues along the way. We see the name of the person connecting the two networks, El Portador, is misspelled a number of times. We merge the nodes as we find these duplicates, and continue expanding. Each time, we look at the names and entities of the connections that are revealed.
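In our walkthrough we spot misspelled duplicates by eye, but candidates for merging could also be suggested automatically. The sketch below uses the `SequenceMatcher` class from Python's standard `difflib` module; the 0.85 threshold and the misspelled variants are assumptions made up for illustration, not values or spellings taken from the database.

```python
from difflib import SequenceMatcher


def similar_names(names, threshold=0.85):
    """Return pairs of distinct names whose similarity ratio meets the threshold."""
    # Normalize case and whitespace first so exact duplicates collapse.
    cleaned = sorted({" ".join(n.lower().split()) for n in names})
    pairs = []
    for i, left in enumerate(cleaned):
        for right in cleaned[i + 1:]:
            score = SequenceMatcher(None, left, right).ratio()
            if score >= threshold:
                pairs.append((left, right, round(score, 2)))
    return pairs


# Invented spelling variants for illustration only.
names = ["El Portador", "El Potador", "EL  PORTADOR", "Acme Holdings"]
pairs = similar_names(names)
```

A suggestion list like this still needs human review before merging, because two different people can have names that differ by a single letter.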
After a few minutes, a name catches our eye: Damiani. We know that Juan Pedro Damiani is a member of FIFA’s Independent Ethics Committee. There is a J. P. Damiani and Associates intermediary and a Juan Pedro Damiani Sobrero, both from Uruguay, where the committee member lives. Now we are really onto something!
Using our people network and running betweenness centrality, we see that the person El Portador is central in this network, sitting between Figueredo, Hugo and Mariano Jinkis, and Damiani. This seems like a good place to continue our investigation and dive a little deeper to understand the connection.
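Betweenness centrality scores a node by how many shortest paths between other nodes pass through it. The article does not say how Tom Sawyer Perspectives computes it, so the sketch below uses Brandes' algorithm for unweighted, undirected graphs as a stand-in, on a hypothetical graph with placeholder names. It assumes every node appears as a key in the adjacency dictionary and that each edge is listed in both directions.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Phase 1: BFS from source, counting shortest paths to each node.
        stack = []
        predecessors = {node: [] for node in adjacency}
        path_counts = {node: 0 for node in adjacency}
        path_counts[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            stack.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[current] + 1:
                    path_counts[neighbor] += path_counts[current]
                    predecessors[neighbor].append(current)
        # Phase 2: walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        while stack:
            node = stack.pop()
            for pred in predecessors[node]:
                share = path_counts[pred] / path_counts[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical people network with one obvious go-between.
adjacency = {
    "hub": ["person_a", "person_b", "person_c"],
    "person_a": ["hub", "company_1"],
    "person_b": ["hub"],
    "person_c": ["hub"],
    "company_1": ["person_a"],
}
scores = betweenness_centrality(adjacency)
most_central = max(scores, key=scores.get)
```

In this toy graph the `hub` node scores highest because every path between the three people runs through it, which is the same signal that draws our attention to El Portador.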
Read more about the alleged connection between Damiani and those individuals indicted in the FIFA scandal in this ICIJ article.
The Power of Tom Sawyer Perspectives
As we’ve illustrated in this article, the power of Tom Sawyer Perspectives lies in revealing the hidden connections in a visually understandable way, whether it’s among members of an organization, elements in a network, systems in an aircraft or automobile, or vendors in a supply chain.
Tom Sawyer Software specializes in helping clients with needs in link analysis; network topology; schematics and models; and dependencies, flows, and processes. We help them federate and integrate their data from multiple sources, and build the graph and data visualization applications that are critical to analyzing and gaining insight into their data.
Visit www.tomsawyer.com to access our Panama Papers and other demonstrations built using Tom Sawyer Perspectives, and to learn how we can help solve your visualization and analysis challenges.
About the Author
Caroline Scharf & Uli Foessmeier, Tom Sawyer Software
Caroline Scharf is Operations Director at Tom Sawyer Software and is responsible for ensuring efficient and seamless operation across departments. She works to develop and enhance internal applications to maximize operational efficiency. She has more than 20 years of experience in various roles in product development at enterprise software companies.
Uli Foessmeier is a Computer Scientist from Munich, Germany with a focus on graph layout and visualization. He holds a PhD from the University of Tuebingen and has 20 years of experience managing software development teams. He is a consultant at Tom Sawyer Software.