Neo4j Live: ICIJ Datashare – Turning Documents into Knowledge

18 Jun, 2024



Have you ever wondered how journalists extracted valuable information from the millions of documents in the Panama Papers or Pandora Papers?

In this session, Clément, ML Engineer at the International Consortium of Investigative Journalists (ICIJ), will explain how Neo4j and Datashare (ICIJ’s search engine) helped solve this challenge.

You will learn how Datashare’s new Neo4j plugin allows journalists to turn investigation documents into knowledge graphs that connect the dots between people and corporate entities. You will then discover how to use Neo4j to surface valuable insights from such graphs: revealing complex links between entities, identifying central actors, analyzing exchanges, identifying similar entities, and more.

Guest: Clément Doumouro, ICIJ

ICIJ https://www.icij.org/
Investigations https://www.icij.org/investigations/

0:00 Intro
2:36 ICIJ and Panama Papers
5:30 Technical Approach
9:00 Datashare Demo
23:36 Extracting Named Entities
34:11 Exploring the Graph
41:08 Entity Resolution with OpenRefine
47:53 Reconciliation API
1:01:30 Q&A

Datashare’s new plug-in helps investigative journalists connect the dots with graphs - ICIJ
https://www.icij.org/inside-icij/2024/02/datashares-new-plug-in-helps-investigative-journalists-connect-the-dots-with-graphs/

Tutorials:
ICIJ's Datashare Neo4j Plug-in Tutorial - Part 1 https://www.youtube.com/watch?v=Gpg6gi5se98
ICIJ's Datashare Neo4j Plug-in Tutorial - Part 2 https://www.youtube.com/watch?v=GOQSGpjBMS0

Datashare and plugin documentation
Datashare Neo4j extension demos https://github.com/ICIJ/datashare-extension-neo4j-demos
Datashare's Neo4j plugin documentation https://icij.gitbook.io/datashare/usage/explore-the-neo4j-graph
GitHub - ICIJ/datashare-extension-neo4j https://github.com/ICIJ/datashare-extension-neo4j
GitHub - ICIJ/datashare: A self-hosted search engine for documents. https://github.com/ICIJ/datashare
Datashare Demo https://datashare-demo.icij.org/#/

Neo4j at ICIJ
How ICIJ deals with massive data leaks like the Panama Papers and Paradise Papers - ICIJ https://www.icij.org/inside-icij/2018/07/how-icij-deals-with-massive-data-leaks-like-the-panama-papers-and-paradise-papers/
Wrangling 2.6TB of data: The people and the technology behind the Panama Papers - ICIJ https://www.icij.org/investigations/panama-papers/data-tech-team-icij/

#neo4j #graphdatabase #icij #investigativejournalism #journalism #panamapapers
