This Week in Neo4j – Data Lineage, Google Cloud, Thomson Reuters’ OpenPermID


Welcome to This Week in Neo4j, where we round up what’s been happening in the world of graph databases over the last 7 days.

This week we have a graph of Thomson Reuters’ OpenPermID dataset, running Neo4j on Google Cloud, migrating from MySQL to Neo4j, as well as a data lineage talk from GraphConnect NYC 2017.


This week’s featured community member is Suellen Stringer-Hye, Linked Data and Semantic Web Coordinator at Vanderbilt University.

Suellen Stringer-Hye - This Week’s Featured Community Member


Suellen has been part of the Neo4j community for several years and presented her work using graphs to analyse digital humanities data at GraphConnect San Francisco 2015. She also presented Using Neo4j to Explore Nascent Research Networks with Clifford Anderson and Ed Warga at the 2015 VIVO Conference.

Suellen leads much-loved graph database workshops at Vanderbilt and has helped attendees build GraphGists with their favourite data.

Suellen was also interviewed on the Leading Lines podcast alongside my colleague Michael Hunger in 2016.

On behalf of the Neo4j and humanities communities, thanks for all your work, Suellen!

Pick of the week: Thomson Reuters’ OpenPermID Graph


Thomson Reuters’ OpenPermID


In the first part of the post, Jesus shows how to import the dataset using the neosemantics extension. The dataset contains 127 million triples, which are turned into a graph of 18.8 million nodes and 101 million relationships.

Jesus then goes on to show how to query the graph to do complex path analysis and how to build nice charts on the output of those queries using standard BI tools. He finishes the post by showing how to build an RDF API on top of the graph.

You can get all the code for Jesus’ blog post from the openpermid2neo4j GitHub repository.
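If you’re wondering why 127 million triples end up as far fewer nodes, here is a rough sketch of the general idea behind turning RDF into a property graph: type statements become labels, literal values become node properties, and only triples that point at another resource become relationships. This is my own simplified illustration in plain Python, not the neosemantics code, and the `Literal` class, the `triples_to_graph` function and the toy triples are all made up for the example.

```python
from collections import defaultdict


class Literal(str):
    """Marks an object as a plain value rather than a resource (hypothetical helper)."""


def triples_to_graph(triples):
    """Fold (subject, predicate, object) triples into nodes and relationships."""
    nodes = defaultdict(lambda: {"labels": set(), "props": {}})
    rels = []
    for subject, predicate, obj in triples:
        node = nodes[subject]
        if predicate == "rdf:type":
            # Type statements become labels on the subject node.
            node["labels"].add(obj)
        elif isinstance(obj, Literal):
            # Literal values become properties on the subject node.
            node["props"][predicate] = str(obj)
        else:
            # Anything else links two resources: make sure the target
            # node exists, then record a relationship.
            _ = nodes[obj]
            rels.append((subject, predicate, obj))
    return dict(nodes), rels


# Toy data invented for this example; it is not taken from OpenPermID.
TRIPLES = [
    ("org:1", "rdf:type", "Organization"),
    ("org:1", "name", Literal("Acme Corp")),
    ("org:2", "rdf:type", "Organization"),
    ("org:2", "name", Literal("Acme Holdings")),
    ("org:1", "hasParent", "org:2"),
]

nodes, rels = triples_to_graph(TRIPLES)
print(len(TRIPLES), "triples ->", len(nodes), "nodes,", len(rels), "relationships")
```

Five triples collapse into two nodes and one relationship here, because four of them only describe a node rather than connect two of them.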

iPhone Database Browser, MySQL to Neo4j,


From GraphConnect: Real-Time Data Lineage at UBS


At GraphConnect NYC 2017, Wren Chan and Sidharth Goyal explained how Neo4j allows them to trace the lineage of all metrics for all initiatives across the bank.



In the talk, they explain how they built a system that syncs data between Oracle and Neo4j and generates lineage using Cypher queries. They use a fun dataset of the UK royal family to explain how it all works.
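To give a feel for what lineage generation means, here is a tiny sketch of the underlying idea: start from one item and walk upstream until there is nothing left to follow. In Neo4j this kind of walk would typically be a variable-length path match in Cypher. The version below is plain Python over a hand-written dictionary, so treat it as an illustration only and not as the system described in the talk. The `PARENTS` data and the `lineage` function are my own, loosely borrowing the royal family theme.

```python
from collections import deque

# Illustrative "comes from" edges: each person points at their parents.
PARENTS = {
    "William": ["Charles", "Diana"],
    "Harry": ["Charles", "Diana"],
    "Charles": ["Elizabeth II", "Philip"],
}


def lineage(start, parents):
    """Return every upstream ancestor of `start`, in breadth-first order."""
    seen = []
    queue = deque(parents.get(start, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen


print(lineage("William", PARENTS))
```

Swap people for metrics and parents for source tables, and the same traversal tells you where a number came from.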

Kubernetes on Google Cloud, Django, Excel


Next Week


What’s happening next week in the world of graph databases?

Date | Title | Group | Speaker

February 8th 2018 | Data Science in Practice: Importing and Visualizing Facebook Data Using Graphs | Neo4j Online Meetup | Ray Barnard

Tweet of the Week


My favourite tweet this week was by Nicholas P Moran:

Don’t forget to RT if you liked it too.

That’s all for this week. Have a great weekend!

Cheers, Mark