Twin4j

This Week in Neo4j – Time Based Graph Versioning, Pearson Coefficient, Neo4j Multi DC, Modeling Provenance

Developer Relations Engineer

February 16, 2019


Welcome to This Week in Neo4j, where we round up the last week in the world of graph databases. This week our Chief Scientist Dr Jim Webber describes how to run Neo4j in a Multi Data Center Environment, Max De Marzi shows us how to find the shortest path on a rail network, and Stefan Bieliauskas shows us why graphs are a perfect fit for modeling data provenance.

Featured Community Member: David Stevens

This week’s featured community member is David Stevens, Global Technology Transformation Lead, DXC.

David Stevens – This Week’s Featured Community Member

David is the author of the DXC DigitalExplorer, an Enterprise knowledge graph built using Neo4j. The platform provides the means to understand, shape, and enable Digital Transformation projects, and David won a Graphie for his work at GraphConnect NYC 2018.

David presented The Enterprise Knowledge Graph Explorer as part of this week’s Neo4j Online Meetup, and was also recently featured in the 5 Minute Interviews series.

He’ll be presenting at GraphTour Madrid next week, so if you’re going, don’t forget to say hi!

Learn more about David

Graph versioning Episode one — Time Based

How to version graphs is a commonly asked question, and Tom Geudens has started a series of posts explaining the different approaches.

Installment 1 focuses on time-based versioning of graphs. Using an e-commerce example, Tom shows how to separate identity from state, where the names of shops and the products that they sell can vary over time.
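To make the identity/state split concrete, here is a minimal Python sketch. It is not taken from Tom's post: the `Identity` and `State` classes, the validity-interval fields, and the shop names are all assumptions made up for illustration. The idea is that the identity keeps a stable id while each change appends a new state with its own validity period.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class State:
    """One version of an entity's properties, valid over [valid_from, valid_to)."""
    properties: dict
    valid_from: date
    valid_to: Optional[date] = None  # None means "still current"


@dataclass
class Identity:
    """The stable part of an entity: its id never changes, its states do."""
    entity_id: str
    states: list = field(default_factory=list)

    def set_state(self, properties: dict, as_of: date) -> None:
        # Close the currently open state, then append the new one.
        if self.states and self.states[-1].valid_to is None:
            self.states[-1].valid_to = as_of
        self.states.append(State(properties, as_of))

    def state_at(self, when: date) -> Optional[dict]:
        # Return the properties that were valid on the given day, if any.
        for state in self.states:
            ends = state.valid_to
            if state.valid_from <= when and (ends is None or when < ends):
                return state.properties
        return None


# Hypothetical shop that is renamed at the start of 2019.
shop = Identity("shop-1")
shop.set_state({"name": "Corner Store"}, date(2018, 1, 1))
shop.set_state({"name": "Corner Market"}, date(2019, 1, 1))
```

Calling `shop.state_at(date(2018, 6, 1))` looks up the name the shop had in mid-2018 without losing the current one. In a graph, the same shape would typically be an identity node linked to state nodes by relationships carrying the validity dates.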

Read the blog post

Community detection of survey responses based on Pearson correlation coefficient with Neo4j

A couple of weeks ago we added the Pearson Similarity algorithm to the Graph Algorithms library, and Tomaz Bratanic wrote a blog post showing how we could use it to make sense of Kaggle’s Young People Survey dataset.

This dataset contains, amongst other things, music preferences, phobias, and health habits, and Tomaz initially shows how to use the algorithm to work out correlations between the answers in these different categories.

He then builds a similarity graph of users based on their answers, and uses the Louvain algorithm to find communities of users, before creating a Gephi visualisation of those communities.
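As a rough illustration of the first two steps, the sketch below computes the Pearson correlation coefficient in plain Python and keeps only the user pairs above a threshold as similarity edges. It does not use the Graph Algorithms library or the Kaggle data: the function names, the threshold, and the sample answers are assumptions for this example, and the Louvain step is left out.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant answer vectors
    return cov / sqrt(var_x * var_y)


def similarity_edges(answers, threshold):
    """Return (user_a, user_b, score) for every pair scoring at least threshold."""
    edges = []
    for a, b in combinations(sorted(answers), 2):
        score = pearson(answers[a], answers[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


# Made-up survey answers on a 1-5 scale, one list per respondent.
answers = {
    "alice": [5, 4, 1, 2],
    "bob": [4, 5, 2, 1],
    "carol": [1, 2, 5, 4],
}
edges = similarity_edges(answers, threshold=0.5)
```

With these sample answers only alice and bob end up connected, since carol's answers run in the opposite direction. A community detection algorithm such as Louvain would then be run over the resulting edges.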

Read the blog post

Provenance with Neo4j, Playbook for graph database projects

Stefan Bieliauskas wrote a blog post showing how to model provenance data in Neo4j.
Mihai Raulea shared his playbook for graph database projects.
In a post on calculating the best rail road paths, Max De Marzi shows how to write a procedure to find the quickest routes between train stations.
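For readers who want a feel for the quickest-route problem, here is a small Python sketch using Dijkstra's algorithm over a dictionary of travel times. This is a generic textbook approach and not the procedure from Max's post; the station names and minutes are invented for the example.

```python
import heapq


def quickest_route(network, start, goal):
    """Dijkstra's algorithm over a dict of {station: {neighbour: minutes}}."""
    queue = [(0, start, [start])]
    best = {start: 0}
    while queue:
        minutes, station, path = heapq.heappop(queue)
        if station == goal:
            return minutes, path
        if minutes > best.get(station, float("inf")):
            continue  # stale queue entry, a quicker way here was already found
        for neighbour, leg in network.get(station, {}).items():
            total = minutes + leg
            if total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour, path + [neighbour]))
    return None  # goal is not reachable from start


# Hypothetical rail network with travel times in minutes.
network = {
    "A": {"B": 10, "C": 25},
    "B": {"A": 10, "C": 10, "D": 30},
    "C": {"A": 25, "B": 10, "D": 5},
    "D": {"B": 30, "C": 5},
}
route = quickest_route(network, "A", "D")
```

Here the quickest way from A to D goes through B and C, even though the direct A to C leg exists, because the two shorter hops add up to less time.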

Running Neo4j in Multi Data Center Environments

Dr Jim Webber was back in the video studio, this time recording a video explaining how to run Neo4j in Multi Data Center Environments.

Jim describes how to configure Neo4j servers with metadata to optimise the way that data is both queried and moved between them. You can learn more about the concepts Jim covers in the Multi DC documentation.
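The general idea of tagging servers with metadata can be sketched in a few lines of Python. This is a conceptual illustration only, not Neo4j's configuration or API: the `Server` class, the group names, and the `pick_upstream` function are hypothetical. Each server carries a set of group tags, and a server looking for a peer to pull data from prefers one that shares a tag with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    groups: frozenset  # metadata tags, e.g. the data center a server lives in


def pick_upstream(me, candidates):
    """Prefer a server sharing a group with `me`; otherwise use any other server."""
    others = [s for s in candidates if s.name != me.name]
    local = [s for s in others if s.groups & me.groups]
    pool = local or others
    # Pick deterministically by name so the example is easy to follow.
    return min(pool, key=lambda s: s.name) if pool else None


# Hypothetical cluster spread over two data centers.
servers = [
    Server("london-1", frozenset({"london"})),
    Server("london-2", frozenset({"london"})),
    Server("nyc-1", frozenset({"nyc"})),
]
upstream = pick_upstream(servers[1], servers)
```

In this toy cluster `london-2` picks `london-1` and avoids a cross-data-center hop, while `nyc-1` has no local peer and falls back to a remote one.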

Create a Data Marvel — Part 8: Controlling and Servicing our Comic Endpoints

Jennifer Reif’s Marvel Series is back, and this week Jennifer shows us how to build the controller and service classes for handling requests and shaping results.

Everything is now in place to feed the data into d3 to better visualize the Marvel Universe in the next installment!

Read the blog post

Tweet of the Week

My favourite tweet this week was by Thibault Chevrin:

A constellation of chocolate products, brands & stores where we can find them, special #VDay ! #LoveNeo4j Thanks to @OpenFoodFacts for the data! pic.twitter.com/GCUpPewXx0

— Thibault Chevrin (@Draekenn) February 10, 2019