Welcome to this week in Neo4j where we round up what’s been happening in the world of graph databases in the last 7 days.
This week we have a first look at Neo4j Morpheus (a tool for weaving together graph and relational data in Apache Spark), a demo of Neo4j on Google Cloud, profiling procedures with JMH, algorithms with Dr Jim Webber, exploring the World Cup with Neo4j Bloom, and more!
Featured Community Member: Estelle Joubert
This week’s featured community member is Estelle Joubert, Associate Professor at Dalhousie University. Estelle is a musicologist, which means that she studies music and culture, and has a particular focus on opera and political theory.
Estelle Joubert – This Week’s Featured Community Member
Estelle is the principal investigator for a large-scale team-run project entitled Opera and the Musical Canon, which uses Neo4j. The database is used to visualize relationships between people (composers, singers, publishers) and operatic objects (scores, reviews, images), offering a nuanced ‘picture’ of how operatic fame was generated prior to 1800.
Estelle was recently interviewed on the Graphistania podcast, where she explains the project in more detail and describes some of the queries the project aims to answer, which led the team to switch over from a relational database.
You can listen to the interview below or read the transcript.
On behalf of the Neo4j community, thanks for all your work and good luck with your project, Estelle!
Weave Together Graph and Relational Data in Apache Spark
In the talk they explain this upcoming tool, which will make it possible to weave together graph and relational data in Apache Spark. They go on to explain use cases where this approach will work well, such as getting a 360-degree view of a customer, before doing some live demos of the technology.
Neo4j Morpheus is still in early access mode, so send us an email at firstname.lastname@example.org if you’d like to participate and we’ll put you in touch with the right people.
English → Cypher, Neo4j on Apache Zeppelin, Profiling procedures
- In David Mack’s latest post, he shows how to translate English to Cypher using the magic of sequence-to-sequence translation. The trained model is able to perform reasoning tasks such as recalling properties of objects, counting, finding shortest paths, and determining if two objects have a relationship.
- The video from Gerrit Meier’s talk at Spring I/O 2018 – Time to graph up with Spring Data Neo4j – is now available.
- Late last week version 0.8 of Apache Zeppelin was released. This version of the popular notebook software has Neo4j support.
- If you’ve ever found yourself wanting to benchmark your stored procedures or functions, you’re in luck! Stefan Armbruster has created a project which shows how to use JMH to do this. You can also optionally create flame graphs of the output.
Google Cloud – Neo4j Causal Cluster Launch Demo
David Allen has created a video in which he demos how to launch a new Neo4j Causal Cluster on Google Cloud Platform.
This uses the Launcher capability of Google Cloud along with the Deployment Manager to deploy a cluster of three Neo4j instances for highly scalable graph queries.
You can also read David’s guide for launching a Neo4j Causal Cluster on Google Cloud Launcher.
Algorithms with Dr Jim, How the ICIJ deal with data leaks, New release of Neode
- Dr Jim Webber has started a series of videos where he explains the data structures and algorithms used by Neo4j. He starts with a look at linked lists, trees, and hash maps.
- I came across an interesting article written by the ICIJ in which they explain how they’re able to deal with massive data leaks like the Panama Papers and Paradise Papers. They explain how they’ve been able to overcome both collaboration and technical challenges, the latter by using Apache Tika (to extract metadata and text), Apache Solr (to build search engines), Tesseract (to turn images into text), Neo4j (to make sense of the data), and Linkurious (to visualize the data).
- Christopher Eyre has written a couple of blog posts in which he explains a utility he built that allows data stored in Contentful (a cloud hosted, headless, Content Management System) to be imported into Neo4j. You can find the code for the tool in the chriseyre2000/contentful-to-neo4j GitHub repository.
- Adam Cowley has written a blog post introducing Neode – an OGM he built by putting together a set of generic services on top of the official drivers to take care of the mundane CRUD operations.
- This week Adam released version 0.1.14 of Neode. This version has TypeScript support as well as temporal and spatial support. A massive thanks to Noumaan Shah for his contributions to this release.
Analysing the World Cup with Neo4j Bloom
I show how to describe some simple graph patterns, visually detect England’s hat trick scorers, and conclude by demonstrating how you can write your own custom Cypher queries and expose them as search phrases.
How deletes work in Neo4j
This article explains what happens when you delete nodes, relationships, and properties, and why, contrary to expectations, you will see the amount of space taken on the filesystem increase when doing bulk delete operations.
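To make that counterintuitive result a bit more concrete, here is a minimal sketch of a toy record store. It is not Neo4j's actual storage format or API: the `ToyStore` class, its methods, and the `RECORD_SIZE` value are all hypothetical. It assumes that a delete only marks a fixed-size record as no longer in use (so the store file keeps its size and the slot can be reused later) and that every operation, including a delete, is appended to a transaction log.

```python
from dataclasses import dataclass, field

# Arbitrary illustrative record size in bytes; not a claim about Neo4j's format.
RECORD_SIZE = 16


@dataclass
class ToyStore:
    """Hypothetical fixed-size record store with logical deletes."""

    in_use: list = field(default_factory=list)    # one in-use flag per record slot
    free_ids: list = field(default_factory=list)  # slots that can be reused
    log: list = field(default_factory=list)       # append-only transaction log

    def create(self) -> int:
        """Create a record, reusing a freed slot when one is available."""
        if self.free_ids:
            record_id = self.free_ids.pop()
            self.in_use[record_id] = True
        else:
            record_id = len(self.in_use)
            self.in_use.append(True)
        self.log.append(("create", record_id))
        return record_id

    def delete(self, record_id: int) -> None:
        """Logically delete a record: flip its flag and log the operation."""
        if not self.in_use[record_id]:
            raise KeyError(f"record {record_id} is not in use")
        self.in_use[record_id] = False
        self.free_ids.append(record_id)
        self.log.append(("delete", record_id))

    def store_bytes(self) -> int:
        """Size of the store file: slots are never removed, only flagged."""
        return len(self.in_use) * RECORD_SIZE

    def log_entries(self) -> int:
        """Number of entries written to the transaction log so far."""
        return len(self.log)


if __name__ == "__main__":
    store = ToyStore()
    ids = [store.create() for _ in range(1000)]
    before = (store.store_bytes(), store.log_entries())
    for record_id in ids:
        store.delete(record_id)
    after = (store.store_bytes(), store.log_entries())
    print("store bytes, log entries before bulk delete:", before)
    print("store bytes, log entries after bulk delete: ", after)
```

In this toy model a bulk delete leaves the store file the same size while the log doubles, so total space on disk goes up. New records created afterwards reuse the freed slots instead of growing the store. The article itself is the place to look for how Neo4j really behaves.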
On the podcast: Matt Casters
This week on the podcast Rik interviewed Matt Casters, Chief Solutions Architect at Neo4j.
They talk about his experience building the Pentaho Kettle integration tool, his more recent work extending it to load data from streaming data sources (like Kafka) into Neo4j, as well as his new role in the solutions team at Neo4j.
You can listen to the interview below or read the transcript.
APOC YouTube Series: Load JSON, Load JDBC, Bulk loading data
This week Michael released 4 more videos in the Neo4j APOC YouTube series:
You can find a list of all the videos so far in the Neo4j APOC Utility Library HowTo Series playlist.
What’s happening next week in the world of graph databases?
July 11th 2018
Tweet of the Week
My favourite tweet this week was by Rune Sørensen:
Don’t forget to RT if you liked it too.
That’s all for this week. Have a great weekend!
About the Author
Mark Needham, Developer Relations Engineer
Mark Needham is a graph advocate and developer relations engineer at Neo4j.
As a developer relations engineer, Mark helps users embrace graph data and Neo4j, building sophisticated solutions to challenging data problems. Mark previously worked in engineering on the clustering team, helping to build the Causal Clustering feature released in Neo4j 3.1. Mark writes about his experiences of being a graphista on a popular blog at markhneedham.com. He tweets at @markhneedham.