
Why Is Graph ETL Different?


With a property graph of the type that Neo4j implements, ETL resembles ETL for an RDBMS in that it loads the nodes and fills in their properties. But with graphs, the ETL must also create edges (the connections between one node and another). When a new node arrives, the ETL must be able to recognize the other nodes it must be connected to. You also have to add property information to the edge itself.

This means you need an algorithm that searches through the existing nodes and works out which of them each new edge should connect. The point is that this process is more complex than putting data in a table, as you would with an RDBMS.
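To make the node-versus-edge distinction concrete, here is a minimal sketch in plain Python, not tied to any vendor's tooling. The table names (customers, orders), the PLACED relationship, and the load_graph helper are all hypothetical, chosen only for illustration. It loads rows as nodes first, then resolves each foreign key against an index of existing nodes to create an edge that carries its own properties.

```python
from collections import namedtuple

# Hypothetical in-memory shapes for a property graph; not a real driver API.
Node = namedtuple("Node", ["label", "key", "props"])
Edge = namedtuple("Edge", ["rel_type", "src", "dst", "props"])


def load_graph(customers, orders):
    """Map two relational tables to nodes and edges.

    Assumes each row is a dict and that orders.customer_id is a
    foreign key to customers.id (an illustrative schema).
    """
    nodes = {}       # (label, key) -> Node, doubles as the lookup index
    edges = []
    unresolved = []  # order ids whose customer could not be found

    # Step 1: the RDBMS-like part, one node per row.
    for row in customers:
        nodes[("Customer", row["id"])] = Node(
            "Customer", row["id"], {"name": row["name"]})
    for row in orders:
        nodes[("Order", row["id"])] = Node(
            "Order", row["id"], {"total": row["total"]})

    # Step 2: the graph-specific part, find the node each edge connects to.
    for row in orders:
        src = ("Customer", row["customer_id"])
        dst = ("Order", row["id"])
        if src not in nodes:
            unresolved.append(row["id"])
            continue
        # The edge carries its own property, separate from either node.
        edges.append(Edge("PLACED", src, dst, {"placed_on": row["placed_on"]}))

    return nodes, edges, unresolved


customers = [{"id": 1, "name": "Ada"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0, "placed_on": "2018-11-01"},
    {"id": 11, "customer_id": 99, "total": 5.0, "placed_on": "2018-11-02"},
]
nodes, edges, unresolved = load_graph(customers, orders)
```

The second order points at a customer that was never loaded, so it ends up in the unresolved list instead of producing a dangling edge. Handling cases like that is part of what makes graph ETL more involved than a table load.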

Here are some of the ways graph vendors are supporting graph ETL. The approaches differ, and each is a good fit for a wide variety of use cases.

How Neo4j’s Morpheus Project Supports More Automated Graph ETL and Graph Combinations

  • Neo4j’s Morpheus project is not aimed solely at making graph ETL better, but it has some powerful capabilities that make it easier to create both nodes and edges from a sophisticated mapping of SQL tables to a graph.
  • The Morpheus project brings sophisticated property graph support for in-memory operations using Spark. Morpheus extends the Cypher query language so queries can read from named graphs and graph views, and create new named graphs.
  • With Morpheus, you can do the work I just described: mapping not only an RDBMS table to node properties, but also mapping tables to edges and filling in the properties of the edges as well. This goes a long way toward solving some of the problems of initially loading a graph from a connected set of RDBMS tables.
  • Morpheus lets you further wrangle your graph data in memory in Spark, splitting and combining graphs, creating new temporary graph views and adding data into graphs. It lets you store the whole of a Spark graph into your regular Neo4j database, and will also allow a graph to be merged on top of an existing Neo4j database graph. This allows you to periodically snap RDBMS updates, map them into a “delta graph,” and merge them into a Neo4j transactional store. To complete that cycle, Morpheus can also take a graph view (defined by a Cypher query) over a subset of a Neo4j database.
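The "delta graph" cycle described above can be sketched in a few lines of plain Python. This is not the Morpheus API and does not use Spark or Cypher. The dictionary layout and the merge_delta helper are assumptions made for illustration, and the merge is a simple upsert: new nodes and edges are added, and properties on existing ones are overwritten by the delta.

```python
def merge_delta(base_nodes, base_edges, delta_nodes, delta_edges):
    """Merge a delta graph on top of a base graph (illustrative upsert).

    Assumed layout, not a real library format:
      nodes: {node_key: {property: value}}
      edges: {(src_key, rel_type, dst_key): {property: value}}
    Returns new dicts; the inputs are left unmodified.
    """
    merged_nodes = {key: dict(props) for key, props in base_nodes.items()}
    for key, props in delta_nodes.items():
        # Create the node if missing, then overlay the delta's properties.
        merged_nodes.setdefault(key, {}).update(props)

    merged_edges = {key: dict(props) for key, props in base_edges.items()}
    for key, props in delta_edges.items():
        src, _rel_type, dst = key
        # Refuse to create an edge whose endpoints are not in the graph.
        if src not in merged_nodes or dst not in merged_nodes:
            raise KeyError(f"edge {key!r} references a missing node")
        merged_edges.setdefault(key, {}).update(props)

    return merged_nodes, merged_edges


# A base graph standing in for the transactional store.
base_nodes = {"c1": {"name": "Ada"}, "o10": {"total": 25.0}}
base_edges = {("c1", "PLACED", "o10"): {"placed_on": "2018-11-01"}}

# A delta built from a later snapshot of the RDBMS tables.
delta_nodes = {"o10": {"total": 30.0}, "o11": {"total": 5.0}}
delta_edges = {("c1", "PLACED", "o11"): {"placed_on": "2018-11-02"}}

nodes, edges = merge_delta(base_nodes, base_edges, delta_nodes, delta_edges)
```

A production merge would also need to decide how to handle deletions and conflicting updates, which this sketch deliberately leaves out.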


Copyright 2018 Forbes Media LLC. All Rights Reserved.

     
